Over-the-Air (OTA) firmware updates are table stakes for Internet of Things (IoT) devices. Once a device is in the field, OTA means you can fix bugs and add features by delivering new firmware remotely.

The ability to update devices remotely is great, but that alone is the bare minimum of functionality. Options for targeting specific devices for each firmware update are crucial to building a fleet that scales.

For example, at Golioth we use a small subset of our deployed fleet to test releases before rolling them out to all devices. In some cases we have different hardware variants in the same fleet, each needing a different binary. And when adding new features we conditionally roll out release candidates to certain team members for testing.

All of these features are built into Golioth and ready for you to use. Recently one of our customers opened a forum thread asking how firmware version, device blueprints, and device tags work together to determine the OTA update each device receives. It’s a great question that we’ll dive into today!

What are OTA Firmware Updates?

Over-the-Air (OTA) firmware updates are a method of using a network connection to send a device a new firmware version that it then validates and runs.

At Golioth, we’ve used multiple types of network connections to accomplish this, including cellular, WiFi, Ethernet, and Thread. Our Golioth Firmware SDK demonstrates the feature, using MCUboot to store two copies of firmware (the currently running version, and the newly received update). Each image is cryptographically signed so it can be verified for authenticity and integrity before the device uses the new image.

Golioth applies firmware updates on the device side using the semantic version number. The Golioth servers make these update versions available based on their sequence number. Let’s unpack the differences.

Demystifying Semantic Versions and Sequence Numbers

Remember this rule of thumb: devices will always match the semantic version of a firmware release, while the Golioth servers will always advertise the most recent sequence number.

Devices match semantic version

Devices have no awareness of “newer” or “older” firmware releases. They merely check whether the version of firmware currently running is an exact match for the semantic version being advertised by the server. If they do not match, the device will download the binary and update itself to the version available from the server. This might be a newer semantic version or an older one; the only thing that matters is that the device sees a different version is available.
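The device-side rule amounts to a simple equality check. Here is an illustrative sketch in Python (not the Firmware SDK’s actual C implementation):

def needs_update(running_version: str, advertised_version: str) -> bool:
    # No notion of "newer" or "older": any mismatch prompts a download
    # of whatever version the server is currently advertising.
    return running_version != advertised_version

# An upgrade and a "roll back" look identical to the device:
assert needs_update("1.0.100", "1.0.99")
assert needs_update("1.0.99", "1.0.100")
assert not needs_update("1.0.100", "1.0.100")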

Golioth OTA device serial output

Device serial output indicates version 1.0.100 is currently running. The manifest received from Golioth shows main-1.0.99 as the latest version. The device begins downloading. We call this a roll back, but the device is only aware that the versions don’t match; the server is the source of truth.

This is crucial in delivering the ability to “roll back” a firmware update. The Golioth web console has a rollout button that facilitates an automatic roll back when you unselect the most recently uploaded firmware.

The server advertises the most recent sequence number

The Golioth server separates OTA into two distinct parts: the artifact and the release. The artifact is the binary itself which has a package name (the default name is main) and a semantic version number. The release is created using an existing artifact, adding a time-based sequence number (new releases have higher sequence numbers) and controlling whether or not the release is rolled out to devices.

Golioth OTA Artifacts

Each of the artifacts is a unique binary with its own package name and semantic version. This fleet uses all the same hardware, so no Blueprint has been assigned to these artifacts.

The Golioth servers will check to ensure artifacts have unique name & version combos — you can only upload main-1.2.3 once. The next artifact will need a different version number or package name. (The only way around this is to delete the existing artifact so you may reuse the package/version number.)

Assigning a blueprint to an artifact makes the name and version combination unique to that blueprint. The rules from the previous paragraph still apply, but artifacts with a blueprint will only be compared to other artifacts with the same blueprint.

Finally, the file hash uploaded with each artifact must differ from that of every other artifact. The Golioth web console will check the file, and issue an error if the same binary is uploaded more than once. This is a safety feature to help ensure that the wrong artifact isn’t uploaded by mistake.

Golioth OTA Releases

Each release requires one artifact to be assigned. Notice that the main-1.0.1 artifact has been used multiple times, targeting devices with different tags assigned. The releases are made available to devices by enabling the Rollout toggle.

Golioth releases require an artifact, but the semantic version number will not be used to decide which release is advertised to devices. Instead, the release with the newest sequence number (meaning the most recently created release) will be advertised to devices.

Many conditions determine which release is advertised

It is important to understand that there are several ways to target devices with a release. The most obvious is the Rollout setting—a release will only be advertised if the rollout is enabled. Package names, device blueprints, and device tags are also used to determine which releases will be advertised to any given device in your fleet.

Applying Blueprints to Firmware Updates

If all devices in your fleet use the exact same hardware, they may all be able to run the same firmware. But in many cases, a fleet will have more than one hardware variant and need more than one compiled version of the same firmware release. For instance, if you have some devices that use an nRF9160 (cellular) and others that use an NXP i.MX RT1024 (Ethernet) you must run different firmware compiled specifically for those two distinct devices.

Golioth uses device blueprints to account for this issue. When you create devices on the Golioth cloud, you can choose one device blueprint to assign to the device. The same blueprint may be selected when uploading an artifact or creating a release.

When a device has a blueprint assigned, it will receive notification of releases with the newest sequence number and the matching blueprint.

Using Tags to Target OTA

Device tags can be assigned in a similar way to device blueprints, but you may assign multiple tags (blueprints are limited to a single assignment per device/artifact/release).

Device tags must match exactly to receive a release notification. This means if you have a device with two tags and a release with only one tag, the device will not be notified. That said, if you roll out a release with zero tags selected, all devices in your fleet will be notified of the release no matter what tags are assigned to those devices.

Golioth Device Summary

This summary page for a single device in our fleet shows that this is a member of the “release-candidate” tag group. The device reports which version of firmware it is currently running, as shown in this view.

Multiple releases may be created using the same artifact. As an example, at Golioth we will roll out a release that targets the release candidate tag, so that devices we have previously identified for testing new features will receive the update but the larger fleet will not. When testing is complete and the firmware artifact behaves as expected, a second release using the same artifact is made without any tags, prompting the rest of the fleet to download and apply the OTA update.
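Putting the pieces together, the selection rules described above can be modeled in a few lines of Python. This is only an illustrative sketch of the behavior as described in this post, not Golioth’s actual server code, and names like Release and Device are hypothetical:

from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class Release:
    sequence_number: int
    rollout: bool = False
    blueprint: Optional[str] = None
    tags: Set[str] = field(default_factory=set)

@dataclass
class Device:
    blueprint: Optional[str] = None
    tags: Set[str] = field(default_factory=set)

def advertised_release(device: Device, releases: List[Release]) -> Optional[Release]:
    def matches(release: Release) -> bool:
        if not release.rollout:
            # Releases are only advertised once rolled out.
            return False
        if release.blueprint != device.blueprint:
            # Blueprints must match (assumption: a release without a
            # blueprint matches a device without one).
            return False
        # A release with zero tags goes to the whole fleet; otherwise
        # the tag sets must match exactly.
        return not release.tags or release.tags == device.tags

    candidates = [r for r in releases if matches(r)]
    # Among matching releases, the newest sequence number wins.
    return max(candidates, key=lambda r: r.sequence_number, default=None)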

How Does Golioth OTA Fit Your Needs?

Wow, this is a lot. Thanks for sticking with me through this post. I hope you will agree that there is quite a bit more to OTA than just providing an update file.

We’d love to hear your feedback. We’re especially interested to know your thoughts on how our system approaches tags. The last thing a fleet manager wants is a “surprise” update to a device they weren’t expecting. This is why we’ve gone to great lengths to implement granular control, and a multitude of options. But we’re always keen on using customer feedback to improve, so let us know on the Golioth Forum or hit us up at the DevRel email address.

Until next time, happy OTA updating!

This is an excerpt from our bi-monthly newsletter, which covers our recent news and happenings around the IoT ecosystem. You can sign up for future newsletters here.

When building out the Golioth platform, we’re constantly examining the user experience. We ask ourselves why it would make sense to use Golioth over alternative solutions, whether in house or off the shelf. While we want as many engineers as possible using the Golioth platform, we prioritize the real goal of making it easier to build secure, efficient, and useful products. We want our users to use the best tool for the job. Let’s dig a little deeper into how we do that.

When comparing Golioth to an in-house solution, we ask our customers a simple question: Is building your own IoT platform helping you differentiate your product? While some of you may answer “yes,” the majority of folks find that focusing on the IoT infrastructure is a distraction. For them, using Golioth helps drive down costs while increasing organizational efficiency. Those are tangible metrics for us to target; they are manifested in our pricing, our seamless device on-boarding, and straightforward user management.

When we compare Golioth to off-the-shelf solutions, our outlook is somewhat unique. Rather than trying to be the last software product you will ever need, we look to be the best platform for managing devices that connect to the internet. To do that, we build differentiated device management services, such as OTA updates, instant device settings, and real-time logging—to name a few—and we heavily optimize network throughput and efficiency. For simpler IoT products, we also provide application services such as LightDB State and LightDB Stream so that you can move beyond device management to basic data storage.

At Golioth, we care about moving the industry forward without forcing our users to compromise. The IoT product landscape is complex and heterogeneous. It would be naive of us to think that Golioth would be suitable for every aspect of any one product. That’s why we curate a large ecosystem of partnerships to enable you beyond the realm of device management. Our latest partnership announcement will enable devices to talk to Memfault over a single, secure connection. We aim for Golioth to integrate with best-of-class cloud platforms and enable their usage without complicating the firmware running on-device.

Crucially, the best tool for the job may look different today than it does tomorrow, in 6 months, and in 2 years. The promise of Golioth is that as long as your devices are sending data using our firmware SDK, you will have the flexibility to change where that data is going, how it is being processed, and how it ultimately is presented to the end-user, all without changing a line of firmware code. Golioth Output Streams currently enables this, but over the next few months, we will announce an even more robust set of features in this area, starting with our Memfault integration, which you can sign up to be notified about here.

At Golioth, we pride ourselves on doing both the deep technical work—as well as the routine maintenance—required to ensure we back up our claim of being the most efficient way that devices in the field can securely talk to services in the cloud.

In service of this mission, last week we turned on DTLS 1.2 Connection ID support in Golioth Cloud. While most users will not see any visible differences in how their devices connect to Golioth, behind the scenes many devices are now enabled to perform more efficiently. This is especially true for those connecting via cellular networks.

Devices using Connection ID can see a meaningful impact on bandwidth costs, power usage, and battery life. This not only benefits users of Golioth, but also means that more devices are able to operate for longer, reducing waste and energy use. Because we believe deeply in this potential widespread impact, we have been and will continue to do all work in this area in the open.

What are Connection IDs?

We have written previously about how we use CoAP over DTLS / UDP to establish secure connections between Golioth and devices. Because UDP is a connection-less protocol, attributes of each datagram that arrives at a local endpoint must be used to identify the remote endpoint on the other side.

In short, DTLS uses the IP address and port of the remote endpoint to identify connections. This is a reasonable mechanism, but some devices change IP address and port frequently, resulting in the need to constantly perform handshakes. In some scenarios, a full handshake may be required just to send a single packet. Performing these handshakes can have a negative impact on battery life, drive up bandwidth costs, and increase communication latency. For some devices, this cost makes the end product at best more expensive, and at worst, infeasible.

Connection IDs provide an alternative solution, allowing for clients and servers to negotiate an identifier that can be used in lieu of the IP address and port. Once negotiated, the endpoint(s) that are sending a Connection ID — typically just the client but sometimes both the client and the server — are still able to have their DTLS connection state associated with incoming records when their IP address and port change. The result is that a device could travel from one side of the world to the other, continuing to communicate with the same server, while only performing the single initial handshake.

How was Connection ID support implemented?

Efficient communication between devices and the cloud requires compatible software on both sides of the connection.

Cloud Changes

On the cloud side, we are beneficiaries of the work done by folks in the Pion community, which maintains a set of libraries for real-time web-based communication. These libraries are conventionally used to enable video streaming applications on the internet, such as Twitch. However, the protocols they implement are useful in many constrained environments where the network is lossy or unreliable.

The Golioth team contributed Connection ID support to the pion/dtls library. This consisted of both the implementation of Connection ID extension handling and modifications to UDP datagram routing. The former involved careful updates to the parsing of DTLS records; Connection IDs change record headers from being fixed length to variable length. As part of this work, we are also adding a very small new API in the Go crypto library.

Previously, pion/dtls utilized an underlying net.Listener, which was conventionally supplied by the pion/transport library. This UDP net.Listener handed net.Conn connections to the pion/dtls listener, which would in turn establish a DTLS connection by performing a handshake with the client on the other side of the connection.

However, the net.Conn interface does not allow for consumers to change the remote IP address and port to which packets are sent. When not using Connection IDs, this is not an issue because the IP address and port are what identifies the connection. However, when Connection IDs are in use, the ID itself is used to identify the connection, and the remote IP address and port may change over time. Thus, a new interface, net.PacketListener, was added to pion/dtls, which enables changing the remote address of a connection, along with an implementation of the interface that routes based on Connection IDs when present.
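Conceptually, the routing change looks something like the sketch below. The real logic lives in pion/dtls and is written in Go; this Python pseudocode only illustrates the rule, and parse_connection_id is a hypothetical stand-in for DTLS record parsing:

from typing import Dict, Optional, Tuple

class Session:
    """Stands in for established DTLS connection state."""
    def __init__(self, remote_addr: Tuple[str, int]):
        self.remote_addr = remote_addr  # where replies are sent

sessions_by_addr: Dict[Tuple[str, int], Session] = {}
sessions_by_cid: Dict[bytes, Session] = {}

def parse_connection_id(record: bytes) -> Optional[bytes]:
    # Hypothetical helper: a record on a connection that negotiated a
    # Connection ID carries that ID in its (now variable-length) header.
    return None

def route(record: bytes, remote_addr: Tuple[str, int]) -> Optional[Session]:
    cid = parse_connection_id(record)
    if cid is not None and cid in sessions_by_cid:
        session = sessions_by_cid[cid]
        # The client may have moved to a new IP address or port; update
        # the reply address and keep the established DTLS state.
        session.remote_addr = remote_addr
        return session
    # Without a Connection ID, the remote address identifies the session.
    return sessions_by_addr.get(remote_addr)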

Device Changes

On the device side, most users leverage the Golioth Firmware SDK to communicate with Golioth services, such as OTA, Settings, Stream, and more. The SDK is meant to work with any hardware, which is why we integrate with platforms such as Zephyr, Espressif ESP-IDF, Infineon Modus Toolbox, and Linux. Many of these platforms utilize DTLS support offered by mbedTLS, which added support for the IETF draft of Connection IDs in 2019, then included official support in the 3.3.0 release in 2022. The SDK uses libcoap, which implements CoAP support on top of a variety of DTLS implementations, including mbedTLS. libcoap started consuming mbedTLS’s Connection ID API in July of this year. We have been assisting in ensuring that new versions of libcoap are able to build on the platforms with which we integrate.

However, these platforms frequently maintain their own forks of dependencies in order to integrate with the rest of their ecosystem. We have been both contributing and supporting others’ contributions wherever possible in order to expedite the use of Connection IDs in the Golioth Firmware SDK. With Connection ID support already in place in ESP-IDF and Nordic’s sdk-nrf, and coming soon in the next Zephyr release, we hope to turn them on by default for all platforms in upcoming Golioth Firmware SDK releases.

How do I use Connection IDs?

Using the Golioth Firmware SDK is the only requirement to utilize Connection IDs. As support is added across all embedded platforms, update your device-side code to the latest version and Connection IDs will automatically be enabled for your connection. The video at the top of this post shows how you can inspect your own device traffic to see the functionality in action.

What comes next?

In total, we have made contributions across many open source projects as part of our effort to make Connection IDs widely available, and we look forward to continuing to lend a hand wherever possible. While this post has provided a brief overview of DTLS, Connection IDs, and the ecosystem of libraries we integrate with, ongoing use of the functionality will allow us to provide tangible measures of its impact. We’ll make sure to make announcements as support continues to be added across platforms and consumed in the Golioth Firmware SDK.

Until then, go create a Golioth account and start connecting your devices to the cloud!

 

As embedded developers, we’re consistently seeking ways to make our processes more efficient and our teams more collaborative. The magic ingredient? DevOps. Its origins stem from the need to break down silos between development (Dev) and operations (Ops) teams. DevOps fosters greater collaboration and introduces innovative processes and tools to deliver high-quality software and products more efficiently.

In this talk, we will explore how we can bring the benefits of DevOps into the world of IoT. We will focus on using GitHub Actions for continuous integration and delivery (CI/CD) while also touching on how physical device operations like shipping & logistics can be streamlined using a DevOps approach.

Understanding the importance of DevOps in IoT is crucial to unlocking efficiencies and streamlining processes across any organization that manages connected devices. This talk, originally given at the 2023 Embedded Online Conference (EOC), serves as one of the many specialized talks freely accessible on the EOC site.

GitHub Actions for IoT

To illustrate how to put these concepts into practice, we’re going to look at a demo using an ESP32 with a feather board and Grove sensors for air quality monitoring. It’s important to note that while we utilize GitHub Actions in this instance, other CI tools like Jenkins or CircleCI can also be effectively used in similar contexts based on your team’s needs and preferences.

For this example, we use GitHub Actions to automate the build and deployment process.

The two main components of our GitHub Actions workflow are ‘build’ and ‘deploy’ jobs. The ‘build’ job uses the pre-built GitHub Action for ESP-IDF to compile our code, and is triggered when a new tag is pushed or when a pull request is made. The ‘deploy’ job installs the Golioth CLI, authenticates with Golioth, uploads our firmware artifact, and releases that artifact to a set of devices over-the-air (OTA).

Imagine an organization that manages a fleet of remote air quality monitors across multiple cities. This GitHub Actions workflow triggers the build and deployment process automatically when the development team integrates new features or bug fixes into the main branch and tags the version. The updated firmware is then released and deployed to all connected air quality monitors, regardless of their location, with no additional logistics or manual intervention required. This continuous integration and deployment allows the organization to respond rapidly to changes and ensures that the monitors always operate with the latest updates.

Let’s delve into the GitHub Actions workflow and walk through each stage:

  1. Trigger: The workflow is activated when a new tag is pushed or a pull request is created.
    on:
      push:
        # Publish semver tags as releases.
        tags: [ 'v*.*.*' ]
      pull_request:
        branches: [ main ]
  2. Build: The workflow checks out the repository, builds the firmware using the ESP-IDF GitHub Action, and stores the built firmware artifact.
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
        - name: Checkout repo
          uses: actions/checkout@v3
          with:
            submodules: 'recursive'
        - name: esp-idf build
          uses: espressif/esp-idf-ci-action@v1
          with:
            esp_idf_version: v4.4.4
            target: esp32
            path: './'
          env:
            WIFI_SSID: ${{ secrets.WIFI_SSID }}
            WIFI_PASS: ${{ secrets.WIFI_PASS }}
            PSK_ID: ${{ secrets.PSK_ID }}
            PSK: ${{ secrets.PSK }}
        - name: store built artifact
          uses: actions/upload-artifact@v3
          with:
            name: firmware.bin
            path: build/esp-air-quality-monitor.bin
  3. Deploy: The workflow installs the Golioth CLI, authenticates with Golioth, downloads the built firmware artifact, and uploads it to Golioth for OTA updates.
    jobs:
      deploy:
        needs: build
        runs-on: ubuntu-latest
        if: startsWith(github.ref, 'refs/tags/v')
        steps:
        - name: download built artifact
          uses: actions/download-artifact@v3
          with:
            name: firmware.bin
        # The goliothctl installation method and command flags below are
        # illustrative; consult the Golioth CLI docs for current usage.
        - name: install goliothctl
          run: ./scripts/install-goliothctl.sh # hypothetical helper script
        - name: authenticate with Golioth
          run: |
            goliothctl login --apiKey ${{ secrets.GOLIOTH_API_KEY }}
            goliothctl config set projectId ${{ secrets.GOLIOTH_PROJECT_ID }}
        - name: upload artifact and release OTA
          run: |
            goliothctl dfu artifact create firmware.bin --version ${GITHUB_REF_NAME#v}
            goliothctl dfu release create --components main@${GITHUB_REF_NAME#v} --rollout true

For those eager to dive in and start implementing DevOps into their own IoT development process, we’ve provided an example GitHub Actions workflow file on GitHub. Feel free to fork this repository and use it as a starting point for streamlining your own IoT firmware development process. Remember, the best way to learn is by doing. So, get your hands dirty, experiment, iterate, and innovate. If you ever need help or want to share your experiences, please reach out in our community forum.

Today, Golioth is open-sourcing an early version of xk6-coap, a Grafana k6 extension that enables authoring load testing scenarios in JavaScript that interact with Constrained Application Protocol (CoAP) endpoints. k6 is a load testing framework from Grafana Labs focused on extensibility, scalability, and developer experience.

The Importance of Load Testing

Load testing is the practice of simulating interaction with a system in order to observe how the system responds. For many applications, this manifests as orchestrating a large number of network connections to the system, each sending a pre-defined amount of data, or continuously sending data over some period of time. While simple load tests can be carried out by writing a minimal program that executes the specified behavior, it is frequently desirable to run the tests in a distributed fashion, either due to scalability requirements of the tests themselves, or to simulate geographically distributed interaction. As the number and complexity of tests increases, introducing a load testing framework can enable an organization to rapidly evolve the sophistication of their testing efforts.

Why Use k6?

When you operate a platform that needs to be able to accommodate millions of IoT devices, testing your system to ensure that it is able to respond appropriately to many different scenarios is critical. When we were thinking about how to reduce the friction of running these tests, k6 was a natural fit due to its use of JavaScript as a scripting language, its built-in support for modifying the magnitude and duration of tests, and its ability to scale from a developer’s local machine to a distributed system like Kubernetes.

Fortunately, k6 also has a rich ecosystem of extensions, along with straightforward machinery to build your own.

Why a CoAP Extension?

Many of the devices connecting to Golioth are communicating over constrained networks. As described in a previous post, CoAP is a flexible protocol that allows these devices to operate efficiently in a wide range of environments. However, while CoAP is well supported by embedded RTOSes, server-side support is somewhat more limited. As we strive to build scalable, resilient systems, we hope to also help grow the ecosystem of tooling by contributing back to the open source community.

Getting Involved

We have decided to open source xk6-coap while it is still in active development in order to allow for the community to influence the direction of the project, and contribute if so inclined. To start running your own CoAP load tests, check out the getting started steps in the README.md. If interested in contributing, take a look at the open issues and the CONTRIBUTING.md.

In the last post in our series on the Constrained Application Protocol (CoAP), we explored some of the trade-offs between reliable and unreliable data transmission. We also covered why CoAP’s flexibility makes it a good choice for Golioth and constrained applications in general. In this post, we’ll dig deeper into the lifecycle of a CoAP message. What has to happen for data to get from a device in the field to a service in the cloud? What about back again? As we traverse that path, we will describe each protocol along the way.

Setting Up

Throughout this post, we will use the Linux port of the Golioth Firmware SDK at v0.7.0 to communicate from a local machine to Golioth. Specifically, we’ll be examining traffic from the Golioth Basics sample application, which currently uses pre-shared keys (PSKs) for authentication. This is a simpler model than those used by most constrained devices when communicating to Golioth, but it is useful for detailing the general sequence of operations.

We are also going to cheat a bit in this post by describing only communication from OSI Layer 3 and up, but I promise I’ll make it up to you in a painfully lengthy post about framing and modulation in the future.

We’ll use Wireshark to observe network traffic. If you want to view the decrypted CoAP payloads captured by Wireshark, you’ll need to go to Edit > Preferences > Protocols > DTLS then add your PSK in hex form. This can be obtained with the following command.

$ echo -n "YOUR_PSK_HERE" | xxd -ps -c 32

If you would like to follow along with the exact packet capture that we use in this post, you can find it, along with instructions for how to import into Wireshark, in the hasheddan/coap-pcap repository.

Life Before CoAP

When interacting with Golioth via the firmware SDK, it appears as though communication begins when the first CoAP message is sent. However, a number of steps are required before two endpoints can communicate over a Layer 7 protocol like CoAP.

Address Resolution

Because humans need to interact with and write programs that interact with the internet, we need to be able to specify the host with a “friendly” name. These names are referred to as domain names on the internet. For example, to read this blog post, you navigated to blog.golioth.io. However, your computer doesn’t know how to find this blog using that name, so it instead needs to translate the name to an address. The common analogy here would be telling my maps app that I want to go to my local coffee shop, Press, to get a crepe. The app needs to translate Press to an address of a physical location before using GPS to navigate there.

The Global Positioning System (GPS) is a protocol that we are not going to talk about at length in this post, but it is every bit as fascinating as the ones we will.

This translation step not only allows us to use more friendly names when we talk about a destination, it also allows that destination to change its physical location without all other services needing to change how they find it. On the internet, the protocol that enables this translation is the Domain Name System (DNS).

Description of the Domain Name System (DNS) protocol with PDU structure.

The Golioth firmware SDK defines the URI of the Golioth server with the CONFIG_GOLIOTH_COAP_HOST_URI config value.

#ifndef CONFIG_GOLIOTH_COAP_HOST_URI
#define CONFIG_GOLIOTH_COAP_HOST_URI "coaps://coap.golioth.io"
#endif

Source

This value is parsed into the coap.golioth.io domain name when the client is created and a session is established.

// Split URI for host
coap_uri_t host_uri = {};
int uri_status = coap_split_uri(
        (const uint8_t*)CONFIG_GOLIOTH_COAP_HOST_URI,
        strlen(CONFIG_GOLIOTH_COAP_HOST_URI),
        &host_uri);
if (uri_status < 0) {
    GLTH_LOGE(TAG, "CoAP host URI invalid: %s", CONFIG_GOLIOTH_COAP_HOST_URI);
    return GOLIOTH_ERR_INVALID_FORMAT;
}

// Get destination address of host
coap_address_t dst_addr = {};
GOLIOTH_STATUS_RETURN_IF_ERROR(get_coap_dst_address(&host_uri, &dst_addr));

Source

The lookup is ultimately performed by a call to getaddrinfo.

struct addrinfo hints = {
        .ai_socktype = SOCK_DGRAM,
        .ai_family = AF_UNSPEC,
};
struct addrinfo* ainfo = NULL;
const char* hostname = (const char*)host_uri->host.s;
int error = getaddrinfo(hostname, NULL, &hints, &ainfo);

Source

While DNS, like many Layer 7 protocols, can be used over a variety of underlying transports, it typically uses UDP.

Description of the User Datagram Protocol (UDP) with a diagram.

Any DNS messages we send are encapsulated in the payload of a UDP datagram, which is supplemented with

  • Source port
  • Destination port
  • Length
  • Checksum

This information is in the header. The ports inform what service at the destination the data should be routed to. DNS uses port 53, so the destination port in our UDP datagram header should be 53. However, we still haven’t specified which resolver we want to send the query to. In this case, we need to know the physical address because we can’t ask the resolver to resolve addresses for us if we can’t resolve its address in the first place.

On the internet, these addresses are known as Internet Protocol (IP) addresses.

Card describing the Internet Protocol (IP) with a diagram.

The resolver chain is highly dependent on the configuration of the system in use. My local Ubuntu system uses a local resolver, systemd-resolve, which is listening on port 53.

$ sudo netstat -ulpn | grep "127.0.0.53:53"
udp        0      0 127.0.0.53:53           0.0.0.0:*                           827/systemd-resolve

The first packets we see in Wireshark after running the Golioth basics example correspond to attempted DNS resolution of coap.golioth.io.

Messages in a DNS recursive lookup.

The first two requests are to systemd-resolve, one for the IPv4 record (A) and one for the IPv6 record (AAAA). systemd-resolve subsequently makes the same requests from my local machine (192.168.1.26) to the router on my home network (192.168.1.1). The response from the router is then returned to systemd-resolve, which returns the answer to our program. Breaking apart the first query message, we can see our three layers of encapsulation.

The answer in the second to last packet for the coap.golioth.io IPv4 address contains the expected IP address, as confirmed by a simple dig query.

$ dig +noall +answer coap.golioth.io
coap.golioth.io.	246	IN	A	34.135.90.112
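You can reproduce the lookup the SDK performs using Python’s standard socket module; 5684 is the standard port for CoAP over DTLS:

import socket

# The Python equivalent of the SDK's getaddrinfo() call shown above:
# resolve the Golioth CoAP endpoint to addresses usable with a UDP socket.
for family, socktype, proto, canonname, sockaddr in socket.getaddrinfo(
        "coap.golioth.io", 5684, type=socket.SOCK_DGRAM):
    print(sockaddr)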

Establishing a Secure Channel

At Golioth, we believe that all data transmission should be secure. In fact, we go so far as to not allow for sending data to the Golioth platform unless it is over a secure channel. Your browser hopefully shows a lock symbol to the left of your address bar right now. That indicates that your request for this web page, and the content of the page that was sent back by the server, happened over a secure channel. This secure channel was established using Transport Layer Security (TLS). However, TLS was designed to run on TCP, and thus requires the presence of a reliable transport, which UDP does not provide. In order to enable the same security over UDP, Datagram Transport Layer Security (DTLS) was developed.

Description of the Datagram Transport Level Security (DTLS) protocol with a diagram.

Some of the differences that DTLS introduces to TLS include:

  • The inclusion of an explicit Sequence Number – DTLS must provide a facility for reordering records as UDP does not do so automatically.
  • The addition of retransmission timers – DTLS must be able to retransmit data in the event that it never arrives at its destination.
  • Constraints around fragmentation – While multiple DTLS records may be placed in a single datagram, a single record may not be fragmented across multiple datagrams.
  • Removal of stream-based ciphers – TLS 1.2 used RC4 as its stream-based cipher, which does not allow for random access and thus cannot be utilized over an unreliable transport.
  • Guardrails against denial-of-service attacks – Datagram protocols are highly susceptible to denial of service attacks due to the fact that a connection does not need to be established prior to sending data. DTLS implements a stateless cookie to help guard against this threat.

The DTLS handshake protocol consists of a sequence of records being sent between the client and server. The number and type of records depends on the DTLS implementation and configuration on each side. Returning to our Wireshark capture, we can see the exact sequence used in the Golioth basics example.

Let’s explore each step.

Client Hello

The Client Hello message is how a client initiates a DTLS handshake with a server. The content type of the record is set to Handshake (22) to indicate we are using the handshake protocol, and the embedded handshake structure is included as the fragment. We immediately see a few fields that are present in the DTLS specification that are not in TLS. Namely, the Epoch and Sequence Number in the record layer structure, and the Message Sequence, Fragment Offset, and Fragment Length fields in the handshake structure. These are all introduced to accommodate the fact that we are transmitting over UDP, rather than TCP.

The message additionally includes information about how the client wishes to communicate, such as the DTLS Version, supported Cipher Suites, and supported Extensions.

Hello Verify Request

The Hello Verify Request is a new handshake type introduced in DTLS. It includes a stateless cookie and is added to guard against denial-of-service concerns.

From the DTLS v1.2 RFC:

This mechanism forces the attacker/client to be able to receive the cookie, which makes DoS attacks with spoofed IP addresses difficult. This mechanism does not provide any defense against DoS attacks mounted from valid IP addresses.

Though servers are not required to send a Hello Verify Request, if they do, the client is required to send the Client Hello message again with the cookie included. We can see this behavior in the subsequent message.

Server Hello, Server Hello Done

The next packet is interesting because the UDP datagram contains two DTLS records, which is explicitly allowed by the RFC:

Multiple DTLS records may be placed in a single datagram. They are simply encoded consecutively. The DTLS record framing is sufficient to determine the boundaries. Note, however, that the first byte of the datagram payload must be the beginning of a record. Records may not span datagrams.

The Server Hello is a response to the Client Hello that indicates which of the functionality supported by the client should be used. For example, the server will select a Cipher Suite that is supported by both sides.

The Server Hello Done message indicates that the server is done sending messages. In this case we are using PSKs, so handshake messages for Server Certificate, Server Key Exchange, and Certificate Request are not required. However, in cases where they are, the client knows the server has sent all messages by receiving the Server Hello Done.

Client Key Exchange, Change Cipher Spec, Finished

The next packet includes three DTLS records. In the Client Key Exchange, the client informs the server which PSK will be used by providing a PSK ID.

The Change Cipher Spec message tells the server “we’ve negotiated parameters for communication, and I am going to use them starting with my next message”. That next message is the Finished record, which includes Verify Data encrypted using the negotiated TLS_PSK_WITH_AES_128_GCM_SHA256 scheme.

Change Cipher Spec, Finished

Finally, the server responds with its own Change Cipher Spec and Finished messages for the client to verify.

With a secure channel in place, we are now ready to send CoAP messages!

Sending a CoAP Message

Sending a CoAP message is not so different from sending DTLS handshake protocol messages. However, instead of Content Type: Handshake (22), we’ll be sending Application Data (23). The first message we see in the Wireshark capture is the log we emit while setting up the client in the Golioth basics program.

GLTH_LOGI(TAG, "Waiting for connection to Golioth...");

This capture shows the entire encapsulation chain: a DTLS record in a UDP datagram in an IP packet. Within the encrypted DTLS record payload, which we are able to inspect after supplying our PSK in Wireshark, we can see the content of a CoAP message.

Description of the Constrained Application Protocol (CoAP) with a diagram.

Messages are not to be confused with requests and responses. CoAP maps its request / response model to the Hypertext Transfer Protocol (HTTP), with requests providing a Method Code and responses providing a Response Code. The response to a Confirmable request message may be included in the corresponding Acknowledgement if it is immediately available. This is referred to as piggybacking. The message Token (T) is used to correlate a response to a request, meaning that when piggybacking is employed, both the request and response have the same Message ID and Token.

The log message shown above is specified as Confirmable with Message ID: 43156 and Token: 7d5b825d. The method code specifies that the request is a POST (2), while the options specify the logs URI path and that the payload is JSON data.

Golioth should respond to this message with an Acknowledgement with a corresponding Message ID.

Not only does it do so, but it also employs piggybacking to supply a response Code (2.03 Valid (67)) and a Token matching that of the request.
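If you’d like to experiment with this message flow yourself, the snippet below sends a confirmable POST using aiocoap, a third-party Python CoAP library. It uses plain CoAP against the public coap.me test server rather than DTLS, since Golioth only accepts secure connections; tokens and message IDs are handled by the library.

import asyncio

from aiocoap import Context, Message, POST

async def main():
    context = await Context.create_client_context()
    # Requests default to Confirmable (CON); the library retransmits
    # until a matching Acknowledgement arrives.
    request = Message(code=POST, uri="coap://coap.me/test", payload=b"hello")
    response = await context.request(request).response
    # With piggybacking, the response code rides in the ACK itself.
    print(response.code, response.payload)

asyncio.run(main())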

How Does It Stack Up?

Though we steered clear of the data link and physical layers in this post, there is a whole world of hidden complexity yet to be uncovered beneath IP packets. The tooling and processes utilized in this post will enable both the exploration of those lower level protocols, as well as comparison between CoAP / DTLS / UDP / IP with other L3-L7 stacks. Check back for upcoming posts evaluating the similarities and differences with MQTT, QUIC, and more!

While you wait, create an account on Golioth and get your devices talking to the cloud! And if you feel so inclined, break out Wireshark and see what else you can learn.

TL;DR: Golioth uses CoAP because we care about performance, acknowledge that IoT architectures are heterogeneous, and believe that the definition of performance depends on the specifics of the given architecture.

If you’ve ever used the Golioth platform, or have even just toured through our documentation, you’ve likely encountered some mention of the Constrained Application Protocol (CoAP). While perhaps less familiar than other application layer protocols, such as HTTP, CoAP has established a foothold in the internet of things (IoT) space due to design decisions that allow its usage in highly constrained environments.

CoAP takes a unique approach that makes it well-suited for a world in which data flows from tiny devices, to data centers, and back again. It’s certainly not the only option for a protocol designed for performance. Today we’re going to do an overview of the problem space, a high-level description of CoAP, and a preview of where we are going. In the future, we’ll explore why we have standardized on CoAP at Golioth by demonstrating its impact in real environments.

Abstract diagram of devices navigating obstacles on their way to Golioth.

IoT devices frequently operate under a variety of constraints that drastically increase the complexity of reliably and securely getting data to a final destination.

What Do You Mean by “Constrained”?

Describing a device, network, or environment as “constrained” tells you very little about its actual attributes. Constrained only means that there are some limitations. But limitations may be imposed on any number of vectors. Some of the most common we encounter with users of Golioth’s platform include:

  • Power
  • Compute
  • Memory
  • Network

Devices fall into two broad categories with respect to power: those with continuous access and those without. The latter group consists of a wide range of devices:

  • Some have access to power on an intermittent, but relatively frequent basis.
  • Some need to operate for years on a single battery.

Conserving power is crucial for devices across the spectrum, and at the extreme end, it may be the foremost priority.

Compute typically refers to the raw performance of one or more CPUs on a device, as measured by their clock speed. These speeds may be orders of magnitude slower than the CPUs in devices we use on a daily basis, such as the smartphone in your pocket. Compute power limitations restrict the types of computation a device can perform. As a result, device operations need to be highly optimized.

Memory impacts both the amount of data that can be processed on a single device, as well as the size of the firmware image that runs on the device. The former may be constrained by the amount of volatile memory (e.g. SRAM, DRAM) on the device, while the latter is primarily constrained by the non-volatile memory (e.g. flash).

🔥🔥🔥 A brief rant on ROM: Read-Only Memory (ROM) has become synonymous with non-volatile memory, which, unfortunately, has very little to do with whether the memory is write-able or not. Even when we are being more specific, using terms like Electronically Erasable Programmable Read-Only Memory (EEPROM), the very name is a contradiction. How is it read-only if we can also write to it? What we really mean to say is that we can read and write to it but it doesn’t go away when we lose power, and it is typically quite slow. In this series, we’ll do our best to use terminology that is specific about the exact type of memory we are interacting with. 🔥🔥🔥

Network relates to both the physical capabilities of the device, and the environment in which it is deployed. A device may lack the necessary hardware to communicate on a given type of network due to size or cost constraints. Regardless of the innate capabilities of a device, it might have trouble accessing networks, or the network may be noisy or unreliable.

Compute, Power, Network, and Memory with double-sided arrows pointing between them.

Compute, Power, Memory, and Network capabilities are all related. Accommodating for constraints in one area may have negative impacts on others.

Some constraints can change, but changing one will frequently impact others. For example, increasing the clock speed of a CPU will likely cause it to consume more power. There are always non-technical constraints at play as well, such as the cost of the device. Users of Golioth’s platform have to balance the trade-offs and make compromises, but they shouldn’t need to worry about how Golioth fits into their system with the decisions they make today, or the ones they make tomorrow.

Improving Performance in Constrained Environments

IoT applications typically are interested in data: collecting it, processing it, sending it, and ultimately using it to make some decision. The constraints placed on devices impact how an organization may go about performing these operations, and there are a few options available to them.

In an environment where devices are constrained on multiple dimensions, it may make sense to collect less data. While it may be nice to collect a reading from a sensor every minute, it may be acceptable to do so every hour. This is a powerful option because it typically results in positive downstream effects throughout the system. For example, choosing to collect a reading every hour means that a device can conserve more power by going to sleep for longer periods of time. It also means that less data will be sent over the network, and performing retries will be less of a concern due to the decreased number of messages being delivered.

Another option is to change where data is being processed. For devices with continuous access to power and large amounts of memory, but operating on a constrained network, it may be advantageous to clean and aggregate the data before attempting to deliver it to its final destination. Simple sampling of data can go a long way to reduce network data usage in more constrained devices.

The final option is to send less data. While the previous two solutions also have an impact in this domain, you can send less data while collecting the same amount and doing most of the processing at the final destination. One mechanism for doing so is lossless compression. Implementing compression may come at the cost of larger firmware images and more power usage, but could drastically reduce the bandwidth used by a given application.

For those of you saying “compression is a form of processing!” — touché. However, we will use “processing” to mean changing the data in an irreversible manner in this post, whereas “compression” will be used to mean changing the data in a way that all of it can be recovered.
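As a rough illustration of the “send less data” option, here is a quick Python experiment with standard-library lossless compression. The payload shape is fabricated, and real savings depend heavily on the data:

import json
import zlib

# Sixty fabricated sensor readings batched into a single JSON payload.
readings = [{"t": 1700000000 + 60 * i, "temp_c": 21.5} for i in range(60)]
payload = json.dumps(readings).encode()

compressed = zlib.compress(payload, level=9)
print(f"raw: {len(payload)} bytes, compressed: {len(compressed)} bytes")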

However, network connections are not only about payloads. Different protocols incur varying amounts of overhead to establish a connection with certain properties. If reliable transmission is required, more data must be exchanged. The same is true to establish a secure connection.

In the end, developing solutions is not so different from adjusting the constraints of a system. There is a web of complex trade-offs to navigate, each choice creating side-effects in other parts of the system.

Performance is Flexibility

With so many dimensions in play, performance does not mean one thing in the world of IoT.

Or any other world, for that matter. I don’t care what the READMEs of all the “blazing fast” libraries say.

Rather, a solution that focuses on performance is one that fits the constraints of the domain. When a resource is abundant, it should be able to consume more of it in favor of consuming less of a scarce one. When scarcity changes, the solution should be able to adapt.

At Golioth, we not only have to ensure that our platform accommodates the needs of any single organization, but also many different organizations at once. This means integrating with countless devices, working in all types of environments, and being able to provide a consistent user experience across heterogeneous system architectures.

At the core of achieving this goal are protocols. IoT products are nothing without the ability to communicate, but they introduce some of the most difficult scenarios in which to do so. Some protocols are better fits than others given some set of constraints, but few protocols are able to adapt to many different scenarios. The Constrained Application Protocol (CoAP) is unique in its ability to provide optimal trade-offs in many different environments.

The Constrained Application Protocol

CoAP was developed to accommodate many of the scenarios already explored in this post. While built from the ground up for a specific use-case, most folks who have used the internet before will find its design familiar. This is primarily due to the broad adoption of Representational State Transfer (REST) in the design of Hypertext Transfer Protocol (HTTP) APIs. CoAP aims to provide a similar interface in environments where usage of a full HTTP networking stack would not be feasible.

At the core of the CoAP architecture is a simple concept: allowing for reliability to be implemented at the application layer. This decision opens up a world of possibilities for application developers, allowing them to choose the guarantees that are appropriate for their use-case. The first level in which this philosophy is reflected is the usage of the User Datagram Protocol (UDP) as the default transport layer.

UDP and the Transmission Control Protocol (TCP) are the primary Layer 4 protocols in the OSI model, on which many higher layer protocols are built. The foundational difference between UDP and TCP is that the latter guarantees reliable, in-order delivery of packets, while the former does not. While this is a favorable attribute of TCP, it also comes at a cost. In order to provide these guarantees, there is additional overhead in the handshake that sets up the connection between two endpoints, and larger packet headers are required to supply metadata used to implement its reliability features.

TCP and UDP packet structures.

Packet structures for TCP and UDP.

However, saying that UDP is always more performant than TCP would be incorrect. In some cases, such as the frequent transmission of small payloads on an unconstrained network where ordering is critical, the overhead of establishing a reliable connection and the congestion control features that TCP implements may actually result in fewer round-trips and less data being sent over the wire than when using some protocols that layer on top of UDP. That is not to say that similar characteristics cannot be achieved with UDP — it just doesn’t provide them out of the box. This is where CoAP shines: it optionally provides some of the features of TCP, while empowering the application developer to choose when to employ them.

TCP and UDP packet structures with UDP data segment populated with CoAP message structure.

Packet structures for TCP and UDP, with the CoAP message displayed in the Data portion of the UDP packet. Notice the similarity to fields in the TCP packet.

This is accomplished through the use of Confirmable (CON) and Non-confirmable (NON) message types, and the usage of Message IDs. It is common for applications to use both reliable and unreliable transmission when sending messages over CoAP. For example, it may be critical that every message containing a reading from a certain sensor is delivered from a device to a data center. The same application could be configured to tolerate dropping messages containing information about the remaining battery power on the device.
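With aiocoap again as a stand-in (the paths and payloads here are hypothetical), choosing reliability per message is just a matter of the message type:

from aiocoap import Message, POST
from aiocoap.numbers.types import Type

# Critical sensor reading: Confirmable, so the stack retransmits until
# the server acknowledges receipt.
sensor_msg = Message(code=POST, mtype=Type.CON,
                     uri="coap://example.com/sensor", payload=b"23.7")

# Battery telemetry: Non-confirmable, fire-and-forget.
battery_msg = Message(code=POST, mtype=Type.NON,
                      uri="coap://example.com/battery", payload=b"87")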

Even though CoAP is designed to layer on top of UDP by default, it is capable of being layered on top of other transports — even TCP itself. When the underlying transport provides reliability, the CoAP implementation can strip out extraneous metadata, such as the Message ID, and eliminate application level Acknowledgement (ACK) messages altogether. Additionally, because of the adoption of REST concepts, proxies that translate from CoAP to HTTP, and vice versa, can do so without maintaining additional state. These attributes make CoAP a protocol that is capable of effectively bridging between constrained and unconstrained environments.

“All Others Must Bring Data”

If you search for protocol comparisons on the internet, you are bound to find a slew of high-level overviews that fail to bring data to justify many of the claims made. At Golioth, we believe in validating our assumptions with real-world examples that map to actual use-cases. In this post we have only scratched the surface of how CoAP works. We hope to dive into more detail, exploring environments with each of the enumerated constraints and how Golioth’s platform behaves in them.

In the meantime, sign up for Golioth’s free Dev Tier today and try it out with your unique IoT application!

Learning how to use JSONata with Grafana delivers a huge boost to your ability to visualize data.

Golioth makes it easy to gather data from IoT sensors in the field and store it on the cloud. But of course once you have that data you probably want to graph it to show what’s going on. We like to use Grafana for this purpose. It makes beautiful interfaces and can grab Golioth data using our REST API.

Grafana fields to select JSONPath or JSONata

Grafana allows you to choose JSONPath or JSONata for each field that you use

Connecting the two is simple, but things can get hairy when trying to get Grafana to navigate the JSON structures that are returned. While JSONPath is the default setting for parsing Grafana queries, JSONata is also available and delivers some very interesting inline conditional options and string concatenation. Let’s look at what it can do and how to use it.

Just JSONata for in-line logic

I was building a dashboard and wanted to display the current settings for a device. More specifically, I wanted to know what my “loop delay” was set to for a particular device.

I know how to call the REST API and query the settings I configured on my dashboard. However, Golioth settings can be set from the “project”, “blueprint”, or “device” level (see our introduction to the Settings Service for more information). If I filter for just one of these levels and the data isn’t present, it will cause the Grafana panel to throw an error. Our normal parsing tool (JSONPath) doesn’t provide the capabilities to differentiate between the returned data. Here’s what that data looks like:

{
  "list": [
    {
      "id": "6376a13a08be30b7e6d8669f",
      "key": "LOOP_DELAY_S",
      "value": 60,
      "dataType": "integer",
      "projectId": "636d7db608be30b7e6d865d3",
      "createdAt": "2022-11-17T21:01:46.176Z",
      "updatedAt": "2022-11-21T22:06:31.182Z"
    },
    {
      "id": "637bedbf8ad1a55c85e2894a",
      "key": "LIGHT_THRESH",
      "value": 1800,
      "dataType": "integer",
      "projectId": "636d7db608be30b7e6d865d3",
      "createdAt": "2022-11-21T21:29:35.321Z",
      "updatedAt": "2022-11-21T21:29:35.321Z"
    },
    {
      "id": "637bedcc8ad1a55c85e2894b",
      "key": "TEMP_THRESH",
      "value": 23.7,
      "dataType": "float",
      "projectId": "636d7db608be30b7e6d865d3",
      "createdAt": "2022-11-21T21:29:48.610Z",
      "updatedAt": "2022-11-21T21:29:48.610Z"
    },
    {
      "id": "637c27aa8ad1a55c85e28954",
      "key": "TEMP_THRESH",
      "value": 21.5,
      "dataType": "float",
      "projectId": "636d7db608be30b7e6d865d3",
      "deviceId": "636d7df208be30b7e6d865d6",
      "createdAt": "2022-11-22T01:36:42.195Z",
      "updatedAt": "2022-12-17T21:58:20.108Z"
    },
    {
      "id": "63978ef5e51fde6bbe11bfa6",
      "key": "LIGHT_THRESH",
      "value": 900,
      "dataType": "integer",
      "projectId": "636d7db608be30b7e6d865d3",
      "deviceId": "636d7df208be30b7e6d865d6",
      "createdAt": "2022-12-12T20:28:37.696Z",
      "updatedAt": "2022-12-12T20:28:37.696Z"
    }
  ],
  "total": 5
}

You have to look very closely at the data above to realize some entries have a deviceId field and others do not. If an entry has that key, we want to use the value to isolate just that piece of data. We also want to have a label that shows “Device” or “Project”. We can filter for this using JSONata:

list[key='LOOP_DELAY_S'][-1].deviceId ? "Device:" : list[key='LOOP_DELAY_S'].blueprintId ? "Blueprint:" : "Project:"

The code above has a few things going on:

  • First, filter all returned values for one key:
    • list[key='LOOP_DELAY_S']
  • Next, get the last value from that list:
    • [-1]
    • In our case we know the most specific settings value returned by Golioth will be the last value with that key
  • Now let’s do some inline conditional work:
    • The inline conditional syntax is: test ? use_this_if_true : use_this_if_false
    • Our example returns the string Device: if it finds deviceId; otherwise it tests for blueprintId and returns the string Blueprint:, finally falling back to the string Project:

The other thing we need to do is display the value itself. For human-readable values it’s best to include a unit as well. I used JSONata concatenation for this:

$join([$string(list[key='LOOP_DELAY_S'][-1].value), " s"])

This uses the same filtering and negative-indexing tricks from above, plus two new function calls: $string() converts the value to a string, and $join() concatenates the two strings, appending " s" (a space and the unit) to the value.
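
If you want to sanity-check these expressions outside of Grafana, here’s a rough pure-Python equivalent of the same logic (a sketch, not JSONata; it assumes the settings response shown above has been saved to settings.json):

import json

# Load the REST API response shown above
with open("settings.json") as f:
    settings = json.load(f)

# list[key='LOOP_DELAY_S'] then [-1]: filter by key, take the last match
matches = [s for s in settings["list"] if s["key"] == "LOOP_DELAY_S"]
last = matches[-1]

# The inline conditional: which level does this setting come from?
if "deviceId" in last:
    label = "Device:"
elif "blueprintId" in last:
    label = "Blueprint:"
else:
    label = "Project:"

# $join([$string(value), " s"]): concatenate the value and its unit
print(label, "{} s".format(last["value"]))   # e.g. Project: 60 s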

Device settings panel in Grafana

The panel looks great, and it’s all done with just two field settings in Grafana:

JSONata panel settings

Just a peek at JSONata

This is just a small part of what JSONata has to offer. It has not replaced JSONPath as my go-to parsing language for Grafana, but I find JSONata easier for filtering data, and it has inline conditionals and concatenation that I think are absent in JSONPath. Give it a spin the next time you’re stuck and see if it does the trick!

Golioth’s web console has a super easy and intuitive way to send Over-The-Air (OTA) updates to devices in the field. But for ultimate flexibility, you can choose to use the Golioth REST API for managing OTA firmware updates.

Every IoT device should have the ability to update firmware remotely. OTA is a crucial feature for fixing bugs and keeping fleet security up-to-date. But the device itself is only half of the equation; you also need a management tool to get the new firmware to the device. Golioth is that management tool, and every one of our features is available using our REST API. Today we’ll take a look at how firmware update artifacts and releases work, using Python for the demonstration.

Overview of OTA from the Golioth Console

Before jumping into the REST API, let’s take a quick look at the Golioth Console OTA flow to help better understand the abstract concepts. There are two parts to rolling out an OTA update: artifacts and releases.

Golioth OTA artifact

The artifacts screen shown here facilitates the upload of the firmware binary that will be served to the devices in your fleet. During the upload process you assign a package name (usually main) and version number (e.g. 1.0.1), with an optional setting for Blueprint (a way of grouping device types on Golioth).

Golioth OTA release

Creating a release is the second part of creating an OTA update. You can select a Blueprint and Device Tags to narrow down which devices are being targeted. The release is associated with an Artifact (uploaded in the previous step). Optional “release flags” are a way to associate additional information with the release.

Releases include a rollout toggle button. This is a way to stage a release; devices will only be notified when the rollout toggle is turned on. If you ever need to go back to a previous firmware version, this button serves as a one-click rollback!

Creating OTA Artifacts and Releases with the Golioth REST API

Now that we’ve reviewed Artifacts and Releases, let’s use the REST API to create some!

The Open API page from the Golioth Doc site

First off, the Open API section of our Docs is the best place to test out your REST API calls. Just click the Authorize button at the top and give it an API key from your Golioth Project. From there you can try out the commands live on that page.

Python API calls

For this demo I’ll use Python 3.10.6 to make the API calls. This is pretty easy to do using the requests and base64 packages.

import requests
import base64

Uploading an Artifact

def upload_artifact(api_key, proj_id, version, package, filename):
    artifact_upload_url = "https://api.golioth.io/v1/artifacts"
    headers = {'x-api-key': api_key}

    # The binary is base64-encoded and sent as part of the JSON payload
    with open(filename, 'rb') as f:
        data = {
            "projectId": proj_id,
            "content": base64.standard_b64encode(f.read()).decode(),
            "version": version,
            #"blueprintId": "string",
            "package": package
            }

    jsonData = requests.post(artifact_upload_url, json=data, headers=headers).json()

    # A successful upload echoes back the artifact; an error returns a code and message
    if 'data' in jsonData:
        if jsonData['data']['version'] == version:
            print("Artifact {}-{} created successfully for project {}".format(package, version, proj_id))
    elif 'code' in jsonData:
        print("Error code {}: {}".format(jsonData['code'], jsonData['message']))
    return jsonData

The artifact upload requires the API key and project name. The package name is usually main, and semantic versioning is used for the version number. I have not assigned a Blueprint in this example, but the commented line shows where to add one if you choose to do so.

>>> result = upload_artifact("V010sGuWtXXM1htPCHBjLGfrW6GlsKDt", "developer-training", "1.2.3", "main", "new_firmware.bin")
Artifact main-1.2.3 created successfully for project developer-training
>>> import json
>>> print(json.dumps(result, indent=4))
{
    "data": {
        "id": "63ceb040c345ce2e0256ac30",
        "version": "1.2.3",
        "package": "main",
        "createdAt": "2023-01-23T16:05:20.235Z",
        "updatedAt": "2023-01-23T16:05:20.235Z",
        "binaryInfo": {
            "digests": {
                "sha256": {
                    "digest": "0b101d7b0ad1330ec49471d6feb0debc02e022a99e839c7951a446c1539802e6",
                    "size": 32,
                    "type": "sha256"
                }
            },
            "headerSize": 0,
            "imageSize": 1219152,
            "tlvTotalSize": 0,
            "type": "default",
            "version": ""
        },
        "size": "1219152"
    }
}

A JSON packet is returned when the upload is successful. It includes all pertinent information for your newly created artifact, including the id of the artifact, which is needed to create a release.
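
Since the function returns this packet, the new artifact’s ID is one lookup away:

artifact_id = result['data']['id']   # "63ceb040c345ce2e0256ac30" in the example above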

Creating a Release

def create_release(api_key, proj_id, artifact_id):
    release_url = "https://api.golioth.io/v1/projects/{}/releases".format(proj_id)
    headers = {'x-api-key': api_key}
    data = {
        #"deviceTagIds": [ tagId ],
        "artifactIds": [ artifact_id ],
        "rollout": True   # roll out immediately; set False to stage the release
        }

    jsonData = requests.post(release_url, json=data, headers=headers).json()
    if 'code' in jsonData:
        print("Error code {}: {}".format(jsonData['code'], jsonData['message']))
    return jsonData

The release is created using the API key, project name, and artifact ID. In this example I’ve chosen to roll out the release at the same time it is created, so devices will be notified immediately that a new OTA firmware version is available.

>>> result = create_release("V010sGuWtXXM1htPCHBjLGfrW6GlsKDt", "developer-training", "63ceb31ac345ce2e0256ac31")
>>> import json
>>> print(json.dumps(result, indent=4))
{
    "data": {
        "id": "63ceb3b9c345ce2e0256ac32",
        "createdAt": "2023-01-23T16:20:09.563Z",
        "updatedAt": "2023-01-23T16:20:09.563Z",
        "releaseTags": [],
        "deviceTagIds": [],
        "suitManifest": {
            "authentication-wrapper": [
                {
                    "algorithm-id": "sha256",
                    "digest-bytes": "2a3efb45029dc5d23cf0adf5e260ba53c9a78266b9ee28bdbf4ef20b43a2d6c7"
                }
            ],
            "manifest": {
                "common": {
                    "common-sequence": [
                        {
                            "id": "set-component-index",
                            "value": 0
                        },
                        {
                            "arg": {
                                "class-id": "53f7395e-0825-5970-badb-cc7158e49eaa",
                                "image-digest": {
                                    "algorithm-id": "sha256",
                                    "digest-bytes": "0b101d7b0ad1330ec49471d6feb0debc02e022a99e839c7951a446c1539802e6"
                                },
                                "image-size": 1219152,
                                "vendor-id": "36323535-6565-3863-6430-346536653332"
                            },
                            "id": "override-parameters"
                        },
                        {
                            "id": "vendor-identifier",
                            "value": 15
                        },
                        {
                            "id": "class-identifier",
                            "value": 15
                        }
                    ],
                    "components": [
                        [
                            "[email protected]"
                        ]
                    ]
                },
                "install": [
                    {
                        "id": "set-component-index",
                        "value": 0
                    },
                    {
                        "arg": {
                            "uri": "/.u/c/[email protected]"
                        },
                        "id": "set-parameters"
                    },
                    {
                        "id": "fetch",
                        "value": 2
                    },
                    {
                        "id": "image-match",
                        "value": 15
                    }
                ],
                "manifest-sequence-number": 1674490809,
                "manifest-version": 1,
                "run": [
                    {
                        "id": "set-component-index",
                        "value": 0
                    },
                    {
                        "id": "run",
                        "value": 2
                    }
                ],
                "validate": [
                    {
                        "id": "set-component-index",
                        "value": 0
                    },
                    {
                        "id": "image-match",
                        "value": 15
                    }
                ]
            }
        },
        "artifactIds": [
            "63ceb31ac345ce2e0256ac31"
        ],
        "rollout": true,
        "sequenceNumber": "1674490809563381303"
    }
}

Upon success, all details of the release are returned by the REST API.

Get the Artifact ID (or Release ID)

def get_artifact_id(api_key, proj_id, version, package):
    artifact_url = "https://api.golioth.io/v1/projects/{}/artifacts".format(proj_id)
    headers = {'x-api-key': api_key}

    jsonData = requests.get(artifact_url, headers=headers).json()

    # Search the artifact list for a matching version and package name
    for artifact in jsonData['list']:
        if version == artifact['version'] and package == artifact['package']:
            return artifact['id']
    return None

The artifact ID is returned when uploading a new binary. However, if you need to get it after the fact, there is a REST API call for that as well.

>>> result = get_artifact_id("V010sGuWtXXM1htPCHBjLGfrW6GlsKDt", "developer-training", "1.2.3", "main")
>>> print(result)
63ceb31ac345ce2e0256ac31

Artifacts are identified by the version number and package name, but it’s the ID that is needed when creating a release using the REST API. The same approach is used to query the Release ID; just change the URL for the API call:

release_url = "https://api.golioth.io/v1/projects/{}/releases".format(proj_id)
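
Here’s a sketch of that approach. Since a release doesn’t carry a version number directly, this matches on the artifact ID instead (an assumption based on the release JSON shown above; adjust the filter to your needs):

def get_release_id(api_key, proj_id, artifact_id):
    release_url = "https://api.golioth.io/v1/projects/{}/releases".format(proj_id)
    headers = {'x-api-key': api_key}

    jsonData = requests.get(release_url, headers=headers).json()

    # Return the first release that includes the given artifact
    for release in jsonData['list']:
        if artifact_id in release['artifactIds']:
            return release['id']
    return None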

Rollout, rollback

def release_set_rollout(api_key, proj_id, release_id, rollout_state):
    release_url = "https://api.golioth.io/v1/projects/{}/releases".format(proj_id)
    rollout_url = release_url + "/" + release_id
    headers = {'x-api-key': api_key}
    data = { "rollout": rollout_state }

    # PATCH updates just the rollout field, leaving the rest of the release unchanged
    jsonData = requests.patch(rollout_url, json=data, headers=headers).json()

    return jsonData

Finally, you can roll out and roll back releases using the Release ID.

>>> result = release_set_rollout("V010sGuWtXXM1htPCHBjLGfrW6GlsKDt", "developer-training", "63ceb3b9c345ce2e0256ac32", False)
>>> print(result['data']['rollout'])
False

When the rollout change is successful, you will receive the complete release manifest. Here I’ve printed just the state of the rollout.
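
Putting the pieces together, a complete OTA push (with an emergency rollback at the end) might look like the sketch below. It uses the functions defined above; the API key and firmware file are placeholders:

API_KEY = "your-api-key-here"   # placeholder; use a key from your project
PROJECT = "developer-training"

# 1. Upload the new binary as an artifact
upload_artifact(API_KEY, PROJECT, "1.2.4", "main", "new_firmware.bin")

# 2. Look up its ID and create a rolled-out release
artifact_id = get_artifact_id(API_KEY, PROJECT, "1.2.4", "main")
release = create_release(API_KEY, PROJECT, artifact_id)

# 3. If trouble shows up in the field, one call rolls everything back
release_set_rollout(API_KEY, PROJECT, release['data']['id'], False)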

IoT any way you need it

We’ve gone to great lengths to make it easy to build your IoT fleet. For most users this means a straightforward GUI experience on the web console. But every one of those features is available programmatically for those who need it. Today we showed OTA functionality, but you can just as easily create a custom UI for your fleet.

We’d love to hear about how you are using the REST API. Start a thread in the Golioth Forum to share your story!

Once your first device is online and talking back to the Cloud, your problems immediately start to multiply. As you go from one to ten to one hundred devices, you want to understand some basic information about all the devices in your fleet. This week we showcased how to visualize this data by creating a “Fleet View” in Grafana. Let’s look at how you can build one of your own.

Moving from a Device View to a Fleet View

The example in the video uses the devices in a project related to the Golioth IoT Trashcan Reference Design. In that setup, we have a single device view so that we can visualize the depth of data coming off of the device:

Fig 1. “Device View” Dashboard

In the fleet view we take that individual device data and stretch it horizontally across the page for each device. From there, we replicate each of the chosen boxes to get the same view for every device in the fleet; in this case, six devices. The resulting dashboard looks like this:

Fig 2. “Fleet View” Dashboard

The ability to replicate the setup across all the devices in a fleet is built upon two elements of Grafana: creating a variable and using the ‘repeat’ option for each visualization element.

Creating a variable

We have posted about creating Grafana dashboards in the past. The first thing you need to do is set up a “data source” to configure Grafana to talk to the Golioth REST API. This is how you will extract the data that your devices are sending back to the Golioth Cloud. Each device has a unique ID that Golioth uses to coordinate data moving through the Golioth platform.

When we want to visualize more than one device in the system, we need to make it possible for Grafana to dynamically switch between the Golioth Unique Identifiers for all the devices in the project. The first step is to create a variable in Grafana that represents all of the devices.

Click on the gear icon to access the “Dashboard Settings Menu” on the upper right of your dashboard (hit ‘esc’ if you don’t see anything on the upper right; your menu may be hidden).

This will take you to the following screen. Click on Point A to get into the variables menu and then click ‘+New Variable’, or click on the variable you want to modify at Point B.

Fig 3. Dashboard Settings Menu

In the variable creation/editing menu, you can set a name (Point 2 in the image below). You will need to select your JSON API data source that connects you to the Golioth REST API (Point 3).

Under the ‘Path’ menu, you will need to do a GET request on the /devices/ endpoint (not shown). Then replicate the variables shown (Point 4) to pull back an array of all device IDs in your project ($..id / value) and the associated device names ($..name / text).
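
For reference, the response from the /devices/ endpoint has roughly this shape (a simplified sketch based on the JSONPath expressions above; the IDs and names are illustrative):

{
  "list": [
    { "id": "636d7df208be30b7e6d865d6", "name": "Trashcan A5" },
    { "id": "636d7df208be30b7e6d865d7", "name": "Trashcan B2" }
  ]
}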

Finally you will want to toggle Multi-Select and Include All Option (Point 5). This will give your dashboard a pulldown menu, as is showcased in the video above.

Fig 4. Variable creation menu

Finally, open the Experimental tab (Point 6) and set the “Variable text” and “Variable value” fields to the newly created variables (Point 7).

Fig 5. Variable Experimental Tab

You now have a variable you can work with in your project.

Creating a repeating element on your Fleet Dashboard

Each individual box shown in Fig. 2 is a Grafana “panel”, which means you can connect it to a data source and do some kind of visualization with it. Normally this is a chart, a dial, a displayed value, or an icon/graphic you assign to a state.

When you edit one of these panels, you need to change two things to enable “fleet view”. The first is to change the target on the Path tab (Point C) to point to the /devices/${device}/stream endpoint on the REST API. This will automatically change which set of device data we’re looking at when we change the pulldown in the upper left corner. As shown, it is looking up the Golioth Device ID for “Trashcan A5” and inserting that into the path in place of ${device}. See below for the code I have in the “Body” field that you can use as a model when doing a POST request to the REST API.

Next we want to modify the “Repeat options” on the right sidebar (Point D). I set the panel to repeat based on the device variable, so one copy appears for each device selected in the pulldown. I also set the “Repeat direction” to vertical so the repeated panels stack on top of one another, producing the view in Fig 2. See this in action in the associated YouTube video.

Fig 6. Edit Panel View

{
    "start": "${__from:date}",
    "end": "${__to:date}",
    "perPage": 999,
    "query": {
        "fields": [
            { "path": "time" },
            { "path": "deviceId" },
            { "path": "*" }
        ],
        "timeBucket": "5s"
    }
}
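
Outside of Grafana, you can exercise the same query with a few lines of Python. This is a sketch: substitute a real device ID and API key, note that ${__from:date} and ${__to:date} are Grafana variables that must be replaced with actual timestamps, and be aware that the exact base path may differ depending on how your data source is configured:

import requests

API_KEY = "your-api-key-here"              # placeholder
DEVICE_ID = "636d7df208be30b7e6d865d6"     # placeholder device ID

# Assumption: same API root as the earlier examples, plus the path from the Grafana Path tab
url = "https://api.golioth.io/v1" + "/devices/{}/stream".format(DEVICE_ID)
body = {
    "start": "2023-01-01T00:00:00Z",       # Grafana fills these from the time picker
    "end": "2023-01-31T00:00:00Z",
    "perPage": 999,
    "query": {
        "fields": [
            { "path": "time" },
            { "path": "deviceId" },
            { "path": "*" }
        ],
        "timeBucket": "5s"
    }
}

print(requests.post(url, json=body, headers={'x-api-key': API_KEY}).json())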

Build your next fleet with Golioth

In addition to the flexibility that a Grafana dashboard can provide, the Golioth Console acts as a more text-focused fleet visualization and control plane for your devices out in the field. We think you can understand a good amount about your devices simply from glancing at your project Console and seeing the state of each device.

We always want to hear about how we can improve the experience for our users. Please try out the Console with our free Dev tier and then stop by our Forum to discuss your fleet deployment!