Firmware versioning is a crucial part of ensuring the binary you’re about to load onto your device is the correct arrangement of bits to control the hardware the way you want. It is fundamental to a system like Golioth’s Over-The-Air update service. Today we’re going to look at the different ways you can tell which version you’re running in Zephyr, and how to make sure your firmware update is correct the first time and every time. We’ll use Golioth Reference Designs as an approximation of what a product team would do for the devices in their fleet.

How we generate firmware versions

Around these parts, we adhere to Semantic Versioning, or SemVer. It’s a useful way to label incremental changes to a firmware image. It also matches how Zephyr increments versions, and how the Golioth SDK is versioned. So…it’s a good idea for any binaries we create for our Reference Designs to also have a SemVer attached. As you can see in the video, we are also able to detect versions and show interesting elements when the images adhere to SemVer. We expect our users to follow a similar format (especially if using MCUboot), though Golioth is flexible across different versioning methods.
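
As a quick refresher, a SemVer is three dot-separated numbers, MAJOR.MINOR.PATCH:

1.2.3
│ │ └── PATCH: backwards-compatible bug fixes
│ └──── MINOR: backwards-compatible feature additions
└────── MAJOR: breaking changes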

For Reference Designs, the firmware version is reported through the bootloader: MCUboot, in the case of our Zephyr-based designs. That means when MCUboot loads the image, it reads the version embedded in that image.

Previously, we needed to do this manually. At some point in the generation of a firmware binary, the firmware engineer needs to decide that this is the version that will be released as 1.2.3 (or whatever number is next in your internal process). That’s where things can get tricky, because humans are often in the loop. Perhaps we have a commit in Git like 1a2b3c4d and we want to attach the version 1.2.3 to it. Well, we would do that on the command line… a place where I notoriously mess things up! See the Historical Footnote at the end of this post for more.

We changed how we do things in Reference Designs starting in v2.4.0 of the RD Template. We now use a controlled VERSION file, following the guidance of the Zephyr Application Version Management system. This has been working out great. We store the file in the repository and it gets incremented (and tracked!) like any other file.
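
For reference, here’s a sketch of what that VERSION file looks like for a 2.4.0 release (the field names come from Zephyr’s Application Version Management; the values shown are illustrative):

VERSION_MAJOR = 2
VERSION_MINOR = 4
PATCHLEVEL = 0
VERSION_TWEAK = 0
EXTRAVERSION =

Because the file lives in the repository, version bumps are reviewed and tracked in the same commit as the code they describe.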

Where can I see it?

OK, so we have the version number embedded in the binary, but how do we actually see it? We can do this in a few different places: the log output, the MCUboot shell, and the Golioth console.

Application load and boot

When built on top of nRF Connect SDK and Zephyr, you can see logging output for the version in our main application code. This application also shows the firmware version of the modem and the attached Ostentus display firmware (when present).

The MCUboot shell

This shell is not enabled by default on the Reference Design Template, so I turned it on by adding CONFIG_MCUBOOT_SHELL=y to my prj.conf file. This gives me a new shell menu that shows my image regions. Super useful when trying to troubleshoot OTA issues.
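
The change is a single Kconfig line (assuming the Zephyr shell is already enabled in your build):

# prj.conf -- enable the MCUboot image management shell
CONFIG_MCUBOOT_SHELL=y

From the device’s shell prompt, running the mcuboot command should then print per-slot image information, including the embedded version; the exact output depends on your Zephyr version.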

On Golioth’s Devices -> Firmware Tab

The OTA code in the Golioth SDK reports back the current firmware version and the current OTA state, so you can see where you are in the midst of an update. This tab is a great place to see the last reported version. The same information is available on the Summary tab of the same page.

On Golioth’s Firmware Updates -> Cohorts Tab

This tab shows the history of packages, which will include the one currently running.

On Golioth’s Firmware Updates -> Packages Tab

This is where you upload new images to be included in future deployments, and it’s also where Golioth extracts the MCUboot version and uses it as another check against existing images on the server. We also point out when you’re trying to upload an image matching one you have uploaded before.

Keeping your deployments organized

Attaching a SemVer to your binary won’t solve all your problems, but it will help you organize the range of firmware images you’re sending out to your devices. Golioth helps to check your versions on the cloud and seamlessly deliver the package to eligible devices. If you have questions about how to build or deploy your next firmware image, let us know in the forum!

 

Historical Footnote

When we used to have to call out the image version on the command line, it was a mess. I recall a scenario where I made a change to my firmware and re-used the last build command, which also happened to have the old version number attached. That happened because we weren’t actually using a controlled document to set the version. Sure, I might have a tag in GitHub that says 1.2.3, but there’s no mandate for the firmware to adhere to that, so I needed to set it manually. Note the -- in the command below, which ends the arguments to west build; the -DCONFIG_MCUBOOT_IMAGE_VERSION argument that follows is what actually sends the version to CMake to include in the build.

west build -p -b nrf9160dk_nrf9160_ns samples/dfu -- -DCONFIG_MCUBOOT_IMAGE_VERSION=\"1.2.3\"

Note: the above code will not work for modern versions of Zephyr
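
For comparison, with the VERSION file approach described above there is no version flag to forget; a plain build picks the version up from the repository (board and app names here are just placeholders):

west build -p -b <your_board> <path/to/app>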

[Image: Cohorts announcement banner with a screenshot showing three cohorts: dev, qa, and production]

Today, we’re excited to announce the launch of Cohorts, a significant enhancement to Golioth’s Over-the-Air (OTA) update system. This new feature takes device management to the next level, offering more control, better organization, and a safer experience when deploying updates to fleets of IoT devices at scale.

When we first introduced our OTA system, it allowed developers to release firmware updates to devices in the field using tags and blueprints. While this system offered flexibility and worked well for many, we’ve heard from users that, at scale, managing implicit groupings and targeting could become complex. This sometimes required extra attention to ensure updates were applied to the correct devices with the right version.

Cohorts addresses these issues by introducing explicit groupings, allowing you to organize devices into defined cohorts for predictable, traceable OTA updates.

[Walkthrough: deploying firmware to a cohort of devices]

What’s New with Cohorts?

  1. Explicit Device Grouping: Devices are added to cohorts based on tags or blueprints, which were previously used for targeting updates. Now, these tags and blueprints are used to form static cohorts, giving you clear, organized control over your OTA deployments.
  2. Safer User Experience: The new console UX reduces human error by showing exactly how many devices are being updated, what actions will occur, and what has happened. This added context ensures users know what will be affected, making updates more intuitive and less error-prone.

For full details on how Cohorts work, visit our documentation.

A New Way to Organize Artifacts: Packages

Alongside Cohorts, we’re introducing Packages to manage the files that make up your OTA updates. A package represents a single upgradeable component on your device, such as firmware, AI models, or other assets. Each package has multiple versions, with each version corresponding to an artifact—a binary file uploaded to Golioth.
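
As a hypothetical example of that hierarchy (the package, version, and file names below are made up for illustration), a firmware package might look like this:

main (package)
├── 1.0.0 → artifact: app-v1.0.0.bin
├── 1.0.1 → artifact: app-v1.0.1.bin
└── 1.1.0 → artifact: app-v1.1.0.bin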

All existing artifacts from the old system now appear within their respective packages, streamlining organization and management.

For more information, see the Managing Packages section in our documentation.

Free for Individual Developers, Upgraded Features for Teams

At Golioth, we’re committed to supporting individual developers by offering OTA updates for free. With the launch of Cohorts, developers on the Free Tier can create up to 3 cohorts at no cost—perfect for small-scale groupings like “Dev,” “QA,” and “Production” cohorts.

For teams and organizations, the Teams Tier includes up to 10 cohorts for $299 per month, giving more flexibility and control over device groupings and deployments. New and existing Enterprise customers can access unlimited cohorts, ensuring large-scale deployments can be managed with ease.

Transitioning from the Previous OTA System

Devices in the field using Golioth for OTA will continue to receive updates, as expected, with no disruption. To migrate to Cohorts, devices must be explicitly added to a Cohort. We recommend starting by moving a test device or two into a Cohort to ensure a smooth transition.

Once you’re comfortable with the new workflow, you can transition your entire fleet. For more details, visit our migration guide.

Learn More

Cohorts is available now for all Golioth users. To get started or learn more, visit our documentation or explore pricing at golioth.io/pricing.

We’re excited to see how you’ll use Cohorts to keep your devices up to date and secure. As always, if you have any questions or feedback, feel free to reach out to us at [email protected]. Stay tuned for more updates!

[Image: Golioth for AI cover]

Today, we are thrilled to announce the launch of Golioth for AI, a comprehensive set of features designed to simplify and enhance the integration of AI into IoT products.

At Golioth, we envision a future where AI and IoT converge to create smarter, more efficient systems that can learn, adapt, and improve over time. The fusion of AI and IoT has the potential to unlock unprecedented levels of innovation and automation across various industries. However, integrating AI into IoT devices can be complex and challenging, requiring robust solutions for managing models, training data, and performing inference.

Today, at Golioth, we are addressing these challenges head-on. Our new set of features focuses on three core pillars: training data, model management, and inference. By streamlining these processes, we aim to empower developers and businesses to quickly add AI to their IoT projects, where it was not readily possible to do so before.

Training Data: Unlocking the Potential of IoT Data

At Golioth, we recognize that IoT devices generate rich, valuable data that can be used to train innovative AI models. However, this data is often inaccessible, in the wrong format, or difficult to stream to the cloud. We’re committed to helping teams extract this data and route it to the right destinations for training AI models that solve important physical world problems.

We’ve been building up to this with our launch of Pipelines, and new destinations and transformers have been added every week since. Learn more about Pipelines in our earlier announcement.

In v0.14.0 of our Firmware SDK, we added support for block-wise uploads. This new capability allows for streaming larger payloads, such as high-resolution images and audio, to the cloud. This unlocks incredible potential for new AI-enabled IoT applications, from connected cameras streaming images for security and quality inspection to audio-based applications for voice recognition and preventative maintenance for industrial machines.

For an example of uploading images or audio files to Golioth for training, see:

We’ve recently added three new object storage destinations for Pipelines:

These storage solutions are perfect for handling the rich media data essential for training AI models, ensuring your training set is always up to date with live data from in-field devices.

Partnership with Edge Impulse

Today, we’re also excited to announce our official partnership with Edge Impulse, a leading platform for developing and optimizing AI for the edge. This partnership allows streaming of IoT data from Golioth to Edge Impulse for advanced model training, fine-tuned for microcontroller class devices. Using Edge Impulse’s AWS S3 Data Acquisition, you can easily integrate with Golioth’s AWS S3 Pipelines destination by sharing a bucket for training data:

filter:
  path: "*"
  content_type: application/octet-stream
steps:
  - name: step-0
    destination:
      type: aws-s3
      version: v1
      parameters:
        name: my-bucket
        access_key: $AWS_ACCESS_KEY
        access_secret: $AWS_ACCESS_SECRET
        region: us-east-1

This streamlined approach enables you to train cutting-edge AI models efficiently, leveraging the power of both the Golioth and Edge Impulse platforms. For a full demonstration of using Golioth with Edge Impulse see: https://github.com/edgeimpulse/example-golioth

Model Management: Flexible OTA Updates

Deploying AI models to devices is crucial for keeping them updated with the latest capabilities and adapting to new use cases and edge cases as they arise. However, deploying a full OTA firmware update every time you need to update your AI model is inefficient and costly.

To address this, Golioth’s Over-the-Air (OTA) update system has been enhanced to support a broader range of artifact types, including AI models and media files.

[Screenshot: Golioth Console displaying OTA AI model releases]

Our updated OTA capabilities ensure that your AI models can be deployed and updated independently of firmware updates, making the process more efficient and streamlined. This allows models to be updated without having to perform a complete firmware update, saving bandwidth, reducing battery consumption, and minimizing downtime while ensuring your devices always have the latest AI capabilities.

We’ve put together an example demonstrating deploying a TensorFlow Lite model here: Deploying TensorFlow Lite Model.

Flox Robotics is already leveraging our model management capabilities to deploy AI models that detect wildlife and ensure their safety. Their AI deterrent systems prevent wildlife from entering dangerous areas, significantly reducing harm and preserving ecosystems. Read the case study.

Inference: On-Device and in the Cloud

Inference is the core of AI applications, and with Golioth, you can now perform inference both on devices and in the cloud. On-device inference is often preferred for applications like real-time monitoring, autonomous systems, and scenarios where immediate decision-making is critical due to its lower latency, reduced bandwidth usage, and ability to operate offline.

However, sometimes inference in the cloud is ideal or necessary for tasks requiring significant processing power, such as high-resolution image analysis, complex pattern recognition, and large-scale data aggregation, leveraging more powerful computational resources and larger models.

Golioth Pipelines now supports inference transformers and destinations, integrating with leading AI platforms including Replicate, Hugging Face, OpenAI, and Anthropic. Using our new webhook transformer, you can leverage these platforms to perform inference within your pipelines. The results of the inference are then returned back to your pipeline to be routed to any destination. Learn more about our new webhook transformer.

Here is an example of how you can configure a pipeline to send audio samples captured on a device to the Hugging Face Serverless Inference API, leveraging a fine-tuned HuBERT model for emotion recognition. The inference results are forwarded as timeseries data to LightDB Stream.

filter:
  path: "/audio"
steps:
  - name: emotion-recognition
    transformer:
      type: webhook
      version: v1
      parameters:
        url: https://api-inference.huggingface.co/models/superb/hubert-large-superb-er
        headers:
          Authorization: $HUGGING_FACE_TOKEN
  - name: embed
    transformer:
      type: embed-in-json
      version: v1
      parameters:
        key: text
  - name: send-lightdb-stream
    destination:
      type: lightdb-stream
      version: v1

Golioth is continually releasing new examples to highlight the applications of AI on device and in the cloud. Here’s an example of uploading an image and configuring a pipeline to describe the image with OpenAI and send the transcription result to Slack:

filter:
  path: "/image"
steps:
  - name: jpeg
    transformer:
      type: change-content-type
      version: v1
      parameters:
        content_type: image/jpeg
  - name: url
    transformer:
      type: data-url
      version: v1
  - name: embed
    transformer:
      type: embed-in-json
      version: v1
      parameters:
        key: image
  - name: create-payload
    transformer:
      type: json-patch
      version: v1
      parameters:
        patch: |
          [
            {
              "op": "add",
              "path": "/model",
              "value": "gpt-4o-mini"
            },
            {
              "op": "add",
              "path": "/messages",
              "value": [
                {
                  "role": "user",
                  "content": [
                    {
                      "type": "text",
                      "text": "What's in this image?"
                    },
                    {
                      "type": "image_url",
                      "image_url": {
                        "url": ""
                      }
                    }
                  ]
                }
              ]
            },
            {
              "op": "move",
              "from": "/image",
              "path": "/messages/0/content/1/image_url/url"
            },
            {
              "op": "remove",
              "path": "/image"
            }
          ]
  - name: explain
    transformer:
      type: webhook
      version: v1
      parameters:
        url: https://api.openai.com/v1/chat/completions
        headers:
          Authorization: $OPENAI_TOKEN
  - name: parse-payload
    transformer:
      type: json-patch
      version: v1
      parameters:
        patch: |
          [
            {"op": "add", "path": "/text", "value": ""},
            {"op": "move", "from": "/choices/0/message/content", "path": "/text"}
          ]  
  - name: send-webhook
    destination:
      type: webhook
      version: v1
      parameters:
        url: $SLACK_WEBHOOK

For a full list of inference examples see:

Golioth for AI marks a major step forward in integrating AI with IoT. This powerful collection of features is the culmination of our relentless innovation in device management and data routing, now unlocking advanced AI capabilities like never before. Whether you’re an AI expert or just starting your AI journey, our platform provides the infrastructure to seamlessly train, deploy, and manage AI models with IoT data.

We’ve assembled a set of exciting examples to showcase how these features work together, making it easier than ever to achieve advanced AI integration with IoT. We can’t wait to see the AI innovations you’ll create using Golioth.

For detailed information and to get started, visit our documentation and explore our examples on GitHub.

Thank you for joining us on this exciting journey. Stay tuned for more updates and build on!

Today, we are thrilled to announce the launch of Pipelines, a powerful new set of features that redefines how you manage and route your IoT data on Golioth. Pipelines is the successor to our now-deprecated Output Streams and represents a significant upgrade in our functionality, scalability and user control.

Two years ago, we introduced Output Streams to seamlessly connect IoT data to various cloud services like AWS SQS, Azure Event Hubs, and GCP PubSub. This enabled Golioth users to efficiently stream sensor data for real-time processing, analytics, and storage, integrating easily with their existing cloud infrastructure.

Since then, we’ve gathered extensive feedback, and now we’re excited to introduce a more versatile solution for data routing: Pipelines. Previously, all stream data had to flow into LightDB Stream and conform to its JSON formatting requirements, which was restrictive for those with regulatory and data residency needs. With Pipelines, you can direct your data to LightDB Stream, your own database, or any other destination, in any format you choose.

Pipelines also introduces filtering and transformation features that simplify or even eliminate backend services through low-code configuration. Configurations are stored and managed as simple YAML files, making them easily deployable across multiple projects without manual recreation. This approach allows you to version your data routing configurations alongside your application code.

[Screenshot: example pipeline in the Golioth Console]

Internally, the Pipelines architecture is designed to support our future growth, enabling us to scale our data routing capabilities to billions of devices. This robust foundation allows us to iterate and add new features rapidly, ensuring that our users always have access to the most powerful and flexible data management tools available.

All Golioth users can start taking advantage of Pipelines today. Projects that were previously only using LightDB Stream and did not have any Output Streams configured have been automatically migrated to Pipelines. Users in those projects will see two pipelines present, which together replicate the previous behavior of streaming to LightDB Stream. These pipelines can be modified or deleted, and new pipelines can be added to support additional data routing use-cases.

Projects with Output Streams configured will continue using the legacy system, but can be seamlessly migrated to Pipelines with no interruptions to data streaming. To do so, users in those projects must opt-in to migration.

New projects created on Golioth will have a minimal default pipeline created that transforms CBOR data to JSON and delivers it to LightDB Stream. This pipeline is compatible with Golioth firmware examples and training, but may be modified or removed by a user if alternative routing behavior is desired.
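
Based on that description, and using the same building blocks shown later in this post, such a default pipeline would look roughly like the following sketch (the exact default in your project may differ):

filter:
  path: "*"
  content_type: application/cbor
steps:
  - name: step-0
    transformer:
      type: cbor-to-json
      version: v1
    destination:
      type: lightdb-stream
      version: v1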

Pipelines are especially advantageous for users with specific data compliance requirements and those transmitting sensitive information, such as medical device products. Removing the requirement of routing data through LightDB Stream, where it is persisted on the Golioth platform, provides two main benefits:

  1. Regulatory Compliance: Users can route data to their own compliant storage solutions, making it suitable for many sensitive applications that require data not be persisted on other third-party platforms.
  2. Cost Savings: For users who do not need data persistence, routing data directly to other destinations can avoid the costs associated with streaming data to LightDB Stream. This flexibility allows for more efficient and cost-effective data management.

Getting Started with Pipelines

Alongside the launch of Pipelines, we have also released a new version of the Golioth Firmware SDK, v0.13.0, which introduces new functionality to support streaming arbitrary binary data to destinations that support it. Previously, only CBOR and JSON data could be streamed to Golioth, as everything flowed through LightDB Stream, which only accepts JSON data. Now, rather than streaming data to LightDB Stream, data is sent to the Golioth platform and routed to its ultimate destination via the pipelines configured in a project. Devices using previous versions of the Golioth Firmware SDK will continue working as expected.

Pipelines can be configured in the Golioth Console using YAML, which defines filters and steps within your pipeline. Here’s an example:

filter:
  path: "*"
  content_type: application/cbor
steps:
  - name: step-0
    destination:
      type: gcp-pubsub
      version: v1
      parameters:
        topic: projects/my-project/topics/my-topic
        service_account: $GCP_SERVICE_ACCOUNT
  - name: step-1
    transformer:
      type: cbor-to-json
      version: v1
  - name: step-2
    transformer:
      type: inject-path
      version: v1
    destination:
      type: lightdb-stream
      version: v1
  - name: step-3
    destination:
      type: influxdb
      version: v1
      parameters:
        url: https://us-east-1-1.aws.cloud2.influxdata.com
        token: $INFLUXDB_TOKEN
        bucket: device_data
        measurement: sensor_data

This pipeline accepts CBOR data and delivers it to GCP PubSub before transforming it to JSON and delivering it to both LightDB Stream (with the path injected) and InfluxDB. This is accomplished via three core components of Pipelines.

Filters

Filters route all or a subset of data to a pipeline. Currently, data may be filtered based on path and content_type. If either is not supplied, data with any value for the attribute will be matched. In this example, CBOR data sent on any path will be matched to the pipeline.

filter:
  path: "*"
  content_type: application/cbor

Transformers

Transformers modify the structure of a data message payload as it passes through a pipeline. A single transformer may be specified per step, but multiple steps can be chained to perform a sequence of transformations. This transformer will convert data from CBOR to JSON, then pass it along to the next step.

- name: step-1
  transformer:
    type: cbor-to-json

Destinations

Destinations define where the transformed data should be sent. Each step in a pipeline can have its own destination, allowing for complex routing configurations. When a step includes a transformer and a destination, the transformed data is only delivered to the destination in that step. This destination sends JSON data to LightDB Stream after nesting the object using the message path. The next step receives the data as it was prior to the path injection.

- name: step-2
  transformer:
    type: inject-path
  destination:
    type: lightdb-stream
    version: v1

The full list of Output Stream destinations is now available as Pipelines destinations (with more to come):

  • Azure Event Hub
  • AWS SQS
  • Datacake
  • GCP PubSub
  • InfluxDB
  • MongoDB Time Series
  • Ubidots
  • Webhooks

For detailed documentation, visit our Pipelines Documentation.

Updated Pricing Model

We’re keeping the same usage-based pricing as Output Streams while also introducing volume discounts. We want to emphasize transparent pricing optimized for MCUs, so we are revising the pricing structure for Pipelines to accommodate a wider range of data usage patterns. The goal is affordability and predictable billing for both low and high bandwidth use cases, while ensuring customers with large fleets of devices can enjoy the discounts that come with scale.

Data routed to External Pipelines Destinations

  Data Volume (per Month)    Per MB Price ($)
  0 – 1 GB                   $0.40
  1 – 10 GB                  $0.34
  10 – 50 GB                 $0.28
  50 – 150 GB                $0.22
  150 – 300 GB               $0.16
  300 – 600 GB               $0.10
  600 GB – 1 TB              $0.04
  1 TB+                      $0.01

Data routed to LightDB Stream

  Data Volume (per Month)    Per MB Price ($)
  0 – 1 TB+                  $0.001

The first 3 MB of usage for Pipelines is free, allowing users who are prototyping to do so without needing to provide a credit card. This includes data routed to LightDB Stream through Pipelines.

For full details, visit Golioth Pricing.


Pipelines marks a significant step forward in Golioth’s IoT data routing capability, offering a new level of flexibility and control. We’re excited to see how you’ll use Pipelines to enhance your IoT projects. For more details and to get started, visit our Pipelines Documentation.

With our new infrastructure, we can rapidly add new destinations and transformations, so please let us know any you might use. Additionally, we’d love to hear about any logic you’re currently performing in backend services that we can help you streamline or even delete. If you have any questions or need assistance, don’t hesitate to reach out. Contact us at [email protected] or post in our community forum. We’re here to help!

Migration from Output Streams to Pipelines

As previously mentioned, projects currently using Output Streams will continue to leverage the legacy infrastructure until users opt-in to migration to Pipelines. We encourage users to try out Pipelines in a new project and opt-in existing projects when ready. Output Streams does not currently have an end of life date but we will be announcing one soon.

Stay tuned for more updates and happy streaming!

Over-the-Air (OTA) firmware updates are table stakes for Internet of Things (IoT) devices. Once a device is in the field, OTA means you can make changes by updating the firmware remotely.

The ability to update devices remotely is great, but that alone is the bare minimum of functionality. Options for targeting specific devices for each firmware update are crucial to building a fleet that scales.

For example, at Golioth we use a small subset of our deployed fleet to test releases before rolling them out to all devices. In some cases we have different hardware in the same fleet that each need different binaries. And when adding new features we conditionally roll out release candidates to certain team members for testing.

All of these features are built into Golioth and ready for you to use. Recently one of our customers opened a forum thread asking how firmware version, device blueprints, and device tags work together to determine the OTA each device receives. It’s a great question that we’ll dive into today!

What are OTA Firmware Updates?

Over-the-Air (OTA) firmware updates are a method of using a network connection to send a device a new firmware version that it then validates and runs.

At Golioth, we’ve used multiple types of network connections to accomplish this, including cellular, WiFi, Ethernet, and Thread. Our Golioth Firmware SDK demonstrates the feature, using MCUboot to store two copies of firmware (the currently running version, and the newly received update). Each image is cryptographically signed so it can be verified for authenticity and integrity before the device uses the new image.

Golioth applies firmware updates on the device side using the semantic version number. The Golioth servers make these update versions available based on their sequence number. Let’s unpack the differences.

Demystifying Semantic Versions and Sequence Numbers

Remember this rule of thumb: devices will always match the semantic version of a firmware release, while the Golioth servers will always advertise the most recent sequence number.

Devices match semantic version

Devices have no awareness of “newer” or “older” firmware releases. They merely check whether the version of firmware currently running is an exact match for the semantic version being advertised by the server. If they do not match, the device will download the binary and update itself to the version available from the server. This might be a newer semantic version or an older one; the only thing that matters is that the device sees a different version is available.
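
In other words, the device-side check boils down to something like the following sketch (illustrative only, not the actual Golioth SDK code):

#include <stdbool.h>
#include <string.h>

/* Illustrative sketch, not the Golioth SDK implementation: any mismatch means
 * "update available", whether the advertised version is newer or older than
 * the one currently running. */
static bool ota_update_needed(const char *running_version, const char *advertised_version)
{
    return strcmp(running_version, advertised_version) != 0;
}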

[Screenshot: Golioth OTA device serial output]

Device serial output indicates version 1.0.100 is currently running. The manifest received from Golioth shows main-1.0.99 as the latest version, so the device begins downloading. We call this a roll back; the device is only aware that the versions don’t match, and the server is the source of truth.

This is crucial in delivering the ability to “roll back” a firmware update.  The Golioth web console has a rollout button that facilitates automatic roll back when you unselect the most recently uploaded firmware.

The server advertises the most recent sequence number

The Golioth server separates OTA into two distinct parts: the artifact and the release. The artifact is the binary itself which has a package name (the default name is main) and a semantic version number. The release is created using an existing artifact, adding a time-based sequence number (new releases have higher sequence numbers) and controlling whether or not the release is rolled out to devices.

[Screenshot: Golioth OTA Artifacts list]

Each of the artifacts is a unique binary with its own package name and semantic version. This fleet uses all the same hardware, so no Blueprint has been assigned to these artifacts.

The Golioth servers will check to ensure artifacts have unique name & version combos — you can only upload main - 1.2.3 once. The next artifact will need a different version number or package name. (The only way around this is to delete the existing artifact so you may reuse the package/version number.)

Assigning a blueprint to an Artifact makes it unique. The rules from the previous paragraph still apply, but artifacts with a blueprint will only be compared to other artifacts with the same blueprint.

Finally, the file hash uploaded with each artifact must be unique from all other artifacts. The Golioth web console will check the file, and issue an error if the same binary is uploaded more than once. This is a safety feature to help ensure that the wrong artifact isn’t uploaded by mistake.

[Screenshot: Golioth OTA Releases list]

Each release requires one artifact to be assigned. Notice that the main-1.0.1 artifact has been used multiple times, targeting devices with different tags assigned. The releases are made available to devices by enabling the Rollout toggle.

Golioth releases require an artifact, but the semantic version number will not be used to decide which release is advertised to devices. Instead, the release with the newest sequence number (meaning the most recently created release) will be advertised to devices.

There are many conditions that determine which release is advertised

It is important to understand that there are several ways to target devices with a release. The most obvious is the Rollout setting—a release will only be advertised if the rollout is enabled. Package names, device blueprints, and device tags are also used to determine which releases will be advertised to any given device in your fleet.

Applying Blueprints to Firmware Updates

If all devices in your fleet use the exact same hardware, they may all be able to run the same firmware. But in many cases, a fleet will have more than one hardware variant and need more than one compiled version of the same firmware release. For instance, if you have some devices that use an nRF9160 (cellular) and others that use an NXP i.MX RT1024 (Ethernet) you must run different firmware compiled specifically for those two distinct devices.

Golioth uses device blueprints to account for this issue. When you create devices on the Golioth cloud, you can choose one device blueprint to assign to the device. The same blueprint may be selected when uploading an artifact or creating a release.

When a device has a blueprint assigned, it will receive notification of releases with the newest sequence number and the matching blueprint.

Using Tags to Target OTA

Device tags can be assigned in a similar way to device blueprints but you may assign multiple tags (blueprints are limited to a single assignment per device/artifact/release).

Device tags must match exactly to receive a release notification. This means if you have a device with two tags and a release with only one tag, the device will not be notified. That said, if you roll out a release with zero tags selected, all devices in your fleet will be notified of the release no matter what tags are assigned to those devices.
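
To make that rule concrete, consider a device with the (hypothetical) tags release-candidate and cellular:

Release tagged release-candidate only        → not advertised (the release’s tags don’t exactly match the device’s tags)
Release tagged release-candidate + cellular  → advertised
Release with no tags at all                  → advertised to every device in the fleet, regardless of tags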

[Screenshot: Golioth device summary page]

This summary page for a single device in our fleet shows that this is a member of the “release-candidate” tag group. The device reports which version of firmware it is currently running, as shown in this view.

Multiple releases may be created using the same artifact. As an example, at Golioth we will roll out a release that targets the release candidate tag, so that devices that we have previously identified for testing new features will receive the update but the larger fleet will not. When testing is complete and the firmware artifact is determined to behave as we expected, a second release using the same artifact will be made without any tags, prompting the rest of the fleet to download and apply the OTA update.

How Does Golioth OTA Fit Your Needs?

Wow, this is a lot. Thanks for sticking with me through this post. I hope you will agree that there is quite a bit more to OTA than just providing an update file.

We’d love to hear your feedback. We’re especially interested to know your thoughts on how our system approaches tags. The last thing a fleet manager wants is a “surprise” update to a device they weren’t expecting. This is why we’ve gone to great lengths to implement granular control, and a multitude of options. But we’re always keen on using customer feedback to improve, so let us know on the Golioth Forum or hit us up at the DevRel email address.

Until next time, happy OTA updating!

This is an excerpt from our bi-monthly newsletter, which covers our recent news and happenings around the IoT ecosystem. You can sign up for future newsletters here.

When building out the Golioth platform, we’re constantly examining the user experience. We ask ourselves why it would make sense to use Golioth over alternative solutions, whether in-house or off the shelf. While we want as many engineers as possible using the Golioth platform, we prioritize the real goal of making it easier to build secure, efficient, and useful products. We want our users to use the best tool for the job. Let’s dig a little deeper into how we do that.

When comparing Golioth to an in-house solution, we ask our customers a simple question: Is building your own IoT platform helping you differentiate your product? While some of you may answer “yes,” the majority of folks find that focusing on the IoT infrastructure is a distraction. For them, using Golioth helps drive down costs while increasing organizational efficiency. Those are tangible metrics for us to target; they are manifested in our pricing, our seamless device on-boarding, and straightforward user management.

When we compare Golioth to off-the-shelf solutions, our outlook is somewhat unique. Rather than trying to be the last software product you will ever need, we look to be the best platform for managing devices that connect to the internet. To do that, we build differentiated device management services, such as OTA updates, instant device settings, and real-time logging—to name a few—and we heavily optimize network throughput and efficiency. For simpler IoT products, we also provide application services such as LightDB State and LightDB Stream so that you can move beyond device management to basic data storage.

At Golioth, we care about moving the industry forward without forcing our users to compromise. The IoT product landscape is complex and heterogeneous. It would be naive of us to think that Golioth would be suitable for every aspect of any one product. That’s why we curate a large ecosystem of partnerships to enable you beyond the realm of device management. Our latest partnership announcement will enable devices to talk to Memfault over a single, secure connection. We aim for Golioth to integrate with best-of-class cloud platforms and enable their usage without complicating the firmware running on-device.

Crucially, the best tool for the job may look different today than it does tomorrow, in 6 months, and in 2 years. The promise of Golioth is that as long as your devices are sending data using our firmware SDK, you will have the flexibility to change where that data is going, how it is being processed, and how it ultimately is presented to the end-user, all without changing a line of firmware code. Golioth Output Streams currently enables this, but over the next few months, we will announce an even more robust set of features in this area, starting with our Memfault integration, which you can sign up to be notified about here.

At Golioth, we pride ourselves on doing both the deep technical work—as well as the routine maintenance—required to ensure we back up our claim of being the most efficient way that devices in the field can securely talk to services in the cloud.

In service of this mission, last week we turned on DTLS 1.2 Connection ID support in Golioth Cloud. While most users will not see any visible differences in how their devices connect to Golioth, behind the scenes many devices are now enabled to perform more efficiently. This is especially true for those connecting via cellular networks.

Devices using Connection ID can see a meaningful impact on bandwidth costs, power usage, and battery life. This not only benefits users of Golioth, but also means that more devices are able to operate for longer, reducing waste and energy use. Because we believe deeply in this potential widespread impact, we have been and will continue to do all work in this area in the open.

What are Connection IDs?

We have written previously about how we use CoAP over DTLS / UDP to establish secure connections between Golioth and devices. Because UDP is a connection-less protocol, attributes of each datagram that arrive at a local endpoint must be used to identify the remote endpoint on the other side.

In short, DTLS uses the IP address and port of the remote endpoint to identify connections. This is a reasonable mechanism, but some devices change IP address and port frequently, resulting in the need to constantly perform handshakes. In some scenarios, a full handshake may be required just to send a single packet. Performing these handshakes can have a negative impact on battery life, drive up bandwidth costs, and increase communication latency. For some devices, this cost makes the end product at best more expensive, and at worst, infeasible.

Connection IDs provide an alternative solution, allowing for clients and servers to negotiate an identifier that can be used in lieu of the IP address and port. Once negotiated, the endpoint(s) that are sending a Connection ID — typically just the client but sometimes both the client and the server — are still able to have their DTLS connection state associated with incoming records when their IP address and port changes. The result is that a device could travel from one side of the world to the other, continuing to communicate with the same server, while only performing the single initial handshake.

How was Connection ID support implemented?

Efficient communication between devices and the cloud requires compatible software on both sides of the connection.

Cloud Changes

On the cloud side, we are beneficiaries of the work done by folks in the Pion community, which includes a set of libraries for real-time web-based communication. Conventional use of these libraries is enabling video streaming applications on the internet, such as Twitch. However, the protocols they implement are useful in many constrained environments where the network is lossy or unreliable.

The Golioth team contributed Connection ID support in the pion/dtls library. This consisted of both the implementation of Connection ID extension handling and modifications to UDP datagram routing. The former involved careful updates to the parsing of DTLS records; Connection IDs change record headers from being fixed length to variable length. As part of this work, we are also adding a very small new API in the Golang crypto library.

Previously, pion/dtls utilized an underlying net.Listener, which was conventionally supplied by the pion/transport library. This UDP net.Listener handed net.Conn connections to the pion/dtls listener, which would in turn establish a DTLS connection by performing a handshake with the client on the other side of the connection. However, the net.Conn interface does not allow for consumers to change the remote IP address and port to which packets are sent. When not using Connection IDs, this is not an issue because the IP address and port are what identifies the connection. However, when Connection IDs are in use, the ID itself is used to identify the connection, and the remote IP address and port may change over time. Thus, a new interface, net.PacketListener, was added to pion/dtls, which enables the changing of the remote address of a connection, and an implementation of the interface that routes based on Connection IDs when present was supplied.

Device changes

On the device side, most users leverage the Golioth Firmware SDK to communicate with Golioth services, such as OTA, Settings, Stream, and more. The SDK is meant to work with any hardware, which is why we integrate with platforms such as Zephyr, Espressif ESP-IDF, Infineon Modus Toolbox, and Linux. Many of these platforms utilize DTLS support offered by mbedTLS, which added support for the IETF draft of Connection IDs in 2019, then included official support in the 3.3.0 release in 2022. The SDK uses libcoap, which implements CoAP support on top of a variety of DTLS implementations, including mbedTLS. libcoap started consuming mbedTLS’s Connection ID API in July of this year. We have been assisting in ensuring that new versions of libcoap are able to build on the platforms with which we integrate.

However, these platforms frequently maintain their own forks of dependencies in order to integrate with the rest of their ecosystems. We have been both contributing ourselves and supporting others’ contributions wherever possible in order to expedite the use of Connection IDs in the Golioth Firmware SDK. With Connection ID support already in place in ESP-IDF and Nordic’s sdk-nrf, and coming soon in the next Zephyr release, we hope to turn Connection IDs on by default for all platforms in upcoming Golioth Firmware SDK releases.

How do I use Connection IDs?

Using the Golioth Firmware SDK is the only requirement to utilize Connection IDs. As support is added across embedded platforms, update your device-side code to the latest SDK version and Connection IDs will automatically be enabled for your connection. The video at the top of this post shows how you can inspect your own device traffic to see the functionality in action.

What comes next?

In total, we have made contributions across many open source projects as part of our effort to make Connection IDs widely available, and we look forward to continuing to lend a hand wherever possible. While this post has provided a brief overview of DTLS, Connection IDs, and the ecosystem of libraries we integrate with, ongoing use of the functionality will allow us to provide tangible measures of its impact. We’ll make sure to make announcements as support continues to be added across platforms and consumed in the Golioth Firmware SDK.

Until then, go create a Golioth account and start connecting your devices to the cloud!

 

As embedded developers, we’re consistently seeking ways to make our processes more efficient and our teams more collaborative. The magic ingredient? DevOps. Its origins stem from the need to break down silos between development (Dev) and operations (Ops) teams. DevOps fosters greater collaboration and introduces innovative processes and tools to deliver high-quality software and products more efficiently.

In this talk, we will explore how we can bring the benefits of DevOps into the world of IoT. We will focus on using GitHub Actions for continuous integration and delivery (CI/CD), while also touching on how physical device operations like shipping and logistics can be streamlined using a DevOps approach.

Understanding the importance of DevOps in IoT is crucial to unlocking efficiencies and streamlining processes across any organization that manages connected devices. This talk, originally given at the 2023 Embedded Online Conference (EOC), serves as one of the many specialized talks freely accessible on the EOC site.

GitHub Actions for IoT

To illustrate how to put these concepts into practice, we’re going to look at a demo using an ESP32 with a feather board and Grove sensors for air quality monitoring. It’s important to note that while we utilize GitHub Actions in this instance, other CI tools like Jenkins or CircleCI can also be effectively used in similar contexts based on your team’s needs and preferences.

For this example we use GitHub Actions to automate the build and deployment process.

The two main components of our GitHub Actions workflow are ‘build’ and ‘deploy’ jobs. The ‘build’ job uses the pre-built GitHub Action for ESP-IDF to compile our code, and is triggered when a new tag is pushed or when a pull request is made. The ‘deploy’ job installs the Golioth CLI, authenticates with Golioth, uploads our firmware artifact, and releases that artifact to a set of devices over-the-air (OTA).

Imagine an organization that manages a fleet of remote air quality monitors across multiple cities. This GitHub Actions workflow triggers the build and deployment process automatically when the development team integrates new features or bug fixes into the main branch and tags the version. The updated firmware is then released and deployed to all connected air quality monitors, regardless of their location, with no additional logistics or manual intervention required. This continuous integration and deployment allows the organization to respond rapidly to changes and ensures that the monitors always operate with the latest updates.

Let’s delve into the GitHub Actions workflow and walk through each stage:

  1. Trigger: The workflow is activated when a new tag is pushed or a pull request is created.
    on:
      push:
        # Publish semver tags as releases.
        tags: [ 'v*.*.*' ]
      pull_request:
        branches: [ main ]
  2. Build: The workflow checks out the repository, builds the firmware using the ESP-IDF GitHub Action, and stores the built firmware artifact.
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
        - name: Checkout repo
          uses: actions/checkout@v3
          with:
            submodules: 'recursive'
        - name: esp-idf build
          uses: espressif/esp-idf-ci-action@v1
          with:
            esp_idf_version: v4.4.4
            target: esp32
            path: './'
          env:
            WIFI_SSID: ${{ secrets.WIFI_SSID }}
            WIFI_PASS: ${{ secrets.WIFI_PASS }}
            PSK_ID: ${{ secrets.PSK_ID }}
            PSK: ${{ secrets.PSK }}
        - name: store built artifact
          uses: actions/upload-artifact@v3
          with:
            name: firmware.bin
            path: build/esp-air-quality-monitor.bin
  3. Deploy: The workflow installs the Golioth CLI, authenticates with Golioth, downloads the built firmware artifact, and uploads it to Golioth for OTA updates. (The goliothctl installation method and command flags shown here are illustrative; check the Golioth documentation for the current syntax.)
    jobs:
      deploy:
        needs: build
        runs-on: ubuntu-latest
        if: startsWith(github.ref, 'refs/tags/')
        steps:
        - name: download built artifact
          uses: actions/download-artifact@v3
          with:
            name: firmware.bin
        - name: install goliothctl
          run: |
            # Installation steps are illustrative; see the Golioth docs
            echo "deb [trusted=yes] https://repos.golioth.io/apt/ /" | sudo tee /etc/apt/sources.list.d/golioth.list
            sudo apt update && sudo apt install -y goliothctl
        - name: upload artifact and roll out release
          run: |
            # Command names and flags are illustrative; see the Golioth docs
            goliothctl login --apiKey ${{ secrets.GOLIOTH_API_KEY }}
            goliothctl config set projectId ${{ secrets.GOLIOTH_PROJECT_ID }}
            goliothctl dfu artifact create esp-air-quality-monitor.bin --version ${GITHUB_REF_NAME#v}
            goliothctl dfu release create --components main@${GITHUB_REF_NAME#v} --rollout true

For those eager to dive in and start implementing DevOps in their own IoT development process, we’ve provided an example GitHub Actions workflow file on GitHub. Feel free to fork this repository and use it as a starting point for streamlining your own IoT firmware development process. Remember, the best way to learn is by doing. So, get your hands dirty, experiment, iterate, and innovate. If you ever need help or want to share your experiences, please reach out in our community forum.

Today, Golioth is open-sourcing an early version of xk6-coap, a Grafana k6 extension that enables authoring load testing scenarios in Javascript that interact with Constrained Application Protocol (CoAP) endpoints. k6 is a load testing framework from Grafana Labs focused on extensibility, scalability, and developer experience.

The Importance of Load Testing

Load testing is the practice of simulating interaction with a system in order to observe how the system responds. For many applications, this manifests as orchestrating a large number of network connections to the system, each sending a pre-defined amount of data, or continuously sending data over some period of time. While simple load tests can be carried out by writing a minimal program that executes the specified behavior, it is frequently desirable to run the tests in a distributed fashion, either due to scalability requirements of the tests themselves, or to simulate geographically distributed interaction. As the number and complexity of tests increases, introducing a load testing framework can enable an organization to rapidly evolve the sophistication of their testing efforts.

Why Use k6?

When you operate a platform that needs to be able to accommodate millions of IoT devices, testing your system to ensure that it is able to respond appropriately to many different scenarios is critical. When we were thinking about how to reduce the friction of running these tests, k6 was a natural fit due to its use of Javascript as a scripting language, its built-in support for modifying the magnitude and duration of tests, and its ability to scale from a developer’s local machine to a distributed system like Kubernetes.

Fortunately, k6 also has a rich ecosystem of extensions, along with straightforward machinery to build your own.

Why a CoAP Extension?

Many of the devices connecting to Golioth are communicating over constrained networks. As described in a previous post, CoAP is a flexible protocol that allows these devices to operate efficiently in a wide range of environments. However, while CoAP is well supported by embedded RTOSes, server-side support is somewhat more limited. As we strive to build scalable, resilient systems, we hope to also help grow the ecosystem of tooling by contributing back to the open source community.

Getting Involved

We have decided to open source xk6-coap while it is still in active development in order to allow for the community to influence the direction of the project, and contribute if so inclined. To start running your own CoAP load tests, check out the getting started steps in the README.md. If interested in contributing, take a look at the open issues and the CONTRIBUTING.md.

In the last post in our series on the Constrained Application Protocol (CoAP), we explored some of the trade-offs between reliable and unreliable data transmission. We also covered why the CoAP’s flexibility makes it a good choice for Golioth and constrained applications in general. In this post, we’ll dig deeper into the lifecycle of a CoAP message. What has to happen for data to get from a device in the field to a service in the cloud? What about back again? As we traverse that path, we will describe each protocol along the way.

Setting Up

Throughout this post, we will use the Linux port of the Golioth Firmware SDK at v0.7.0 to communicate from a local machine to Golioth. Specifically, we’ll be examining traffic from the Golioth Basics sample application, which currently uses pre-shared keys (PSKs) for authentication. This is a simpler model than those used by most constrained devices when communicating to Golioth, but it is useful for detailing the general sequence of operations.

We are also going to cheat a bit in this post by describing only communication from OSI Layer 3 and up, but I promise I’ll make it up to you in a painfully lengthy post about framing and modulation in the future.

We’ll use Wireshark to observe network traffic. If you want to view the decrypted CoAP payloads captured by Wireshark, you’ll need to go to Edit > Preferences > Protocols > DTLS then add your PSK in hex form. This can be obtained with the following command.

$ echo -n "YOUR_PSK_HERE" | xxd -ps -c 32
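
For example, a (dummy) PSK of secret would produce:

$ echo -n "secret" | xxd -ps -c 32
736563726574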

If you would like to follow along with the exact packet capture that we use in this post, you can find it, along with instructions for how to import into Wireshark, in the hasheddan/coap-pcap repository.

Life Before CoAP

When interacting with Golioth via the firmware SDK, it appears as though communication begins when the first CoAP message is sent. However, a number of steps are required before two endpoints can communicate over a Layer 7 protocol like CoAP.

Address Resolution

Because humans need to interact with and write programs that interact with the internet, we need to be able to specify the host with a “friendly” name. These names are referred to as domain names on the internet. For example, to read this blog post, you navigated to blog.golioth.io. However, your computer doesn’t know how to find this blog using that name, so it instead needs to translate the name to an address. The common analogy here would be telling my maps app that I want to go to my local coffee shop, Press, to get a crepe. The app needs to translate Press to an address of a physical location before using GPS to navigate there.

The Global Positioning System (GPS) is a protocol that we are not going to talk about at length in this post, but it is just as fascinating as the ones we do cover.

This translation step not only allows us to use more friendly names when we talk about a destination, it also allows that destination to change its physical location without all other services needing to change how they find it. On the internet, the protocol that enables this translation is the Domain Name System (DNS).

Description of the Domain Name System (DNS) protocol with PDU structure.

The Golioth firmware SDK defines the URI of the Golioth server with the CONFIG_GOLIOTH_COAP_HOST_URI config value.

#ifndef CONFIG_GOLIOTH_COAP_HOST_URI
#define CONFIG_GOLIOTH_COAP_HOST_URI "coaps://coap.golioth.io"
#endif

Source

This value is parsed into the coap.golioth.io domain name when the client is created and a session is established.

// Split URI for host
coap_uri_t host_uri = {};
int uri_status = coap_split_uri(
        (const uint8_t*)CONFIG_GOLIOTH_COAP_HOST_URI,
        strlen(CONFIG_GOLIOTH_COAP_HOST_URI),
        &host_uri);
if (uri_status < 0) {
    GLTH_LOGE(TAG, "CoAP host URI invalid: %s", CONFIG_GOLIOTH_COAP_HOST_URI);
    return GOLIOTH_ERR_INVALID_FORMAT;
}

// Get destination address of host
coap_address_t dst_addr = {};
GOLIOTH_STATUS_RETURN_IF_ERROR(get_coap_dst_address(&host_uri, &dst_addr));

Source

The lookup is ultimately performed by a call to getaddrinfo.

struct addrinfo hints = {
        .ai_socktype = SOCK_DGRAM,
        .ai_family = AF_UNSPEC,
};
struct addrinfo* ainfo = NULL;
const char* hostname = (const char*)host_uri->host.s;
int error = getaddrinfo(hostname, NULL, &hints, &ainfo);

Source
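
The addrinfo list returned by getaddrinfo is then used to populate the destination address for the CoAP session (via the get_coap_dst_address helper shown above). A minimal, self-contained sketch of that pattern (not the SDK’s exact implementation) might look like:

// Sketch only: resolve a hostname, keep the first result returned by
// getaddrinfo(), and always free the list when finished.
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

static int resolve_first_address(const char* hostname,
                                 struct sockaddr_storage* out,
                                 socklen_t* out_len)
{
    struct addrinfo hints = {
            .ai_socktype = SOCK_DGRAM,
            .ai_family = AF_UNSPEC,
    };
    struct addrinfo* ainfo = NULL;

    if (getaddrinfo(hostname, NULL, &hints, &ainfo) != 0 || ainfo == NULL) {
        return -1;
    }

    // Copy the first resolved address; a real client might prefer IPv4 or
    // IPv6 depending on the network it is attached to.
    memcpy(out, ainfo->ai_addr, ainfo->ai_addrlen);
    *out_len = ainfo->ai_addrlen;

    freeaddrinfo(ainfo);
    return 0;
}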

While DNS, like many Layer 7 protocols, can be used over a variety of underlying transports, it typically uses UDP.

Description of the User Datagram Protocol (UDP) with a diagram.

Any DNS messages we send are encapsulated in the payload of a UDP datagram, which is supplemented with a header containing four fields (sketched below):

  • Source port
  • Destination port
  • Length
  • Checksum
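
A rough sketch of the eight-byte UDP header layout, with field names taken from RFC 768 rather than from the SDK:

#include <stdint.h>

// UDP header layout per RFC 768: four 16-bit fields, eight bytes total.
// All fields are transmitted in network byte order (big-endian).
struct udp_header {
    uint16_t source_port;       // service sending the datagram
    uint16_t destination_port;  // service the datagram should be routed to (53 for DNS)
    uint16_t length;            // header plus payload length, in bytes
    uint16_t checksum;          // optional for IPv4, mandatory for IPv6
};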

The ports indicate which service at the destination the data should be routed to. DNS uses port 53, so the destination port in our UDP datagram header should be 53. However, we still haven’t specified which resolver we want to send the query to, and we need to know its address up front: we can’t ask the resolver to resolve addresses for us if we can’t resolve its address in the first place.

On the internet, these addresses are known as Internet Protocol (IP) addresses.

Card describing the Internet Protocol (IP) with a diagram.

The resolver chain is highly dependent on the configuration of the system in use. My local Ubuntu system uses a local resolver, systemd-resolve, which is listening on port 53.

$ sudo netstat -ulpn | grep "127.0.0.53:53"
udp        0      0 127.0.0.53:53           0.0.0.0:*                           827/systemd-resolve

The first packets we see in Wireshark after running the Golioth basics example correspond to attempted DNS resolution of coap.golioth.io.

Messages in a DNS recursive lookup.

The first two requests are to systemd-resolve, one for the IPv4 record (A) and one for the IPv6 record (AAAA). systemd-resolve subsequently makes the same requests from my local machine (192.168.1.26) to the router on my home network (192.168.1.1). The response from the router is then returned to systemd-resolve, which returns the answer to our program. Breaking apart the first query message, we can see our three layers of encapsulation.

The answer in the second to last packet for the coap.golioth.io IPv4 address contains the expected IP address, as confirmed by a simple dig query.

$ dig +noall +answer coap.golioth.io
coap.golioth.io.	246	IN	A	34.135.90.112

Establishing a Secure Channel

At Golioth, we believe that all data transmission should be secure. In fact, we go so far as to not allow for sending data to the Golioth platform unless it is over a secure channel. Your browser hopefully shows a lock symbol to the left of your address bar right now. That indicates that your request for this web page, and the content of the page that was sent back by the server, happened over a secure channel. This secure channel was established using Transport Layer Security (TLS). However, TLS was designed to run on TCP, and thus requires the presence of a reliable transport, which UDP does not provide. In order to enable the same security over UDP, Datagram Transport Layer Security (DTLS) was developed.

Description of the Datagram Transport Level Security (DTLS) protocol with a diagram.

Some of the differences that DTLS introduces to TLS include:

  • The inclusion of an explicit Sequence Number – DTLS must provide a facility for reordering records as UDP does not do so automatically.
  • The addition of retransmission timers – DTLS must be able to retransmit data in the event that it never arrives at its destination.
  • Constraints around fragmentation – While multiple DTLS records may be placed in a single datagram, a single record may not be fragmented across multiple datagrams.
  • Removal of stream-based ciphers – TLS 1.2 used RC4 as its stream-based cipher, which does not allow for random access and thus cannot be utilized over an unreliable transport.
  • Guardrails against denial-of-service attacks – Datagram protocols are highly susceptible to denial-of-service attacks because a connection does not need to be established prior to sending data. DTLS implements a stateless cookie to help guard against this threat.

The DTLS handshake protocol consists of a sequence of records being sent between the client and server. The number and type of records depends on the DTLS implementation and configuration on each side. Returning to our Wireshark capture, we can see the exact sequence used in the Golioth basics example.

Let’s explore each step.

Client Hello

The Client Hello message is how a client initiates a DTLS handshake with a server. The content type of the record is set to Handshake (22) to indicate we are using the handshake protocol, and the embedded handshake structure is included as the fragment. We immediately see a few fields in the DTLS specification that are not in TLS: namely, the Epoch and Sequence Number in the record layer structure, and the Message Sequence, Fragment Offset, and Fragment Length fields in the handshake structure. These are all introduced to accommodate the fact that we are transmitting over UDP rather than TCP.
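
To make those extra fields concrete, here is an illustrative sketch of the DTLS 1.2 record header described in RFC 6347 (this is for explanation only, not a structure from the SDK; real parsers read the fields byte by byte rather than overlaying a struct):

#include <stdint.h>

// DTLS 1.2 record header fields, per RFC 6347 section 4.1.
// Unlike TLS, the epoch and sequence number are carried explicitly in every
// record because UDP may reorder or drop datagrams.
struct dtls_record_header {
    uint8_t  content_type;        // 22 = Handshake, 23 = Application Data
    uint16_t protocol_version;    // 0xFEFD for DTLS 1.2
    uint16_t epoch;               // increments after each Change Cipher Spec
    uint8_t  sequence_number[6];  // 48-bit per-record sequence number
    uint16_t length;              // length of the fragment that follows
};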

The message additionally includes information about how the client wishes to communicate, such as the DTLS Version, supported Cipher Suites, and supported Extensions.

Hello Verify Request

The Hello Verify Request is a new handshake type introduced in DTLS. It carries a stateless cookie and was added to guard against denial-of-service attacks.

From the DTLS v1.2 RFC:

This mechanism forces the attacker/client to be able to receive the cookie, which makes DoS attacks with spoofed IP addresses difficult. This mechanism does not provide any defense against DoS attacks mounted from valid IP addresses.

Though servers are not required to send a Hello Verify Request, if they do, the client is required to send the Client Hello message again with the cookie included. We can see this behavior in the subsequent message.

Server Hello, Server Hello Done

The next packet is interesting because the UDP datagram contains two DTLS records, which is explicitly allowed by the RFC:

Multiple DTLS records may be placed in a single datagram. They are simply encoded consecutively. The DTLS record framing is sufficient to determine the boundaries. Note, however, that the first byte of the datagram payload must be the beginning of a record. Records may not span datagrams.

The Server Hello is a response to the Client Hello that indicates which of the capabilities offered by the client should be used. For example, the server will select a Cipher Suite that is supported by both sides.

The Server Hello Done message indicates that the server is done sending messages. In this case we are using PSKs, so the Server Certificate, Server Key Exchange, and Certificate Request handshake messages are not required. However, in cases where they are, the client knows the server has sent all of its messages when it receives the Server Hello Done.

Client Key Exchange, Change Cipher Spec, Finished

The next packet includes three DTLS records. In the Client Key Exchange, the client informs the server which PSK will be used by providing a PSK ID.

The Change Cipher Spec message tells the server “we’ve negotiated parameters for communication, and I am going to start using them with my next message”. That next message is the Finished record, which includes Verify Data encrypted using the negotiated TLS_PSK_WITH_AES_128_GCM_SHA256 cipher suite.

Change Cipher Spec, Finished

Finally, the server responds with its own Change Cipher Spec and Finished messages for the client to verify.

With a secure channel in place, we are now ready to send CoAP messages!

Sending a CoAP Message

Sending a CoAP message is not so different from sending DTLS handshake protocol messages. However, instead of Content Type: Handshake (22), we’ll be sending Application Data (23). The first message we see in the Wireshark capture is the log we emit while setting up the client in the Golioth basics program.

GLTH_LOGI(TAG, "Waiting for connection to Golioth...");

This capture shows the entire encapsulation chain: a DTLS record in a UDP datagram in an IP packet. Within the encrypted DTLS record payload, which we are able to inspect after supplying our PSK in Wireshark, we can see the content of a CoAP message.

Description of the Constrained Application Protocol (CoAP) with a diagram.

CoAP messages (Confirmable, Non-confirmable, Acknowledgement, and Reset) are not to be confused with requests and responses. CoAP maps its request / response model to the Hypertext Transfer Protocol (HTTP), with requests providing a Method Code and responses providing a Response Code. The response to a Confirmable request message may be included in the corresponding Acknowledgement if it is immediately available. This is referred to as piggybacking. The message Token is used to correlate a response with a request, meaning that when piggybacking is employed, both the request and response carry the same Message ID and Token.

The log message shown above is specified as Confirmable with Message ID: 43156 and Token: 7d5b825d. The method code specifies that the request is a POST (2), while the options specify the logs URI path and that the payload is JSON data.
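
Those fields live in CoAP’s fixed four-byte header, which is followed by the Token and any Options. As an illustration (this is not the SDK’s parser; the Golioth SDK relies on libcoap for this), the header can be unpacked like so:

#include <stdint.h>

// Unpack the fixed 4-byte CoAP header defined in RFC 7252, section 3.
struct coap_header_fields {
    uint8_t  version;       // always 1
    uint8_t  type;          // 0 = Confirmable, 2 = Acknowledgement
    uint8_t  token_length;  // number of Token bytes that follow the header
    uint8_t  code;          // e.g. 0.02 (POST) for requests, 2.03 (Valid) for responses
    uint16_t message_id;    // pairs an Acknowledgement with its request
};

static struct coap_header_fields parse_coap_header(const uint8_t buf[4])
{
    struct coap_header_fields h = {
            .version = (buf[0] >> 6) & 0x03,
            .type = (buf[0] >> 4) & 0x03,
            .token_length = buf[0] & 0x0F,
            .code = buf[1],
            .message_id = (uint16_t)((buf[2] << 8) | buf[3]),
    };
    return h;
}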

Golioth should respond to this message with an Acknowledgement with a corresponding Message ID.

Not only does it do so, but it also employs piggybacking to supply a response Code (2.03 Valid (67)) and a Token matching that of the request.
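
The raw value 67 shown by Wireshark and the dotted 2.03 notation are the same code: CoAP packs a 3-bit class and a 5-bit detail into a single byte. A quick sanity check (illustrative only):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    // 67 = 0b01000011: class = 2, detail = 3, written as "2.03 Valid".
    uint8_t raw_code = 67;
    unsigned code_class = raw_code >> 5;
    unsigned code_detail = raw_code & 0x1F;
    printf("%u.%02u\n", code_class, code_detail);  // prints 2.03
    return 0;
}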

How Does It Stack Up?

Though we steered clear of the data link and physical layers in this post, there is a whole world of hidden complexity yet to be uncovered beneath IP packets. The tooling and processes used in this post will enable both the exploration of those lower-level protocols and comparison of the CoAP / DTLS / UDP / IP stack with other L3-L7 stacks. Check back for upcoming posts evaluating the similarities and differences with MQTT, QUIC, and more!

While you wait, create an account on Golioth and get your devices talking to the cloud! And if you feel so inclined, break out Wireshark and see what else you can learn.