Ever wondered how to upload big chunks of data from IoT devices? Images are a great example: even a relatively small (and well-compressed) photo is large by the standards of low-power, remote devices. The solution is to have the device break the data into blocks that the server reassembles into a file. The good news is that Golioth has already taken care of this! Today, we’ll walk through how to upload images from IoT devices using block upload.

Overview

There’s nothing particularly special about images, but they are usually fairly large: on the order of tens or hundreds of kilobytes. Golioth’s block upload works for whatever data you want to send, since we’re simply streaming bytes up to the server. Here’s what’s involved:

  1. Set up a Golioth Pipeline to configure how your data upload will be routed on the cloud
  2. Capture an image (or other large hunk of data)
  3. Write a callback function to supply the blocks of data
  4. Call golioth_stream_set_blockwise_sync(), supplying your callback

This really is all it takes to push lots of data up to the cloud from a constrained device.

As part of our AI Summer, I put together an example application that captures images from a camera module and uploads them to Golioth. It uses Zephyr and runs on an Espressif ESP32, Nordic nRF9160dk, or NXP mimxrt1024-evk.

1. Set Up a Pipeline

We’ll be sending data to Golioth’s Stream service so we need to add a pipeline to tell Golioth what to do with that data. For this example, we’re taking the received image and routing it to an Amazon S3 bucket. There is already a pipeline included in the example:

filter:
  path: "/file_upload*"
  content_type: application/octet-stream
steps:
  - name: step0
    destination:
      type: aws-s3
      version: v1
      parameters:
        name: golioth-pipelines-test
        access_key: $AWS_S3_ACCESS_KEY
        access_secret: $AWS_S3_ACCESS_SECRET
        region: us-east-1

Add this to your project by navigating to the Golioth web console, selecting Pipelines in the left sidebar, and clicking the Create button. Don’t forget to enable your pipeline after pasting and saving it. You’ll also need to add the access_key and access_secret for your S3 bucket under Credentials→Secrets in the Golioth console.

Of course S3 is just one of many destinations. Check out the Destinations page of the Golioth Docs for more.

2. Capture an Image

The particulars of capturing an image with an embedded device are beyond the scope of this post. If you want to follow along exactly with the example, there’s information for purchasing and connecting a camera in the README file.

nRF9160dk connected to an Arducam Mega 5MP-AF

However, this example also demonstrates block upload of a large text file stored as a C header file. You can simply comment out the camera initialization code and use button 2 on your dev board to send the text file instead.

3. Write a Callback Function for Block Upload

This is where most of the work happens. The Golioth Firmware SDK will call this function each time it needs a block of data. You need to write your own function to fill a buffer with the correct block. Here’s what the function prototype looks like:

enum golioth_status block_upload_camera_image_cb(uint32_t block_idx,
                                                 uint8_t *block_buffer,
                                                 size_t *block_size,
                                                 bool *is_last,
                                                 void *arg)

Callback Arguments

Let’s do a quick explainer on what each of these arguments represents:

  • block_idx: the block number to be sent to the server, starting at 0
  • block_buffer: a pointer to memory where you should copy your data
  • block_size: the size of the block; this dictates how many bytes you should copy into the buffer. Set it to a smaller value if your final block contains less data than this
  • is_last: set this to true if you are sending the last block
  • arg: pointer to custom data you supply when beginning the block upload

Working example

The callback for uploading text files is a good place to learn about how this system works. I’ve simplified the function, removed all of the error checking and logging, and will use line numbers to walk through the code.

static enum golioth_status block_upload_read_chunk(uint32_t block_idx,
                                                   uint8_t *block_buffer,
                                                   size_t *block_size,
                                                   bool *is_last,
                                                   void *arg)
{
    size_t bu_offset = block_idx * bu_max_block_size;
    size_t bu_size = fables_len - bu_offset;

    if (bu_size <= *block_size)
    {
        /* We run out of data to send after this block; mark as last block */
        *block_size = bu_size;
        *is_last = true;
    }

    /* Copy data to the block buffer */
    memcpy(block_buffer, fables + bu_offset, *block_size);
    return GOLIOTH_OK;
}

  • Line 7: calculate the offset into the source data by multiplying the block size by the block index number.
  • Line 8: calculate how many bytes of source data remain by subtracting the offset from the data source length.
  • Lines 10-15: Check if the remaining data is smaller than the block size. If so, we need to update block_size and is_last to indicate this is the final block.
  • Line 18: Copy the data into the block_buffer.
  • Line 19: Return a status code indicating block data is ready to send.

In reality there is more error checking to be done, and if a problem is encountered the block should be marked as last and an error code returned. View the entire function from the example.

Special cases for using an arg and reading from the camera

Two things are worth calling out in this step: user args and special camera memory.

The text upload in the example uses a struct to pass in the source data buffer and its length. This way, different files/data structures can use the same block upload callback. The struct indicating the data is passed in as a user argument and accessed inside the callback.
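
As a rough sketch of that pattern (the struct and variable names below are illustrative, not copied verbatim from the example; bu_max_block_size mirrors the earlier callback snippet):

struct block_upload_source {
    const uint8_t *buf; /* start of the source data */
    size_t len;         /* total number of bytes to upload */
};

/* Pass a pointer to the struct as the user argument when starting the upload */
struct block_upload_source source = { .buf = fables, .len = fables_len };

int err = golioth_stream_set_blockwise_sync(client,
                                            "file_upload",
                                            GOLIOTH_CONTENT_TYPE_OCTET_STREAM,
                                            block_upload_read_chunk,
                                            (void *) &source);

/* Inside the callback, recover the source from the user argument */
struct block_upload_source *src = (struct block_upload_source *) arg;
size_t remaining = src->len - (block_idx * bu_max_block_size);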

Uploading images from this particular camera is also a special case. The camera itself has RAM where a captured image is stored. In the example, that memory is read each time the block upload callback runs.

4. Call golioth_stream_set_blockwise_sync()

This is the easy part: we just need to trigger the block upload.

camera_capture_image(cam_mod);
err = golioth_stream_set_blockwise_sync(client,
                                        "file_upload",
                                        GOLIOTH_CONTENT_TYPE_OCTET_STREAM,
                                        block_upload_camera_image_cb,
                                        (void *) cam_mod);

Above is the code used by the example to capture an image and upload it. The cam_mod variable is a pointer to the camera module. Notice that it is passed as the user argument (the final parameter) when beginning the upload. This way, the block_upload_camera_image_cb() callback will have a pointer to the camera it can use to read image data and copy to the block_buffer for upload.

Once this runs, the image will be uploaded to Golioth and the pipeline will route it to an S3 bucket. Remember, devices connect to Golioth with CoAP, so you get the benefit of an efficient protocol and the ease of using the Firmware SDK to handle transmission to the cloud.

Make IoT Data Easy to Work With

Getting data to and from a large fleet of IoT devices has long been a tricky process. We don’t think that every company should have to reinvent the wheel. Golioth has built the universal connector for your IoT fleet. Take it for a spin today!

Using the Hugging Face Inference API for Device Audio Analysis

Golioth Pipelines works with Hugging Face, as shown in our recent AI launch. This post will highlight how to use an audio classification model on Hugging Face that accepts data recorded on a microcontroller-based device, sent over a secure network connection to Golioth, and routed through Pipelines.

While most commonly known as the place where models and data sets are uploaded and shared, Hugging Face also provides a compute service in the form of its free serverless inference API and production-ready dedicated inference endpoints. Unlike other platforms that offer only proprietary models, Hugging Face allows access to over 150,000 open source models via its inference APIs. Additionally, private models can be hosted on Hugging Face, which is a common use case for Golioth users that have trained models on data collected from their device fleets.

Audio Analysis with Pipelines

Because the Hugging Face inference APIs use HTTP, they are easy to target with the webhook transformer. The structure of the request body will depend on the model being invoked, but for models that operate on media files, such as audio or video, the payload is typically raw binary data.

In the following pipeline, we target the serverless inference API with an audio sample streamed from a device. In this scenario, we want to perform sentiment analysis of the audio, then pass the results onto Golioth’s timeseries database, LightDB Stream, so that changes in sentiment can be observed over time. An alternative destination, or multiple destinations, could easily be added.

Click here to use this pipeline in your project on Golioth.

filter:
  path: "/audio"
steps:
  - name: emotion-recognition
    transformer:
      type: webhook
      version: v1
      parameters:
        url: https://api-inference.huggingface.co/models/superb/hubert-large-superb-er
        headers:
          Authorization: $HUGGING_FACE_TOKEN
  - name: embed
    transformer:
      type: embed-in-json
      version: v1
      parameters:
        key: text
  - name: send-lightdb-stream
    destination:
      type: lightdb-stream
      version: v1

Note that though Hugging Face’s serverless inference API is free to use, it is rate-limited and subject to high latency and intermittent failures due to cold starts. For production use-cases, dedicated inference endpoints are recommended.

We can pick any supported model on Hugging Face for our audio analysis task. As shown in the URL, the Hubert-Large for Emotion Recognition model is targeted, and the audio content delivered on path /audio is delivered directly to Hugging Face. An example for how to upload audio to Golioth using an ESP32 can be found here.

Results from the emotion recognition inference look as follows.

[
  {
    "score": 0.6310836672782898,
    "label": "neu"
  },
  {
    "score": 0.2573806643486023,
    "label": "sad"
  },
  {
    "score": 0.09393830597400665,
    "label": "hap"
  },
  {
    "score": 0.017597444355487823,
    "label": "ang"
  }
]

Expanding Capabilities

Countless models are uploaded to Hugging Face on a daily basis, and the inference API integration with Golioth Pipelines makes it simple to incorporate the latest new functionality into any connected device product. Let us know what models you are using on the Golioth Forum!

Golioth for AI

Today, we are thrilled to announce the launch of Golioth for AI, a comprehensive set of features designed to simplify and enhance the integration of AI into IoT products.

At Golioth, we envision a future where AI and IoT converge to create smarter, more efficient systems that can learn, adapt, and improve over time. The fusion of AI and IoT has the potential to unlock unprecedented levels of innovation and automation across various industries. However, integrating AI into IoT devices can be complex and challenging, requiring robust solutions for managing models, training data, and performing inference.

Today, at Golioth, we are addressing these challenges head-on. Our new set of features focuses on three core pillars: training data, model management, and inference. By streamlining these processes, we aim to empower developers and businesses to quickly add AI to their IoT projects, where it was not readily possible to do so before.

Training Data: Unlocking the Potential of IoT Data

At Golioth, we recognize that IoT devices generate rich, valuable data that can be used to train innovative AI models. However, this data is often inaccessible, in the wrong format, or difficult to stream to the cloud. We’re committed to helping teams extract this data and route it to the right destinations for training AI models that solve important physical world problems.

We’ve been building up to this with our launch of Pipelines, and new destinations and transformers have been added every week since. Learn more about Pipelines in our earlier announcement.

In v0.14.0 of our Firmware SDK, we added support for block-wise uploads. This new capability allows for streaming larger payloads, such as high-resolution images and audio, to the cloud. This unlocks incredible potential for new AI-enabled IoT applications, from connected cameras streaming images for security and quality inspection to audio-based applications for voice recognition and preventative maintenance for industrial machines.

For an example of uploading images or audio files to Golioth for training, see:

We’ve recently added three new object storage destinations for Pipelines:

These storage solutions are perfect for handling the rich media data essential for training AI models, ensuring your training set is always up to date with live data from in-field devices.

Partnership with Edge Impulse

Today, we’re also excited to announce our official partnership with Edge Impulse, a leading platform for developing and optimizing AI for the edge. This partnership allows streaming of IoT data from Golioth to Edge Impulse for advanced model training, fine-tuned for microcontroller class devices. Using Edge Impulse’s AWS S3 Data Acquisition, you can easily integrate with Golioth’s AWS S3 Pipelines destination by sharing a bucket for training data:

filter:
  path: "*"
  content_type: application/octet-stream
steps:
  - name: step-0
    destination:
      type: aws-s3
      version: v1
      parameters:
        name: my-bucket
        access_key: $AWS_ACCESS_KEY
        access_secret: $AWS_ACCESS_SECRET
        region: us-east-1

This streamlined approach enables you to train cutting-edge AI models efficiently, leveraging the power of both the Golioth and Edge Impulse platforms. For a full demonstration of using Golioth with Edge Impulse see: https://github.com/edgeimpulse/example-golioth

Model Management: Flexible OTA Updates

Deploying AI models to devices is crucial for keeping them updated with the latest capabilities and adapting to new use cases and edge cases as they arise. However, deploying a full OTA firmware update every time you need to update your AI model is inefficient and costly.

To address this, Golioth’s Over-the-Air (OTA) update system has been enhanced to support a broader range of artifact types, including AI models and media files.

Golioth Console displaying OTA AI model releases

Our updated OTA capabilities ensure that your AI models can be deployed and updated independently of firmware updates, making the process more efficient and streamlined. This allows models to be updated without having to perform a complete firmware update, saving bandwidth, reducing battery consumption, and minimizing downtime while ensuring your devices always have the latest AI capabilities.

We’ve put together an example demonstrating deploying a TensorFlow Lite model here: Deploying TensorFlow Lite Model.

Flox Robotics is already leveraging our model management capabilities to deploy AI models that detect wildlife and ensure their safety. Their AI deterrent systems prevent wildlife from entering dangerous areas, significantly reducing harm and preserving ecosystems. Read the case study.

Inference: On-Device and in the Cloud

Inference is the core of AI applications, and with Golioth, you can now perform inference both on devices and in the cloud. On-device inference is often preferred for applications like real-time monitoring, autonomous systems, and scenarios where immediate decision-making is critical due to its lower latency, reduced bandwidth usage, and ability to operate offline.

However, sometimes inference in the cloud is ideal or necessary for tasks requiring significant processing power, such as high-resolution image analysis, complex pattern recognition, and large-scale data aggregation, leveraging more powerful computational resources and larger models.

Golioth Pipelines now supports inference transformers and destinations, integrating with leading AI platforms including Replicate, Hugging Face, OpenAI, and Anthropic. Using our new webhook transformer, you can leverage these platforms to perform inference within your pipelines. The results of the inference are then returned to your pipeline to be routed to any destination. Learn more about our new webhook transformer.

Here is an example of how you can configure a pipeline to send audio samples captured on a device to the Hugging Face Serverless Inference API, leveraging a fine-tuned HuBERT model for emotion recognition. The inference results are forwarded as timeseries data to LightDB Stream.

filter:
  path: "/audio"
steps:
  - name: emotion-recognition
    transformer:
      type: webhook
      version: v1
      parameters:
        url: https://api-inference.huggingface.co/models/superb/hubert-large-superb-er
        headers:
          Authorization: $HUGGING_FACE_TOKEN
  - name: embed
    transformer:
      type: embed-in-json
      version: v1
      parameters:
        key: text
  - name: send-lightdb-stream
    destination:
      type: lightdb-stream
      version: v1

Golioth is continually releasing new examples to highlight the applications of AI on device and in the cloud. Here’s an example of uploading an image and configuring a pipeline to describe the image with OpenAI and send the transcription result to Slack:

filter:
  path: "/image"
steps:
  - name: jpeg
    transformer:
      type: change-content-type
      version: v1
      parameters:
        content_type: image/jpeg
  - name: url
    transformer:
      type: data-url
      version: v1
  - name: embed
    transformer:
      type: embed-in-json
      version: v1
      parameters:
        key: image
  - name: create-payload
    transformer:
      type: json-patch
      version: v1
      parameters:
        patch: |
          [
            {
              "op": "add",
              "path": "/model",
              "value": "gpt-4o-mini"
            },
            {
              "op": "add",
              "path": "/messages",
              "value": [
                {
                  "role": "user",
                  "content": [
                    {
                      "type": "text",
                      "text": "What's in this image?"
                    },
                    {
                      "type": "image_url",
                      "image_url": {
                        "url": ""
                      }
                    }
                  ]
                }
              ]
            },
            {
              "op": "move",
              "from": "/image",
              "path": "/messages/0/content/1/image_url/url"
            },
            {
              "op": "remove",
              "path": "/image"
            }
          ]
  - name: explain
    transformer:
      type: webhook
      version: v1
      parameters:
        url: https://api.openai.com/v1/chat/completions
        headers:
          Authorization: $OPENAI_TOKEN
  - name: parse-payload
    transformer:
      type: json-patch
      version: v1
      parameters:
        patch: |
          [
            {"op": "add", "path": "/text", "value": ""},
            {"op": "move", "from": "/choices/0/message/content", "path": "/text"}
          ]  
  - name: send-webhook
    destination:
      type: webhook
      version: v1
      parameters:
        url: $SLACK_WEBHOOK

For a full list of inference examples see:

Golioth for AI marks a major step forward in integrating AI with IoT. This powerful collection of features is the culmination of our relentless innovation in device management and data routing, now unlocking advanced AI capabilities like never before. Whether you’re an AI expert or just starting your AI journey, our platform provides the infrastructure to seamlessly train, deploy, and manage AI models with IoT data.

We’ve assembled a set of exciting examples to showcase how these features work together, making it easier than ever to achieve advanced AI integration with IoT. We can’t wait to see the AI innovations you’ll create using Golioth.

For detailed information and to get started, visit our documentation and explore our examples on GitHub.

Thank you for joining us on this exciting journey. Stay tuned for more updates and build on!

New Pipelines Transformer: Base64

A new Pipelines transformer for Base64 encoding and decoding is now generally available for Golioth users. The base64 transformer is useful for working with sources and destinations in which working with binary data is difficult or unsupported.

How It Works

By default, the base64 transformer encodes the message payload as Base64 data. In the following example, the data is delivered to the recently announced aws-s3 data destination after being encoded. The content type following encoding will be text/plain.

filter:
  path: "*"
steps:
  - name: step0
    transformer:
      type: base64
    destination:
      type: aws-s3
      version: v1
      parameters:
        name: my-bucket
        access_key: $AWS_ACCESS_KEY
        access_secret: $AWS_ACCESS_SECRET
        region: us-east-1

Click here to use this pipeline in your Golioth project!

Supplying the decode: true parameter will result in the base64 transformer decoding data rather than encoding. In the following example, Base64 data is decoded before being delivered to the recently announced kafka data destination. The content type following decoding will be application/octet-stream.

filter:
  path: "*"
steps:
  - name: step0
    transformer:
      type: base64
      parameters:
        decode: true
    destination:
      type: kafka
      version: v1
      parameters:
        brokers:
          - my.kafka.broker.com:9092
        topic: my-topic
        username: pipelines-user
        password: $KAFKA_PASSWORD
        sasl_mechanism: PLAIN

Click here to use this pipeline in your Golioth project!

For more details on the base64 transformer, go to the documentation.

What’s Next

The base64 transformer has already proved useful for presenting binary data in a human-readable format. However, it becomes even more useful when paired with other transformers, some of which we’ll be announcing in the coming days. In the meantime, share how you are using Pipelines on the forum and let us know if you have a use-case that is not currently well-supported!

New Pipelines Data Destination: Kafka

A new Pipelines data destination for Kafka is now generally available for Golioth users. Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. There are many cloud-hosted services offering Kafka or Kafka-compatible APIs.

How It Works

Similar to the existing gcp-pubsub and azure-event-hubs destinations, the kafka data destination publishes events to the specified topic. Multiple brokers can be configured, and the PLAIN, SCRAM-SHA-256, and SCRAM-SHA-512 SASL mechanisms are supported for authentication. All data in transit is encrypted with Transport Layer Security (TLS).

Data of any content type can be delivered to the kafka destination. Metadata, such as Golioth device ID and project ID, will be included in the event metadata, and the event timestamp will match that of the data message.

filter:
  path: "*"
steps:
  - name: step0
    destination:
      type: kafka
      version: v1
      parameters:
        brokers:
          - my-kafka-cluster.com:9092
        username: kafka-user
        password: $KAFKA_PASSWORD
        topic: my-topic
        sasl_mechanism: SCRAM-SHA-256

Click here to use this pipeline in your Golioth project!

For more details on the kafka data destination, go to the documentation.

What’s Next

Kafka has been one of the most requested data destinations by Golioth users, and we are excited to see all of the new platforms that are able to be leveraged as part of the integration. Keep an eye on the Golioth blog for more upcoming Pipelines features, and reach out on the forum if you have a use-case that is not currently well-supported!

New Pipelines Transformer: Struct to JSON

A new Struct-to-JSON Pipelines transformer is now generally available to Golioth users. This transformer takes structured binary data, such as that created using packed C structs, and converts it into JSON according to a user provided schema. To see this transformer in action, check out the example in our documentation or watch our recent Hackday recap stream.

Compact and Low Overhead

Historically, many IoT devices have sent structured binary data to cloud applications, which then needed to unpack that data and serialize it into a format that downstream services can understand. It’s easier and faster to populate a struct than it is to serialize data into a common format in C, and the data is often significantly more compact, which is especially important for devices using constrained transports such as cellular, Bluetooth, or LoRa. As embedded devices have gotten more powerful and efficient, many of these devices have started using serialization formats like CBOR, JSON, or Protocol Buffers, which are easier for cloud applications to work with. These serialization formats have significant advantages over packed C structs in terms of flexibility and interoperability, and for most cases that’s what we recommend our customers use.

But there are still cases where sending raw binary data could make sense. If the structure of your data is well-defined, a widely accepted standard, and/or unlikely to change, the overhead of a flexible serialization format may not yield any practical advantages. If it’s important to squeeze every single byte out of a low bandwidth link, it’s hard to beat a packed C struct. If you’re working with legacy systems where updating them to use a new serialization format involves all kinds of dependencies, it may simply be easier to stick with the status quo. Golioth now supports these use cases through the new Struct-to-JSON transformer.

To use the transformer, you need to describe the structure of the data in your Pipeline YAML. The transformer currently supports standard integer sizes from 8 to 64 bits (signed and unsigned), single and double precision floating point numbers, and fixed- and variable-length strings. Here’s an example of setting up the transformer for a float and a couple strings:

transformer:
  type: struct-to-json
  version: v1
  parameters:
    members:
      - name: temperature
        type: float
      - name: string1
        type: string
        length: 5
      - name: string2_len
        type: u8
      - name: string2
        type: string
        length: string2_len

We can create a packed struct in C that matches the schema:

struct my_struct {
    float temperature;
    char string1[5];
    uint8_t string2_len;
    char string2[];
} __attribute__((packed));

struct my_struct *s = malloc(sizeof(struct my_struct) + strlen("Golioth!"));

s->temperature = 23.5f;
s->string2_len = strlen("Golioth!");
memcpy(s->string1, "hello", strlen("hello"));
memcpy(s->string2, "Golioth!", strlen("Golioth!"));
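
On the device side, the struct could then be streamed to Golioth as raw bytes using the octet-stream content type. Here is a hedged sketch; the path name "struct_demo", the client handle, and async_push_handler are assumptions rather than names from the example:

size_t payload_len = sizeof(struct my_struct) + s->string2_len;

/* Stream the packed struct as-is; the pipeline's struct-to-json transformer
 * unpacks it on the cloud side. */
int err = golioth_stream_set_async(client,
                                   "struct_demo",
                                   GOLIOTH_CONTENT_TYPE_OCTET_STREAM,
                                   (const uint8_t *) s,
                                   payload_len,
                                   async_push_handler,
                                   NULL);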

Sending that struct through the transformer gives us the following JSON:

{
  "temperature": 23.5,
  "string1": "Hello",
  "string2_len": 8,
  "string2": "Golioth!"
}

Monitoring Heap Usage with mallinfo()

Monitoring heap usage is a great way to gain insight into the performance of your embedded device. Many C standard library implementations provide a mallinfo() (or its newer analogue mallinfo2()) API that returns information about the current state of the heap. As this data is in a well-defined, standard format and unlikely to change, it’s a good candidate for sending directly to Golioth without any additional serialization. We can use the following pipeline to transform the struct returned by a call to mallinfo2() on a 64-bit Linux system to JSON and send it to LightDB Stream:

filter:
  path: "/mallinfo"
  content_type: application/octet-stream
steps:
  - name: step0
    transformer:
      type: struct-to-json
      version: v1
      parameters:
        members:
          - name: arena
            type: u64
          - name: ordblks
            type: u64
          - name: smblks
            type: u64
          - name: hblks
            type: u64
          - name: hblkhd
            type: u64
          - name: usmblks
            type: u64
          - name: fsmblks
            type: u64
          - name: uordblks
            type: u64
          - name: fordblks
            type: u64
          - name: keepcost
            type: u64
    destination:
      type: lightdb-stream
      version: v1

Click here to use this Pipeline.
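
On the device side, a minimal sketch of capturing and streaming the struct might look like this (assuming a 64-bit Linux build of the Golioth Firmware SDK; the client handle and async_push_handler are assumptions):

#include <malloc.h> /* mallinfo2() is provided by glibc 2.33 and later */

struct mallinfo2 info = mallinfo2();

/* Stream the raw struct bytes to the /mallinfo path; the pipeline above
 * converts the ten size_t (u64 on a 64-bit build) fields to JSON and
 * forwards them to LightDB Stream. */
int err = golioth_stream_set_async(client,
                                   "mallinfo",
                                   GOLIOTH_CONTENT_TYPE_OCTET_STREAM,
                                   (const uint8_t *) &info,
                                   sizeof(info),
                                   async_push_handler,
                                   NULL);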

See it in action

What’s Next

Stay tuned for additional Pipelines transformers and destinations to be released in the coming weeks. If you have a use-case that is not currently well supported by Pipelines, or an idea for a new transformer or destination, please reach out on the forum!

New Pipelines Data Destination: AWS S3

A new Pipelines data destination for AWS S3 is now generally available for Golioth users. It represents the first object storage destination for Golioth Pipelines, and opens up a new set of data streaming use-cases, such as images and audio. It also can be useful for scenarios in which large volumes of data are collected then batch processed at a later time.

How It Works

The aws-s3 data destination uploads events routed to it as objects in the specified bucket. The name of an object corresponds to its event ID, and objects are organized in directories for each device ID.

/
├─ 664b9e889a9590ccfcf822b3/
│  ├─ 28ebd981-80ae-467f-b700-ba00e7c1e3ee
│  ├─ e47e5b46-d4e3-4bf1-a413-9fc71ec9f6b0
│  ├─ ...
├─ 66632a45658c93af0895a70e/
├─ .../

Data of any content type, including the aforementioned media use-cases and more traditional structured sensor data, can be routed to the aws-s3 data destination. To authenticate, an IAM access key and secret key must be created as secrets, then referenced in the pipeline configuration. It is recommended to limit the permissions of the IAM user to PutObject for the specified bucket.
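
For reference, a minimal IAM policy scoped to that single permission might look like the following (the bucket name is a placeholder):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}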

filter:
  path: "*"
steps:
  - name: step0
    destination:
      type: aws-s3
      version: v1
      parameters:
        name: my-bucket
        access_key: $AWS_ACCESS_KEY
        access_secret: $AWS_SECRET_KEY
        region: us-east-1

Click here to use this pipeline in your Golioth project!

For more details on the aws-s3 data destination, go to the documentation.

What’s Next

While any existing uses of the Golioth Firmware SDK that leverage data streaming can be used with the aws-s3 data destination, we’ll be introducing examples to demonstrate new use-cases in the coming weeks. Also, stay tuned for more object storage data destinations, and reach out on the forum if you have a use-case that is not currently well-supported by Pipelines!

IoT is all about data. How you choose to handle sending that data over the network can have a large impact on your bandwidth and power budgets. Golioth includes the ability to batch upload streaming data, which is great for cached readings and allows your device to stay in low-power mode more of the time. Today I’ll detail how to send IoT data in batches.

What is Batch Data?

Batch data simply means one payload that encompasses multiple sensor readings.

[
    {
        "ts": 1719592181,
        "counter": 330
    },
    {
        "ts": 1719592186,
        "counter": 331
    },
    {
        "ts": 1719592191,
        "counter": 332
    }
]

The example above shows three readings, each containing a counter value that represents a sensor reading, along with a timestamp for when that reading was taken.

Sending Batch Data from an IoT Device

The sample firmware can be found at the end of the post, but generally speaking, the device doesn’t need to do anything different to send batch data. The key is to format the data as a list of readings, whether you’re sending JSON or CBOR.
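
For example, cached readings might be formatted into a JSON array with a helper like the sketch below (the reading struct, buffer handling, and error codes are illustrative, not from the sample firmware). The resulting buffer is then handed to the Stream API call shown next.

#include <errno.h>
#include <stdint.h>
#include <stdio.h>

struct reading {
    int64_t ts;      /* Unix epoch timestamp in seconds */
    int32_t counter; /* stand-in for a sensor value */
};

static int format_batch(char *buf, size_t buf_len,
                        const struct reading *readings, size_t count)
{
    size_t off = snprintf(buf, buf_len, "[");

    for (size_t i = 0; i < count; i++)
    {
        off += snprintf(buf + off, buf_len - off,
                        "%s{\"ts\":%lld,\"counter\":%ld}",
                        (i == 0) ? "" : ",",
                        (long long) readings[i].ts,
                        (long) readings[i].counter);
        if (off >= buf_len)
        {
            return -ENOMEM; /* buffer too small to hold the batch */
        }
    }

    off += snprintf(buf + off, buf_len - off, "]");
    return (off < buf_len) ? 0 : -ENOMEM;
}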

int err = golioth_stream_set_async(client,
                                   "",
                                   GOLIOTH_CONTENT_TYPE_JSON,
                                   buf,
                                   strlen(buf),
                                   async_push_handler,
                                   NULL);

We call the Stream data API above. The client and stream path are passed as the first two arguments, followed by the content type, then the buffer holding the data and the buffer length. The last two parameters are a callback function and an optional user data pointer.

Routing Batch Data using a Golioth Pipeline

Batch data will be automatically sorted out by the Golioth servers based on the pipeline you use.

filter:
  path: "*"
  content_type: application/json
steps:
  - name: step0
    destination:
      type: batch
      version: v1
  - name: step1
    destination:
      type: lightdb-stream
      version: v1

This example pipeline listens for JSON data coming in on any stream path. In step0 it “unpacks” the batch data into individual readings. In step1 the individual readings are routed to Golioth’s LightDB stream. Here’s what that looks like:

Note that all three readings are coming in with the same server-side timestamp. The device timestamp is preserved in the data, but you can also use Pipelines to tell Golioth to use the embedded timestamps.

Batch Data with Timestamp Extract

For this example we’re using a very similar pipeline, with one additional transformer to extract the timestamp from the readings and use it as the LightDB Stream timestamp.

filter:
  path: "*"
  content_type: application/json
steps:
  - name: step0
    destination:
      type: batch
      version: v1
  - name: step1
    transformer:
      type: extract-timestamp
      version: v1
    destination:
      type: lightdb-stream
      version: v1

Note that we didn’t even need an additional step, but simply added the transformer to the step that already set lightdb-stream as the destination.

You can see that the Unix epoch timestamp has been popped out of the data and assigned to the LightDB timestamp. Extracting the timestamp is not unique to Golioth’s LightDB Stream service.

Streaming data may be routed anywhere you want it. For instance, if you wanted to send your data to a webhook, just use the webhook destination. If you included the extract-timestamp transformer, your data will arrive at the webhook with the timestamps from your device as part of the metadata instead of nested in the JSON object.

Using a Special Path for Batch Data

What happens if your app wants to send other types of streaming data beyond batch data? The batch destination will automatically drop data that isn’t a list of data objects. But you might like to be more explicit about where you send data, and for that you can easily create a dedicated path to receive batch data.

filter:
  path: "/batch/"
  content_type: application/json
steps:
  - name: step0
    destination:
      type: batch
      version: v1
  - name: step1
    transformer:
      type: extract-timestamp
      version: v1
    destination:
      type: lightdb-stream
      version: v1

This pipeline is nearly the same as before, with the only change on line 2 where the * wildcard in path was replaced with "/batch/". Now we can update the API call in the device firmware to target that path:

int err = golioth_stream_set_async(client,
                                   "batch",
                                   GOLIOTH_CONTENT_TYPE_JSON,
                                   buf,
                                   strlen(buf),
                                   async_push_handler,
                                   NULL);

Although the result hasn’t changed, this does make the intent of the firmware more clear, and it differentiates the intent of this pipeline from others.

Sample Firmware

This is a quick sample firmware I made to use while writing this post. It targets the nRF9160dk. One major caveat is that the function that pulls time from the cellular network is quite rudimentary and should be replaced in anything you plan to use in production.

To try it out, start from the Golioth Hello sample and replace the main.c file. This post was written using v0.14.0 of the Golioth Firmware SDK.

Wrapping Up

Batch data upload is a common request in the IoT realm. Golioth has not only the ability to sort out your batch data uploads, but to route them where you want and even to transform that data as needed. If you want to know more about what Pipelines brings to the party, check out the Pipelines announcement post.

A new Pipelines data destination for Memfault is now generally available for Golioth users. It enables devices to leverage their existing secure connection to Golioth to deliver data containing coredumps, heartbeats, events, and more to Memfault. To see this integration in action, check out the example firmware application, read the documentation, and tune into this week’s Friday Afternoon Stream, where we will be joined by a special guest from Memfault.

Golioth + Memfault

We have long been impressed with the functionality offered by Memfault’s platform, as well as the embedded developer community they have cultivated with the Interrupt blog. In our mission to build a platform that makes connecting constrained devices to the cloud simple, secure, and efficient, we have continuously expanded the set of cloud services that devices can target. This goal has been furthered by the recent launch of Golioth Pipelines.

Memfault’s observability features are highly desired by embedded developers, but leveraging them has typically required establishing a separate HTTP connection from a device to Memfault’s cloud, building custom functionality to relay data from an existing device service to Memfault, or utilizing an intermediate gateway device to provide connectivity. With Golioth, devices already have a secure connection to the cloud for table-stakes device management services and flexible data routing. By adding a Memfault data destination to Golioth Pipelines, that same connection can be used to route a subset of streaming data to Memfault. Leveraging this existing connection saves power and bandwidth on the device, and removes the need to store fleet-wide secrets on deployed devices.

How It Works

The Memfault Firmware SDK provides observability data to an application serialized in the form of chunks. An application can periodically query the packetizer to see if there are new chunks available.

bool data_available = memfault_packetizer_begin(&cfg, &metadata);

When data is available, it can be obtained from the packetizer by either obtaining a single chunk via memfault_packetizer_get_chunk, or by setting enable_multi_packet_chunk to true in the configuration and repeatedly invoking memfault_packetizer_get_next until a kMemfaultPacketizerStatus_EndOfChunk status is returned. The latter strategy allows for obtaining all of the data in a single chunk that would exceed the default size limitations. Golioth leverages this functionality to upload both large and small chunks using CoAP blockwise transfers, a feature that was enabled in our recent v0.14.0 Golioth Firmware SDK release.

golioth_stream_set_blockwise_sync(client,
                                  "mflt",
                                  GOLIOTH_CONTENT_TYPE_OCTET_STREAM,
                                  read_memfault_chunk,
                                  NULL);

The read_memfault_chunk callback will be called repeatedly to populate blocks for upload until the entire chunk has been obtained from the packetizer.

static enum golioth_status read_memfault_chunk(uint32_t block_idx,
                                               uint8_t *block_buffer,
                                               size_t *block_size,
                                               bool *is_last,
                                               void *arg)
{
    eMemfaultPacketizerStatus mflt_status;
    mflt_status = memfault_packetizer_get_next(block_buffer, block_size);
    if (kMemfaultPacketizerStatus_NoMoreData == mflt_status)
    {
        LOG_WRN("Unexpected end of Memfault data");
        *block_size = 0;
        *is_last = true;
    }
    else if (kMemfaultPacketizerStatus_EndOfChunk == mflt_status)
    {
        /* Last block */
        *is_last = true;
    }
    else if (kMemfaultPacketizerStatus_MoreDataForChunk == mflt_status)
    {
        *is_last = false;
    }

    return GOLIOTH_OK;
}

Golioth views the data as any other stream data, which can be delivered to a path of the user’s choosing. In this case, the data is being streamed to the /mflt path, which can be used as a filter in a pipeline.

filter:
  path: "/mflt"
  content_type: application/octet-stream
steps:
  - name: step0
    destination:
      type: memfault
      version: v1
      parameters:
        project_key: $MEMFAULT_PROJECT_KEY

Click here to use this pipeline in your Golioth project!

Because the Memfault Firmware SDK is producing this data, it does not need to be transformed prior to delivery to Memfault’s cloud. Creating the pipeline shown above, as well as a secret named MEMFAULT_PROJECT_KEY that contains a project key for the desired Memfault project, will result in all streaming data on the /mflt path being delivered to the Memfault platform.

Livestream demo with Memfault

Dan from Golioth did a livestream with Noah from Memfault showcasing how this interaction works; check it out below:

What’s Next

We will be continuing to roll out more Pipelines data destinations and transformers in the coming weeks. If you have a use-case in mind, feel free to reach out to us on the forum!

Combining different pieces of technology can have a multiplicative effect. I think that’s what happened with this demo: we paired Wi-Fi locationing, low-cost hardware, Golioth Pipelines, and n8n (an API workflow tool) to create a “geofence”.

A geofence is a virtual perimeter used to set up alerts or take actions once a device moves outside that virtual perimeter. The example we gave in the video is if you had a tracker on your cat and you wanted to take an action once the device was outside a particular area.

Hardware

We’re calling this a “$2 geofence” because it’s enabled by the ESP32-C3, a low-cost module from Espressif. We put this on the Aludel Elixir as a backup connectivity method in case we were again at a conference with no LTE-M coverage.

The ESP-AT firmware does what it sounds like it should do: it responds to AT commands from other microcontrollers talking to it over serial (as many cellular modules also do). One key point is that ESP-AT already works as a connectivity method in Zephyr; in fact, we utilize the ESP-AT firmware as an offloaded Wi-Fi modem when we build and test for the nRF52840 in our Continuously Verified Hardware. Zephyr has an option for using the ESP-AT modem as the main offloaded Wi-Fi modem, which makes it ‘invisible’ to the Zephyr program; since it is built on top of the Wi-Fi subsystem in Zephyr, it acts like any other network interface.

One required change was rewriting how we pull information off the ESP-AT modem. Normally the wifi scan shell command returns the (human-readable) names and signal strengths of all the access points (APs) visible to the modem. Instead, we want MAC addresses and signal strengths, as that’s what’s expected by the API service we’ll describe below.

Golioth Pipeline

We start by scanning Wi-Fi APs and identifying the tower that the cell modem is connected to. Then we publish that data to the Golioth cloud using the Stream service. Because we’re publishing to a specific path (instead of my normal, generic default of “sensor”), we can peel off that data and send it somewhere interesting. How? With pipelines, of course!

I set up the pipeline to watch the path wifi_lte_loc_req (a name of my own making; this could be any arbitrary name). That data gets sent out to a webhook going to n8n. Webhooks more broadly are a generic way to interface between a lot of cloud services, but here we use one to send data into the n8n platform.
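
That pipeline looks roughly like the following sketch; the step name and the secret holding the n8n webhook URL are of my own choosing:

filter:
  path: "/wifi_lte_loc_req"
steps:
  - name: step0
    destination:
      type: webhook
      version: v1
      parameters:
        url: $N8N_WEBHOOK_URL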

n8n

Now that the data is being sent into n8n (a self-hosted instance, no less!), we can start doing interesting things with it. This is an area that is full of similar offerings, some specifically targeted at IoT and others targeted at business workflows.

If you’re newer to working with APIs and tying stuff together, it might take a bit of time to figure out how queries should be structured and how your setup should respond when there are errors.

API service

We send data from the device to Golioth already formatted for what the location API service expects. This is not required in the slightest, as Golioth’s Pipelines can morph and transform data to meet the needs of the endpoint. But…why not? It kind of makes sense to have the device publish data in a format that matches the target API service. Then later, if we decide to re-target an alternative service, we can use transformations to mold the incoming data to what that new service expects.

For this demo, I’m using the here.com API service. I like that its API combines LTE tower + Wi-Fi AP data, which means it will lean on whichever provides a more accurate reading (normally Wi-Fi). Again, this service is one of many! There is a range of API services like this because phones often use the same approach to determine location for apps.

Once we receive the lat, lon, and accuracy, we actually pass the data back to the device using LightDB State. This two-sided database is a good de facto way to send arbitrary data from the cloud to the device. In the case of n8n, we’re pulling through the original project name and device identifier, and then publishing to the Golioth REST API. This makes the data a “round trip” from device to cloud and back down to device.

Logic and alerts

Since the data is already on the cloud in an API marketplace like n8n…why not use that data to do some cloud side processing? In this case, I wanted to set up a geofence to show that we can trigger logic and alerts on the cloud and even call 3rd party APIs like Slack and Twilio.

Geofence alert messages being sent into Slack

I asked ChatGPT to help me out with some JavaScript that calculates a true/false output I can use to trigger downstream logic. We insert the lat/lon data returned from here.com into this algorithm and it tells us whether or not we are inside the “fence”. As of this writing, I am still using a fixed location for the center of the “fence”, as well as a fixed radius. I’m certain it’s possible to make these configurable in n8n or other tools, perhaps via another webhook or a configurable variable.
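
The underlying math is just a great-circle distance check against a radius. Here’s a sketch of that calculation, shown in C rather than the JavaScript used in n8n, with a made-up fence center and radius:

#include <math.h>
#include <stdbool.h>

#define PI 3.14159265358979323846
#define EARTH_RADIUS_M 6371000.0

/* Hypothetical fence center and radius */
static const double fence_lat = 45.523;
static const double fence_lon = -122.676;
static const double fence_radius_m = 500.0;

static double deg2rad(double deg)
{
    return deg * PI / 180.0;
}

/* Haversine (great-circle) distance between two lat/lon points, in meters */
static double distance_m(double lat1, double lon1, double lat2, double lon2)
{
    double dlat = deg2rad(lat2 - lat1);
    double dlon = deg2rad(lon2 - lon1);
    double a = sin(dlat / 2) * sin(dlat / 2) +
               cos(deg2rad(lat1)) * cos(deg2rad(lat2)) *
               sin(dlon / 2) * sin(dlon / 2);
    return 2.0 * EARTH_RADIUS_M * atan2(sqrt(a), sqrt(1.0 - a));
}

/* true if the reported position is inside the fence */
static bool inside_fence(double lat, double lon)
{
    return distance_m(lat, lon, fence_lat, fence_lon) <= fence_radius_m;
}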

Future demos

Hopefully one thing you noticed from this demo is just how much can be enabled with Golioth’s pipelines. Since Golioth takes care of reliably delivering your data to the cloud, the rest is really a matter of configuration. It’s also difficult to know all the different APIs that could be utilized out in the world. Pulling these elements together shows how a hardware or firmware engineer could enact complex device and business logic to create interesting applications out in the real world. If you need any help getting your next project off the ground, stop by our forum!