In a best case scenario, once an IoT fleet is deployed, you never need to (physically) touch them again. Golioth helps give you tools to work with your devices remotely and make this a reality. Today, we’ll look at dynamically modifying the number of logs being sent back to the Cloud. This allows fleet managers to peek into individual devices without needing to waste data and battery power by always sending every log message back to Golioth.

Logging to the cloud is already built into Golioth, so it’s really just a matter of tuning how many logs are being sent by your devices. Golioth hooks into the Zephyr RTOS Logging service, which we’ll be showcasing here today.

Background on Remote Logging with Zephyr

The Golioth Zephyr SDK has remote logging built it, and in our sample applications (like the hello sample) it is enabled by default in the prj.conf files:


At the top of each C file you need to register for logging. This is also a good place to set the default logging level, which I’ll refer to as the “compiled-in” logging level.

#include <zephyr/logging/log.h>
LOG_MODULE_REGISTER(remote_logging_example, LOG_LEVEL_DBG);

This compiled-in level is an important decision because the preprocessor will not include logging calls if they have a higher value than this parameter. For instance, if you set the default to LOG_LEVEL_ERR, you cannot remotely turn on debugging messages because LOG_LEVEL_DBG is a higher logging level. Here is the hierarchy of logging levels in Zephyr:

#define LOG_LEVEL_ERR  1U
#define LOG_LEVEL_WRN  2U
#define LOG_LEVEL_INF  3U
#define LOG_LEVEL_DBG  4U

In our case, the solution is to use LOG_LEVEL_DBG when compiling, and programmatically set the value to a lower level at run time. This will make your binary a bit larger, since the strings for every logging message will be included, but it delivers the option to turn on debugging messages after deployment which in most cases is worth the extra bytes of flash.

Write a Function to Set Log Levels at Run Time

Now we need a function that, when called, will automatically adjust the logging level for every logging source on the device.

#include <zephyr/logging/log_ctrl.h>

void change_logging_level(int log_level) {
    int source_id = 0;
    char *source_name;
    while(1) {
        source_name = (char *)log_source_name_get(0, source_id);
        if (source_name == NULL) {
        } else {
            LOG_WRN("Settings %s log level to: %d", source_name, log_level);
            log_filter_set(NULL, 0, source_id, log_level);

This function uses Zephyr’s logging control to query the name of each logging source that is available on the system. It then uses that source name to set the new logging level.

If you are putting cellular devices into the field, you probably don’t want to have logging turned on very high (or at all) by default since you’ll be paying for bandwidth. The first thing you can do at run time is call this function:


Now, only error messages will be logged to the Golioth cloud.

Setting Log Level Remotely

There are two obvious ways you can go about setting log levels remotely: the Golioth Remote Procedure Call (RPC) service, and the Golioth Settings service. Since the most likely use for this is turning on logs for a single device (and not fleetwide all at the same time), using a remote procedure call makes the most sense to me.

// Remote procedure call (RPC) callback for setting the logging levels
static enum golioth_rpc_status on_set_log_level(QCBORDecodeContext *request_params_array,
                       QCBOREncodeContext *response_detail_map,
                       void *callback_arg)
    double a;
    uint32_t log_level;
    QCBORError qerr;

    QCBORDecode_GetDouble(request_params_array, &a);
    qerr = QCBORDecode_GetError(request_params_array);
    if (qerr != QCBOR_SUCCESS) {
        LOG_ERR("Failed to decode array item: %d (%s)", qerr, qcbor_err_to_str(qerr));

    log_level = (uint32_t)a;

    if ((log_level < 0) || (log_level > LOG_LEVEL_DBG)) {

        LOG_ERR("Requested log level is out of bounds: %d", log_level);

    return GOLIOTH_RPC_OK;

// Register RPC to listen for "set_log_levels"
// This should be called from your "on_connect()" callback
golioth_rpc_register(rpc_client, "set_log_level", on_set_log_level, NULL);

There are two parts to the function above. The first is a callback that will run when a remote procedure call (RPC) instruction is received from the Golioth servers. It will get incoming log level as a parameter, validate it, then run the function we discussed in the previous section to change the log levels.

The second part of the code is the act of registering the RPC. This tells the Golioth servers that this device wants to be notified whenever a callback with the name "set_log_level" is issued from the Golioth web console, or via the Golioth REST API.

Golioth RPC for setting logging levels

The Golioth web console can be used to send an RPC, or you may do so using the Golioth REST API

Here’s an example of using the web console interface to submit this RPC. I sent a request to change to log level 3 (LOG_LEVEL_INF), and upon success/failure we get a notification message back. This RPC was successful, and took 913 milliseconds for the device to receive the message, execute it, and report the results.

[00:00:37.736,663] <inf> app_work: Sending hello! 2
[00:01:07.738,800] <inf> app_work: Sending hello! 3
[00:01:36.947,418] <wrn> app_rpc: Settings golioth_dfu log level to: 3
--- 12 messages dropped ---
[00:01:36.947,448] <wrn> app_rpc: Settings golioth_rd_template log level to: 3
[00:01:36.947,479] <wrn> app_rpc: Settings golioth_samples log level to: 3
[00:01:36.947,479] <wrn> app_rpc: Settings golioth_system log level to: 3
[00:01:36.947,509] <wrn> app_rpc: Settings lightdb log level to: 3
[00:01:36.947,540] <wrn> app_rpc: Settings log log level to: 3
[00:01:36.947,570] <wrn> app_rpc: Settings lte_lc log level to: 3
[00:01:36.947,601] <wrn> app_rpc: Settings lte_lc_helpers log level to: 3
[00:01:36.947,631] <wrn> app_rpc: Settings mcuboot_util log level to: 3
[00:01:36.947,662] <wrn> app_rpc: Settings modem_antenna log level to: 3
[00:01:36.947,692] <wrn> app_rpc: Settings mpu log level to: 3
[00:01:36.947,723] <wrn> app_rpc: Settings net_buf log level to: 3
[00:01:36.947,753] <wrn> app_rpc: Settings net_coap log level to: 3
[00:01:36.947,753] <wrn> app_rpc: Settings net_core log level to: 3
[00:01:36.947,784] <wrn> app_rpc: Settings net_if log level to: 3
[00:01:36.947,814] <wrn> app_rpc: Settings net_shell log level to: 3
[00:01:36.947,845] <wrn> app_rpc: Settings net_sock log level to: 3
[00:01:36.947,875] <wrn> app_rpc: Settings net_sock_addr log level to: 3
[00:01:36.947,906] <wrn> app_rpc: Settings net_sock_tls log level to: 3
[00:01:36.947,937] <wrn> app_rpc: Settings net_sock_wrapper log level to: 3
[00:01:36.947,967] <wrn> app_rpc: Settings net_socket_offload log level to: 3
[00:01:36.947,998] <wrn> app_rpc: Settings net_utils log level to: 3
[00:01:36.948,028] <wrn> app_rpc: Settings nrf_modem log level to: 3
[00:01:36.948,059] <wrn> app_rpc: Settings os log level to: 3
[00:01:36.948,089] <wrn> app_rpc: Settings pm log level to: 3
[00:01:36.948,120] <wrn> app_rpc: Settings settings log level to: 3
[00:01:36.948,150] <wrn> app_rpc: Settings shell.shell_uart log level to: 3
[00:01:36.948,181] <wrn> app_rpc: Settings shell_uart log level to: 3
[00:01:36.948,211] <wrn> app_rpc: Settings soc log level to: 3
[00:01:36.948,242] <wrn> app_rpc: Settings stream log level to: 3
[00:01:36.948,272] <wrn> app_rpc: Settings uart_nrfx_uarte log level to: 3
[00:01:37.741,027] <inf> app_work: Sending hello! 4
[00:02:07.743,041] <inf> app_work: Sending hello! 5

On the device side, we can see the output of our RPC on a serial terminal (above). There are a lot of logging sources running on this device and they have all been individually set to level 3.

Remember, if the compiled-in level for any given source has been set lower (to only show errors, or to show no logging), setting a higher number at run time will not return additional messages because higher-level messages were not included in the build.

There and Back Again

What does it take to troubleshoot an IoT device in the field? If designed correctly, it will not take a physical visit to the device, but merely a few remote communications. Ideally, you will turn on debugging, analyze the issues you’re having, and then send a command to adjust accordingly.

But even if you haven’t planned very far ahead, with Golioth you can still enable these features. We recommend that every device you put in the field have Golioth Over-the-Air (OTA) firmware updates enabled. That way, you can send these remote-logging features to your devices, even if they are already in the field.

Do you have questions or suggestions on adjusting logging levels remotely? We’d love to hear from you on the Golioth Forum!

One of my first engineering jobs was in the Test and Measurement space. Tracing everything back to standards and calibration is a key part of the process. It takes a long time and is taken very seriously. I had many learning experiences that reinforced the importance of calibration (despite my tongue-in-cheek article title).

IoT devices often don’t have the luxury of being as accurate: the cost of sensors, the power usage of analog measurements, the “awake” windows for battery power devices… all of these things contribute to different priorities while measuring the physical world. However, if you have a precise (repeatable) sensor, you can utilize the trend of the data in a useful way.

If you ARE going to calibrate, it’s normally done in a sensor library. For certain sensors, Zephyr excels at this. There are built in sensor libraries that return standard values. You can even pull calibrated and normalized readings from the sensor in real time from Zephyr’s sensor shell. I consistently use sensors like the BME280 and the LIS2DH in my reference designs since it is really easy to add the readings to the pile of data I send back to Golioth.

What happens when a sensor driver is not in tree and I don’t want to go about writing my own sensor driver? I pull the raw readings to the cloud and work with them there!

Prototyping, not production

When you are trying to prove your IoT system can work or test out a business idea, you don’t always start by making production ready designs. But you can still make useful designs by pulling raw data readings.

I did this recently with an i2c sensor that was pulling in ADC readings. The “counts” that I gathered from the ADC are tied to physical phenomenon (in this case, soil moisture), but they are not absolute values. Instead I am able to gather the readings using direct ADC readings and then publish them to Golioth’s LightDB Stream service.

Below, I will discuss three ways you can interact with this data using other Golioth services and interactively produce even more useful data. These include tweaking threshold settings, calibrating via remote procedure call (RPC), and using OTA to upload new machine learning sets.

Set “thresholds” with Settings

One of my favorite things to use the Golioth Settings Service for is interacting with data and creating an extra layer of intelligence. For the IoT Trashcan Monitor reference design, I do this to set “threshold” on each device in the field. There is a time-of-flight sensor that gathers the distance from the top of the trashcan to the top of the garbage in that trashcan. What happens if you want to utilize the same sensor but on different sized trashcans? What if you want to set the “100%” level of the trashcan differently, say if one part of the national park needs to have a cleaner look than another?

You could keep this intelligence on the Cloud, but then you don’t get the benefit of the device reporting it’s various levels, so it’s harder to read from a terminal or in the logs. I push a level I created on the Settings Service down to each device that defines different thresholds:

I also utilize these levels coming back from each trashcan to trigger icons on our Grafana dashboards, which makes it even easier to tell which devices require intervention.

The above is just for the Trashcan example, there are loads of other examples where you might want to have field-settable values, from the installer or technician. In the case of the soil moisture sensor, I want to be able to calibrate “wet” and “dry” (per the simplistic calibration instructions) and then do some interpolation in between.

On the fly modifications with Remote Procedure Calls

Most of the time when using raw data from a sensor, it is done in a “device to cloud” context. You take the reading from the sensor and ship it off to be dealt with by larger computers (AKA “the cloud”). However, some applications will include the need for some kind of feedback from the cloud computing element. You could argue this is what we were doing above, since the Settings Service is pushing data down to the device.

Sometimes you want to be able to inject data into your data measurement and management process on the device, which is a perfect use case for Remote Procedure Calls (RPCs). One way to think of RPCs is if you were accessing a function between two different parts of your code. You put the function prototype in your code.h file… except now you can access that function from the cloud. So maybe I don’t want to do a full calibration on the device, but it is beneficial to set an offset for something like an i2c-based thermocouple measurement chip like the MCP96L01T. I could easily pipe raw data from the chip up to the cloud, but I might want to change a setting for the resolution of that data or the cold junction compensation temperature.

Registers on the MCP960X family of parts. You could write a function to change these values with raw i2c writes to the chip and have the function be accessible via a Golioth RPC.

I could have a function on the device that I use during start that sets these values like set_thermocouple_resolution() and set_cold_junction_temp_c(), which I use during startup. Under the covers they would be simple i2c writes to the device to set registers with bit-masked values. However, I could also expose these to the cloud using an RPC. When I call that RPC from the REST API or the Golioth Console, I also pass a value (in this case a new resolution setting or cold junction temp), it gets validated on the device as an acceptable value, and then a success message is sent back to the cloud once its executed. Add in logging messages into the device-side code, and you should be able to easily see that the device has successfully switched modes and the raw data being piped back is now different. (The change means a higher resolution or a different cold junction compensation.)

Machine Learning + data capture + OTA

The ultimate (and most trendy) way to deal with sensor data is to not care at all about the sensor output. Instead, capture and correlate data with desired behaviors; collect data with a “known good” and a “known bad” state of your machine, for instance, to allow the model to discern between the two.

Golioth has the tools to enhance this method of working with data, on a local machine or from afar. First, you can capture general data using LightDB stream, sending back your raw data readings from your sensor. If desired, you could also link a button, switch, or other input on the device to correlate when you are performing “Action A” that is different from “Action B”. Next, you capture and ingest that data into a machine learning algorithm like PyTorch or TinyML. Finally, when you create or revise your model and build it into your design, you can upload that new firmware using the Golioth Over-The-Air (OTA) service. Over time, you can continue to refine the model and even combine it with the other methods mentioned above.

Start prototyping today!

One thing I hope you get out of reading this article is the prototyping capabilities available to you when you use Golioth. While I benefit from the Zephyr driver ecosystem (as well as driver ecosystems from other SDKs that we support), I don’t want to feel limited when I want to try out a new chip that isn’t in-tree yet. I feel comfortable doing things like raw i2c and SPI read/write functions, and now can make that data available to the Cloud for even faster prototyping. If you need help prototyping your next IoT device, stop by our Forum to discuss or send an email to [email protected].

Cellular IoT products struggle with battery life. Getting and staying connected to a cell tower takes a good amount of energy. Though we’re past the days of GSM drawing amps of current (!), it is still costly to open sessions to the tower. Understanding how your device is communicating with the Cloud is crucial to building robust devices that will last years in the field.

This webinar will include Golioth team members, alongside Jared Wolff of CircuitDojo. Jared is an early adopter of Golioth and a Golioth Ambassador, as well as an advocate for Zephyr devices. Golioth utilizes Jared’s nRF9160 Feather board designs in all of our current cellular-based Reference Designs.

What you can expect from this Webinar

First and foremost, we hope to make this a more dynamic and interactive session than many technical webinars. (No one will be reading Powerpoint slides in a monotone voice here!) The session will cover:

  • Understanding your connection to cellular towers
  • Understanding your power draw when in a passive or sleep mode
  • Measurement challenges for embedded cellular projects
  • Methods for saving data (and power) when connecting to the cloud
  • Building robust device health metrics for your fleet
  • System architecture decisions for lower power circuit boards

Sign up now!

This webinar is at 1 pm EST / 10 am PST on January 18th, 2023. If you can’t make it the day of, you can still still sign up to access the on-demand content. Those who attend will have an opportunity to ask questions towards the end of the presentation.

Luis Ubieda is the Lead Firmware Engineer at Croxel. He has a background in Electrical Engineering and is passionate about Electronics, Embedded Systems, and IoT Technology.

Embedded systems are riddled with complexity, mainly because they are at the intersection of expertise. Problem are often a mixture of software, electrical and mechanical issues. This is the case even for seemingly simple tasks, such as reliably detecting a button-press.

“Wait — a button-press??”

Let’s think for a second, how a button press is detected:

  1. Initial State: a button-press typically consists of a “normally-open switch”, which through a pull-up/down resistor, is normally “high” or normally ‘low’: this is the initial state.
  2. Press Event: then, when the button is pressed, the switch is closed and the signal transitions towards the opposite state (e.g, low for normally “high” state), and it’s sustained for as long as the button is held by the user.
  3. Release Event: Finally, when the button is released, it goes back to its initial state.
diagram showing the busy electrical signal produced at the beginning of a button press

Image: Signal during a button-press/release sequence where the transitions are outlined; and the scoped signal has noise. Source: GeeksforGeeks.

In an ideal world, we could just look at the signal edges to keep track of the transitions and assume a falling edge is a “press-event” and rising edge the “release event”. In our world, these transitions are affected by electrical transients caused by the mechanical properties of the button actuator. The suppression of this noise it’s commonly called “debouncing”.

“Ok, I get it. How can we `debounce` button-presses?”

There are two main ways we can approach this: the hardware-way and the software-way. Today I’ll touch on both, and detail the technique I prefer to use when debouncing button input with the Zephyr RTOS.

The Hardware Way

The hardware-way focuses on the root cause: the electrical noise. It guarantees the digital signal won’t have such noise during transitions; and it does it through the use of low-pass filters (most probably RC-filters). There are some pretty cool articles that detail this approach (see references at the end of the article).

The Software Way

On the other hand, the software-way is about “ignoring” these false positives on the signal transitions to determine which ones are the real events we’re looking for, and which ones aren’t. Even though there are many ways implement software debouncing, there are two main approaches, depending on whether the variable of interest is the signal state or the transitions: periodic sampling and tracking edge interrupts.

A. Periodic Button State Sampling

Button sampling works by periodically acquiring the signal state, which is buffered on a continuously rolling sample-set. Through detection of consecutive states, the signal change event is detected (either pressed or released). The rule is simple: if there are X-number of consecutive samples with a changed state, we assume the transition really happened. This periodic sampling is often in the order of 10 to 25-ms and is commonly paced by a hardware timer to guarantee fixed intervals and to free some CPU usage.

diagram showing a series of 1 and 0 signals sampled to detect a button press

B. Tracking Edge-Interrupts with Minimum Cooldown

This works by coordinating the detection of edge changes with the spacing between these: there must be a minimum duration before legitimate signal transition. This cooldown phase is commonly implemented through a timer, which kicks-off on the edge-interrupts: the timer gets restarted on each transition and only when it expires (after 10-ms to 25-ms of no edge-changes), the firmware handles the transition as an event.

diagram showing the ignored edges of a button press signal

Software Approach B: Tracking Edge-Interrupts with Minimum Cooldown

In multi-threaded systems, we can leverage the use of RTOS primitives to make the code more modular while simplifying the logic to achieve the same purpose: featuring a thread and semaphores to control the state transitions and decide when to notify the user of the module when an event occurred.

Example Code – Zephyr RTOS

The following code presents an example of achieving the software approach B on Zephyr, with the following observations:

  • We’re using Zephyr GPIO interrupt APIs to keep track of the edge-changes.
  • We’re using the system workqueue as the cooldown mechanism for false positives.
  • Our button-detection module, abstracts both of these details, and only notifies the user of the relevant events: pressed or released.
  • Note: the user callback context is the workqueue handler, therefore: actions on this context shall be kept brief to allow proper functioning of other parts of the system.
#ifndef _BUTTON_H_
#define _BUTTON_H_
enum button_evt {
typedef void (*button_event_handler_t)(enum button_evt evt);
int button_init(button_event_handler_t handler);
#endif /* _BUTTON_H_ */
#include <zephyr/kernel.h>
#include <zephyr/drivers/gpio.h>
#include "button.h"
#define SW0_NODE    DT_ALIAS(sw0)
static const struct gpio_dt_spec button = GPIO_DT_SPEC_GET(SW0_NODE, gpios);
static struct gpio_callback button_cb_data;
static button_event_handler_t user_cb;
static void cooldown_expired(struct k_work *work)
   int val = gpio_pin_get_dt(&button);
   enum button_evt evt = val ? BUTTON_EVT_PRESSED : BUTTON_EVT_RELEASED;
   if (user_cb) {
static K_WORK_DELAYABLE_DEFINE(cooldown_work, cooldown_expired);
void button_pressed(const struct device *dev, struct gpio_callback *cb,
           uint32_t pins)
   k_work_reschedule(&cooldown_work, K_MSEC(15));
int button_init(button_event_handler_t handler)
   int err = -1;
   if (!handler) {
       return -EINVAL;
   user_cb = handler;
   if (!device_is_ready(button.port)) {
       return -EIO;
   err = gpio_pin_configure_dt(&button, GPIO_INPUT);
   if (err) {
       return err;
   err = gpio_pin_interrupt_configure_dt(&button, GPIO_INT_EDGE_BOTH);
   if (err) {
       return err;
   gpio_init_callback(&button_cb_data, button_pressed, BIT(;
   err = gpio_add_callback(button.port, &button_cb_data);
   if (err) {
       return err;
   return 0;

Check out the working sample code on


The most important part of debouncing inputs is to understand how far you should go and which approach best suits you. Cost sensitivity tends to favor software debouncing, whereas less CPU usage favors offloading it to the hardware approach. Like any engineering problem, there are 1000 ways to solve it: always favor the simplest (yet effective) solution that works for you.


As members and contributors to the Zephyr Project, we keep an eye on new developments. A recently published feature of particular interest because it represents a new way to structure programs and messaging between different parts of your program.

ZBus (Zephyr Bus) is a recently merged feature on the Zephyr Project, which brings a standardized version of event driven architecture in the form of a publish and subscribe model inside of your program. The lead author Rodrigo Peixoto spoke with Golioth about the details of this new feature and how it might help Golioth users to make more responsive, flexible programs. In the associated video, Rodrigo walks through the history and capabilities of ZBus.

Why should you consider an event-driven bus architecture?

The decision to take on a new system architecture is not something to do willy nilly. It’s important to understand where an event driven architecture is a good fit.

The first that I think about is scalability. When you have a bus architecture, adding an additional “listener” can be done with much less work.

Consider the alternative to event-driven architecture. When you want to add a new action (say some code that initiates a sensor reading), you need to call that new code from your trigger event (say a timer being finished). That means you need to know where the trigger is located in the code and make changes there to add the call to your new task. Once you add in the required testing, each additional feature can become burdensom. This scales poorly.

With an event-driven system, the trigger is already set up to publish an event. New tasks can be added that look for the event. You don’t need to change any trigger code, you don’t even need to know where that code lives. This performs well as the amount of data increases, which will ultimately depend on how large you think your system will be.

Flexibility is another consideration. An event-driven bus allows the system to be easily adapted to handle new events or changes in the environment. This means that the system can be easily updated and modified without having to completely rewrite large swaths of code. Another type of “flexibility” is how and where you can re-use your code. This makes it easier to develop and test your system, as well as to troubleshoot and fix any problems that may arise.

Finally, if your device needs to meet critical timing, an event-driven system will not only deal with higher levels of complexity, but also respond quickly to new inputs to the system, such as external events. For example, an embedded system might be designed to control a robot, and it would use event-driven architecture to respond to sensor data from the robot’s environment and control its movements accordingly.

How ZBus implements an event-driven architecture

In ZBus, there are “producers” that generate messages and “consumers” that act upon them. There are also “Filters” help to process raw data (such as a sensor output). Each of these are organized into different “channels” to allow listening on a particular lane of data being produced.

Source: Rodrigo’s ZBus presentation (click for link)

Source: Rodrigo’s ZBus presentation (click for link)

Source: Rodrigo’s ZBus presentation (click for link)

Each of these are built into normal scenarios such as listening for timers and taking a reading from a sensor and then alerting other parts of the program that the data is now available. A common scenario is show below:

How will you use ZBus?

Microcontrollers are used in increasingly complex scenarios and are being asked to do more and more. Connecting a low power device to the internet often requires higher levels of complexity that Zephyr helps with. We expect to see more devices using Ecosystems and RTOSes like Zephyr in the future, and implementing ZBus on high complexity devices.

Are you looking at using an event-driven architecture in your system? Let us know on our forum and tell us how we can help!

A key mission at Golioth is to make it easier for hardware and firmware developers to connect devices to the internet. We do that in two ways:

  1. Providing easy-to-use APIs and SDKs for IoT devices to connect to Golioth Cloud endpoints.
  2. Training developers how to use the Device side code.

We have done many successful training sessions so far, showing individuals and companies how to connect their first devices. Along the way, we have learned that there is a large unfulfilled need in the market for training in the IoT space. So we’re doing it again! We’ll show people how to connect devices and get access to things like:

  • Secure Over-The-Air Updates to constrained devices
  • Command and Control over remote devices
  • Learn how to create and modify settings for remote devices
  • Understand how to implement data tracking from your device

We will be running our first training open to the public on December 14th, 2022. Read more below if you’d like to take part.

Training challenges

Once again we’ll be training developers from afar. We did this back in October for a select group of hardware engineers looking to learn more about Zephyr:

Click to learn more about our experience back in October of 2022

The upcoming training will be built upon the lessons we learned during that training, and our last in-person training at the Hackaday Superconference. In both cases, we used Kasm to provide fully remote development environments so that users don’t need to install anything on their local machine (there are directions on how to do that after the training is over). We think this is an important piece to ensure people can get started quickly.

How to use Zephyr

We currently offer 3 SDKs as part of our device support, including an ESP-IDF SDK, a Modus Toolbox SDK, and a Zephyr SDK (including the Nordic Connect SDK variant). These SDKs cover a wide range of embedded hardware from different vendors.

The training includes some segments that detail how to use Zephyr, a Real Time Operating System (RTOS) that covers a wide range of different hardware platforms. We use it on many of our internal hardware reference designs at Golioth, and it was the first platform we launched. Hardware and firmware engineers who are new to Real Time Operating Systems will continue their learning journey by understanding how the RTOS connects to sensors and low level GPIO and how to manipulate different elements of the subsystems. Once a trainee understands how to get the data off of an external component (like a sensor), the Golioth Zephyr SDK makes it a simple task to forward that data along to the Golioth Cloud.

Requirements and background

We have referred to “Hardware and Firmware Engineers” in this article, because we expect that intermediate to expert level engineers will get the most out of this training. If you are brand new to understanding C or if you have never tried programming embedded hardware before, this might be a frustrating experience. If you would like some pointers to starter content that might prepare you for the training, please ask on our Forum and we will try to get you a customized list of resources that will help prepare you for future versions of this training.


  • We are not charging for this training
  • We will be capping the training at 30 people
  • All attendees will be on a first-come, first-served basis
  • Those who are accepted for this training will receive an email with more details
  • You will be expected to purchase your own hardware
    • Details will be sent with your acceptance to this training
    • Be sure you leave enough time for shipping from your local distributor
  • Signing up to take part and not attending will disqualify you from future training

Sign up here

If you have clicked “submit” and don’t see any changes, please scroll back up to the top

The Golioth Zephyr SDK is now 100% easier to use. The new v0.4.0 release was tagged on Wednesday and delivers all of the Golioth features with less coding work needed to use them in your application. Highlights include:

  • Asynchronous function/callback declaration is greatly simplified
    • User-defined data can now be passed to callbacks
  • Synchronous versions of each function were added
  • API is now CoAP agnostic (reply handling happens transparently)
  • User code no longer needs to register an on_message() callback
  • Verified with the latest Zephyr v3.2.0 and NCS v2.1.0

The release brings with it many other improvements that make your applications better, even without knowing about them. For the curious, check out the changelog for full details.

Update your code for the new API (or not)

The new API includes some breaking changes and to deal with this you have two main options:

  1. Update legacy code to use the new API
  2. Lock your legacy code to a previous Golioth version

1. Updating legacy code

The Golioth Docs have been updated for the new API, and reading through the Firmware section will give you a great handle on how everything works. The source of truth continues to be the Golioth Zephyr SDK reference (Doxygen).

Updating to the new API is not difficult. I’ve just finished working on that for a number of our hardware demos, including the magtag-demo repository we use for our Developer Training sessions. The structure of your program will remain largely the same, with Golioth function names and parameters being the most noticeable change.

Consider the following code that uses the new API to perform an asynchronous get operation for LightDB State data:

/* The callback function */
static int counter_handler(struct golioth_req_rsp *rsp)
    if (rsp->err) {
        LOG_ERR("Failed to receive counter value: %d", rsp->err);
        return rsp->err;

    LOG_HEXDUMP_INF(rsp->data, rsp->len, "Counter (async)");

    return 0;

/* Register the LightDB Get callback from somewhere in your code */
static int my_function(void)
    int err;
    err = golioth_lightdb_get_cb(client, "counter",
                                 counter_handler, NULL);

Previously, the application code would have needed to allocate a coap_reply, pass it as a parameter in the get function call, use the on_message callback to process the reply, then unpack the payload in the reply callback before acting on it. All of that busy work is gone now!

With the move to the v0.4.0 API, we don’t need to worry about anything other than:

  • Registering the callback function
  • Working with the data (or an error message) when we hear back from Golioth.

You can see the response struct makes the data itself, the length of the data, and the error message available in a very straightforward way.

A keen eye already noticed the NULL as the final parameter. This is a void * type that lets you pass your user-defined data to the callback. Any value that’s 4-bytes or less can be passed directly, or you can pass a pointer to a struct packed with information. Just be sure to be mindful of the memory allocation lifespan of what you pass.

All of the asynchronous API function calls follow this same pattern for callbacks and requests. The synchronous calls are equally simple to understand. I found the Golioth sample applications to be a great blueprint for updating the API calls in my application code. The changelog also mentions each API-altering commit which you may find useful for additional migration info.

The Golioth Forum is a great place to ask questions and share your tips and tricks when getting to know the new syntax.

2. Locking older projects to an earlier Golioth

While we recommend updating your applications, if you do have the option to continue using an older version of Golioth instead. For that, we recommend using a west manifest to lock your project to a specific version of Golioth.

Manifest files specify the repository URL and tag/hash/branch that should be checked out. That version is used when running west update, which then imports a version of Zephyr and all supporting modules specified in the Golioth SDK manifest to be sure they can build the project in peace and harmony.

By adding a manifest to your project that references the Golioth Zephyr SDK v0.3.1 (the latest stable version before the API change) you can ensure your application will build in the future without the need to change your code. Please see our forum thread on using a west manifest to set up a “standalone” repository for more information.

A friendlier interface improves your Zephyr experience

Version 0.4.0 of the Golioth Zephyr SDK greatly improves the ease-of-use when adding device management features to your IoT applications. Device credentials and a few easy-to-use APIs are all it takes to build in data handling, command and control, and Over-the-Air updates into your device firmware. With Golioth’s Dev Tier your first 50 devices are free so you can start today, and as always, get in touch with us if you have any questions along the way!

One of the best tools in the Zephyr ecosystem is the ability to include different code modules in the build configuration. This feature leverages CMake and Kconfig, two tools that are core to Zephyr. But these tools aren’t limited to officially approved code modules. Today we’ll cover some CMake and Kconfig tricks for including common code in your own Zephyr applications.

The problem: common code across different variations of your app

Earlier this month we posted about the developer training that Golioth offers. It centers around a hardware development board called the MagTag, for which we’ve put together a code repository with many different examples. This includes code to drive the ePaper display, update four ws2812 LEDs, service button presses, and parse JSON objects.

The problem is that we need to use a common set of code in most, but not all of the training samples.

Luckily we’ve seen this problem before. The samples in the Golioth Zephyr SDK use common code for networking and shell settings features. Each feature from that common code has its own KConfig symbol, like GOLIOTH_SAMPLE_WIFI_SETTINGS, that is selected to include it in the build. By defining these in the board-specific conf files, libraries are only built for the devices that actually need them. For instance, an Ethernet-connected device doesn’t need WiFi settings.

The example from the Golioth SDK is a good one to study, but the MagTag implementation is a bit simpler so let’s walk through that code.

Golioth’s Developer Training is a self-guided experience that you can explore at your own pace. If you are interested in setting up a training with Golioth staff for your organization, please contact our Developer Relations team, we’d love to chat!

Directory Structure

Directory structure for a Zephyr project with multiple appsFrom the directory structure of the training you can see that we have five different Zephyr programs in this repository, each with their own boards and src subdirectories.

Common code for each app is placed in the magtag-common directory. Header files for each library are located in the the magtag-common/include/magtag-common directory. That might seem a bit convoluted, but it makes for sensible include names like #include "magtag-common/buttons.h".

Creating Kconfig symbols

Common code is individually enabled with a Kconfig symbol. There are a few ways to approach this. One of the easiest is to assign one symbol to indicate your app uses the common code, and then one symbol for each specific library in the common folder. This is done with a Kconfig file inside of the magtag-common directory.

menuconfig MAGTAG_COMMON
    bool "Common helper code for Golioth MagTag Demo"
      Build and link common code that is shared across MagTag samples.


    bool "Handle accelerometer readings"
      Get accel from DeviceTree and write readings to shared sensor struct

    bool "Process button reads"
      Configure buttons for interrupts with callbacks

    bool "2.9\" grayscale ePaper driver"
      Hardware driver for ePaper, including text and partial writes

config MAGTAG_WS2812
    bool "ws2812 helper functions"
      Intialize and update LED color and state


From this code snippet you can see the library-specific symbols are only defined if the MAGTAG_COMMON symbol has been selected.

Using CMake to build in the libraries

Now that we’ve created the Kconfig symbols, a CMakeLists.txt file in the magtag-common directory is used to build in the code based.

zephyr_library_sources_ifdef(CONFIG_MAGTAG_ACCELEROMETER accelerometer/accel.c)
zephyr_library_sources_ifdef(CONFIG_MAGTAG_BUTTONS buttons/buttons.c)
zephyr_library_sources_ifdef(CONFIG_MAGTAG_EPAPER epaper/magtag_epaper.c)
zephyr_library_sources_ifdef(CONFIG_MAGTAG_EPAPER epaper/magtag_epaper_hal.c)
zephyr_library_sources_ifdef(CONFIG_MAGTAG_WS2812 ws2812/ws2812_control.c)


The include directory is added by default, but the source files are only added if the associated symbol is defined.

Using the common code in a Zephyr application

So far, the common code is completely separate from the each of the apps we want to build. Let’s use the golioth-demo app as an example. To tell the app about the common code, the subdirectory needs to be added to the CMakeLists.txt file in the application subfolder.

cmake_minimum_required(VERSION 3.20.0)

list(APPEND OVERLAY_CONFIG "../credentials.conf")


add_subdirectory_ifdef(CONFIG_MAGTAG_COMMON ../magtag-common magtag-common)

target_sources(app PRIVATE src/main.c)

This codes makes sure that the common directory is only added to the build when the MAGTAG_COMMON symbol is selected. This completes the plumbing work. In this example, the app selects the common code in the prj.conf file:

# MagTag Common Files

And of course, you must include the header files to access the library functions. Here’s an excerpt of the main.c from the golioth-demo app:

/* MagTag specific hardware includes */
#include "magtag-common/magtag_epaper.h"
#include "magtag-common/ws2812_control.h"
#include "magtag-common/accel.h"
#include "magtag-common/buttons.h"

Wrapping up

MagTag Kconfig options shown in menuconfig

Abstracting your common code makes it a lot easier to maintain. This approach also adds entries in menuconfig for each of the libraries. This is a user-friendly feature as it allows you to convey more details in the help file. For more complex uses (like the Golioth samples common libs) it also makes it easier to see the symbol dependency hierarchy.

This example shows the libraries as a part of the repository, however this will work just as well if you commit them to their own repo. From there, they can be included as a git submodule, or with a little creativity you can get your west manifest file to pull the repo whenever west update is run. This is the approach that we’ll be taking with future development. Doing so allows us to lock the project to a specific hash of the common code so that future changes don’t break application code.

Understanding hardware and firmware is the first step towards building any kind of “thing” that will be part of an “Internet of Things” (IoT) deployment. For new chips and firmware ecosystems, this often means going through training.

Golioth Developer Relations (the group I’m part of) is tasked with making sure our users understand how to get their hardware connected to the Golioth Cloud. We have been doing in-person (as safety protocols allow) and remote training for Golioth users.

Last week, we did our first training for the broader public, a group of hardware engineers interested in learning more about Zephyr and Golioth. I wanted to discuss some of the tools and methods we used to do a fully remote hardware training. If you’re interested in taking part in the training in the future, please let us know in the associated forum post or by email. There are more details on the available options at the end of this post.

How we do fully remote hardware training

The main challenge with virtual gatherings is that trainees are not sitting in the same room as the trainers. This is an easy problem to understand, but a harder one to solve.


The first step was standardization on the hardware. To achieve that goal, we created a list of the bare minimum of parts and shipped them out to all attendees, even those who already had some of the hardware. We wanted to make sure that we had exactly the same hardware in hand. Our trainees were in the US and Europe, so we ordered through DigiKey, with DHL shipping at least a week in advance. The average cost was about $60 of materials in the US, and $85 to Europe, both numbers including shipping, VAT, tarriffs, etc.

For the hardware itself, we use the Adafruit MagTag. It contains the Espressif ESP32-S2 as the main processor on board. This time around we focused on Zephyr for this training, but the board also works great with the Golioth ESP-IDF SDK. Remember, Golioth has 3 SDKs currently (vanilla Zephyr, NCS, ESP-IDF), and we are always working to enable other hardware platforms.

Adafruit MagTag

Video meeting

You might think that after 3 years of a global pandemic, the world would have figured out video software. But the key difference between a simple video call and our remote training is that it’s not just 10 people sitting in a single video session. We want to be able to dynamically “move” between different conversations, especially as trainers.

To achieve this, we used It utilizes tiny avatars you can move around the screen with your keyboard like a video game. As your avatar gets closer to another avatar, your video and audio pops up to those in range. The rugs in the image below also allow “private” areas which work kind of like mini conference rooms. We found this allowed us to dynamically help smaller groups of people while also making announcements to the entire group. Trainees could also choose to work together on different components of the training material.

Our Training Location for Hardware Developers


The biggest challenge with any training is dealing with firmware tooling, specifically with Zephyr. We have been working on solving this issue for a while. The Zephyr installation continues to get simpler overall, thanks to the Zephyr dev team updating what gets installed for each user. But we still have trainees “walking through the door” (figurative in this case) with a wide range of computers where the tools will be installed. A range of different operating systems (Windows, Mac, Linux–with different distributions) means that there is no one standardized setup. What’s more, if someone has tried out Zephyr in the past, it’s possible there are interactions between the old installation and the new one, not to mention a variety of dependencies within each machine. If we could, we would ask everyone to show up with a brand new installation of Windows or Linux on a laptop…but that’s unlikely to happen.

We wrote in the past about using Kasm and Docker to have a standardized install. Last week’s training represents the first time we used it with real users. In short, it worked! We collaborated with the wonderful team at KasmWeb (makers of the Kasm tool) who gave us seats of their Cloud offering to use with our trainees.

Zero install embedded training with Zephyr using Kasm and Docker

The first step was created a dockerfile and a Docker image that contained everything required for the training. This way we were able to spin up a standardized, browser based UI environment for everyone taking the training. The entire Zephyr toolchain and Espressif support tools were pre-installed in that image, in addition to VScode, a terminal, a browser, and text editing tools.

Trainees open a browser window that looks like Ubuntu (it is, under the hood), and immediately start compiling Zephyr code. At past in-person training sessions, we had issues with bandwidth just downloading the tool packages. While we still needed bandwidth to communicate with the browser-based UI, all of the processing and storage is done on the cloud. Our plan is to use Kasm at our next in-person training.

We continue to evolve our strategy for making training a more seamless process and offering a range of different training to our customers.

Kasm instance of a Docker container with Zephyr / Espressif / Golioth tools installed

Training material

The final piece of the puzzle is the training material itself. Again, this is an evolving solution, because we continue to add more material, and because we track changes to the Zephyr SDK over time.

On, we take people through getting signed up for Golioth, creating a first binary for the MagTag using their Golioth credentials, and then launch them into understanding different parts of Zephyr. We take developers all the way through working with the DeviceTree and understanding how to interact with an in-tree sensor in Zephyr. Our goal is to help hardware and firmware engineers understand how to use some of the most critical parts of an RTOS, and then use those features to communicate back to the Cloud, thereby making their device an “IoT” based device.

If you’re interested in trying the training without Golioth present, you can do the “asynchronous” version of our training. is free to the public. You won’t have access to our Kasm server, but you will be able to install the Zephyr toolchain using supplemental directions. Please utilize the Golioth Forums for any questions you have about training.


We scheduled a 2 hour training session, as we know that engineers’ time is valuable. In that time, 100% of the trainees were able to get their devices connected to the internet, pass data back and forth between Device and Cloud, and explore the features of Zephyr. As in any training, each person worked at a different pace and was able to get to different checkpoints. Each trainee could use their hardware and training site at home for later study.

An unfortunate side effect of having multiple tools, including training materials, browser based UI, and video software is a lot of tab/window switching to keep track of it all. In-person training will cut down on the video window, but it would be nice to further integrate the training to include all required materials in one place. We had a “browser within a browser” (Firefox installed within the Kasm container), but that consumed a lot of virtual memory and wasn’t an optimal solution.

The other major restriction is that the binaries our trainees compiled were not directly loaded onto the MagTag hardware. Because we worked inside a virtual (browser based) container, there was no direct connection to the hardware. We also talked about this in the past article about Kasm based development. Because virtualization tools are created to be independent from the hardware, it is difficult to then reverse course and say, “We want this container to talk to USB!”.

We are working with the Kasm team and looking at different ways to program binaries directly from inside the container, and would love to hear about other options people have seen. We solved this during the training by having trainees install the Espressif esptooldirectly on their machines. This still involved some amount of install, but it was limited to a small program. When a trainee completed a build, they could download the binary image from the container to their local machine and load it onto the MagTag using esptool. After that, the MagTag was able to talk back to Golioth.

Future training

We are excited to continue iterating on this training in the future and would love to hear from people that are interested in participating. Here are the options you can try:

Zephyr threads, workers, and message queues

Zephyr is a Real Time Operating System (RTOS). That means it’s built to let you do multiple things at the same time using a single (or limited number) of processing cores. To be fair, you’re not doing things at the same time; the RTOS shares processing time across all of the tasks, with a priority system for the more important ones.

The trick with an RTOS is to design your applications so that they utilize the real-time-ness and you don’t miss reacting to real-time events like receiving data from a network, taking sensor readings, or reacting to user input.

In this post, we’ll discuss the difference between Zephyr threads and work queues and show you why you might want to use one versus the other. We’ll also discuss how to use message queues to pass around data between running processes.

Zephyr Threads

A thread is a set of commands to be executed by the CPU. The while(1) loop that runs inside of the main function of your application is a thread. But Zephyr allows you to create multiple threads. Each thread operates independently, with Zephyr’s built-in scheduler deciding when to run each thread.

For example, the Golioth Blue Demo run animations on the LEDs to indicate what mode the device was in. Instead of trying to get the loop in the main thread to update the lights on a tight schedule, we use a thread to run the animations. The main thread of the program deals with flow of the things happening on the device, and it can suspend/resume the animation thread as necessary.

I like to think of threads as tasks that need keep running, things that don’t return like a discrete function would. Your thread should have a while(1) and can call other functions just like you would in your main loop. It is up to you to yield processor control back to the scheduler so that it may run other threads. This is pretty easy, just call any of the k_sleep() functions or k_yield().

/* Every thread needs its own stack in RAM */

/* Define and initialize the new thread */
            animate_ping_thread, NULL, NULL, NULL,

/* All animations handled by this thread */
extern void animate_ping_thread(void *d0, void *d1, void *d2) {
    uint8_t spinner_idx = 0;
    while(1) {
        led_state &= ~LEDS_LOGO;
        led_state |= leds_logo_order[spinner_idx];
        if (++spinner_idx > 7) spinner_idx = 0;
        /* Control is yielded back to the scheduler during */
        /* sleep between animation frames                  */

int main(void) {
    /* Start animation thread running */

    while(1) {
        /* normal program flow */

The example shown here runs an animation when an IoT device is trying to connect to a network. When it begins running, the LEDs will update every 100 milliseconds. Because the loop sleeps in between frames, control is given back to the scheduler or other tasks. Elsewhere in the program, we call k_thread_suspend(animate_ping_tid) and k_thread_resume(animate_ping_tid) to stop and start the animation.

Just be careful that you don’t have multiple threads trying to access the same resource at the same time (protect those resources with a mutex).

Zephyr Work Queues

A Zephyr workqueue is a thread with extra features bolted on. It also requires less setup and management than threads.

The work queue has a built-in buffer to store pending work (functions you want to run). Each item you add will be executed in the order you submitted it, like a “to-do list” for your processor. This is a perfect place for something that you want to run once and return from, but don’t want to do it inside of an interrupt service routine.

The “work” you submit to a work queue is a function. So you’re telling the work queue “run this function, then run this other function, then run this other…” you get the idea. Just set it and forget it–the Zephyr scheduler will run take care of popping work out of the queue and running it until the queue is empty. Many of the sensor readings we do in our demos are handled through work queues.

The work queue expects a specific type of function usually referred to as a “work handler”. You don’t actually give the handler any information, you just tell your work queue you want to add your handler to the list of pending work.

/* Work handler function */
void button_action_work_handler(struct k_work *work) {

/* Register the work handler */
K_WORK_DEFINE(button_action_work, button_action_work_handler);

/* Add a work item to the system workqueue from a button interrupt function */
void button_pressed(const struct device *dev, struct gpio_callback *cb,
            uint32_t pins)

This code snippet demonstrates using a work queue when a button is pressed. You want to spend as little time as possible in the button interrupt function. All this handler does is queue up a button action function that will be run by the scheduler, sometime after the interrupt service routine ends.

The example above uses the system workqueue. You also have the option of declaring your own workqueues which will need its own stack (remember… it’s a thread with extra features). If your application is freezing up when using the system workqueue, check to see if you’ve run our of stack space and set CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE accordingly.

The challenge here is how to get data to the worker since we can’t pass any parameters. How do you know which button was pressed? One approach is to place your work item inside of a struct along with the data you want to pass (button number, etc.). Then in the handler you can use CONTAINER_OF() to access the data in the structure. This is beyond the scope of today’s post, so let’s talk about a bit simpler approach we often take: using a message queue.

Zephyr Message Queues

A Zephyr Message Queue is a collection of fixed-sized data that is safe to access from multiple threads. You decide what the data is (a variable, a struct, a pointer, etc.) and how many of those objects the queue can contain (limited by your available RAM). Zephyr handles all of the logic necessary for adding to and removing from the queue.


void button_pressed(const struct device *dev, struct gpio_callback *cb,
            uint32_t pins)
    /* add the current button states to the message queue */
    k_msgq_put(&button_action_msgq, &pins, K_NO_WAIT);

    /* This is a good place to queue a worker to process the buttons */

/* Work handler to process actions from button presses */
void button_action_work_handler(struct k_work *work) {
    /* Process everything in the message queue */
    while (k_msgq_num_used_get(&button_action_msgq)) {
        uint32_t button_states;
        k_msgq_get(&button_action_msgq, &button_states, K_NO_WAIT);

        /* Run some function based on which button was pressed */
        if (pins & BUTTON_0) {
        /* Give the schedule a chance to run other tasks */

Continuing with our button analogy, the example above uses a message queue to record the button states for processing after the interrupt service routine returns. I’ve highlighted the lines that are crucial to the message queue system. Notice that the work handler uses a while loop to check if the message queue is empty. This is a nice pattern for processing all of the queued message. I’ve used k_yield() between loops to give the scheduler a chance to run other tasks.

Message queues are a spectacular tool for IoT applications. Golioth has used Message Queues to cache GPS readings while the cell modem is turned off on our Orange Demo. Periodically, the modem will turn on, send all of the readings from the queue to the Golioth servers, then turn off the radio once again to save power.

Threads, Workers, and Queues

We often focus on the “ecosystem” aspect of Zephyr, because we like that so many silicon vendors contribute to the code base. But it’s important to remember that Zephyr is a Real Time Operating System and has a fully featured scheduler. We’ve only just scraped the surface of what you can do with it. Hopefully the above explanations helped you to begin thinking in these patterns, understanding how to divide up the application work, and how to pass data between different parts of your code.