Turn Off Subsystems Remotely To Reduce Power Usage with Golioth and Joulescope
As we previously wrote about, we attended the first Embedded World North America, held in Austin, Texas on October 8-10, 2024. We spent part of our time there showcasing Golioth’s Cloud + low power capabilities at the Joulescope booth. In this post and video, we’ll explain how to trigger different low power modes and how we measured the resulting power draw.
Goals for the demo
When we were invited to showcase alongside Joulescope, I knew I wanted to be able to turn off elements of the PCB that I believed consumed power. This meant two things:
- Reviewing how Golioth can trigger actions from the cloud
- Calling the appropriate APIs from Zephyr
The first part I have done many times before. I targeted using Remote Procedure Calls (RPCs) and Settings to push information to the device. However, I could have also used LightDB State. Each has its place in different IoT setups, but as I’ll explain later, Settings seemed to work best for low power contexts. I wrote a Guide on how to add a new setting to a Golioth project a couple of days ago.
All of the code mentioned here is targeted at our Aludel Elixir board. The basis for the code is our Reference Design Template, which also targets the nRF9160-DK, but that is off-the-shelf hardware that already has the kinks figured out (thanks Nordic!), so there was less interesting stuff I wanted to do. I will try to target that hardware in a future post as well.
Calling APIs to trigger lower power behavior
As you can see in the Golioth Joulescope demo repo on GitHub, the APIs we call are contained in app_rpc.c and app_settings.c, since that’s where we triggered these actions. I created two new RPCs to turn on rails and see what happened. The most extreme was the 3V3 rail, which controls the sensor + WiFi section of the board using a “downstream” power switch. You can see in the video that I trigger this and the power draw goes up nearly 5x. I was also able to use the same logic for the 5V rail. Both of these utilized the “Regulator” subsystem in Zephyr. I needed to modify the device tree using an overlay file to make sure these two rails were not turned on by default (an article for another day, basically the opposite behavior of this article). The API calls I used to trigger the behavior are in app_rpc.c in the demo repo.
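As a rough sketch of what switching a rail through Zephyr’s Regulator API can look like (this is not the code from the demo repo), the helper below wraps regulator_enable(), regulator_disable(), and regulator_is_enabled() from zephyr/drivers/regulator.h. The rail_set() name is hypothetical, and the block under #else is only a host-side stand-in so the sketch compiles off-target.

```c
#include <stdbool.h>

#ifdef __ZEPHYR__
#include <zephyr/device.h>
#include <zephyr/drivers/regulator.h>
#else
/* Host-side stand-ins so this sketch builds off-target.
 * They only mimic reference counting; they are NOT the Zephyr driver. */
struct device { int enable_count; };

static int regulator_enable(const struct device *dev)
{
    ((struct device *)dev)->enable_count++;
    return 0;
}

static int regulator_disable(const struct device *dev)
{
    struct device *d = (struct device *)dev;

    if (d->enable_count == 0) {
        return -1; /* nothing to disable */
    }
    d->enable_count--;
    return 0;
}

static bool regulator_is_enabled(const struct device *dev)
{
    return dev->enable_count > 0;
}
#endif

/* Hypothetical helper: put one rail into the requested state.
 * Returns 0 on success or a negative error code from the driver. */
int rail_set(const struct device *rail, bool on)
{
    if (on == regulator_is_enabled(rail)) {
        return 0; /* already in the requested state, keep the refcount balanced */
    }
    return on ? regulator_enable(rail) : regulator_disable(rail);
}
```

On target, the rail pointer would typically come from something like DEVICE_DT_GET(DT_NODELABEL(...)) with whatever node label the overlay assigns to the 3V3 or 5V switch. An RPC callback would then call rail_set() with the requested state. Checking the current state first keeps enable and disable calls balanced, since the Regulator API counts references.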
I also wanted to trigger less frequent communication with the cloud, simulating a “sleepy device”. For this, I leaned on eDRX, or Extended Discontinuous Reception. I had been calling the API using an RPC, but it wasn’t well suited to low power modes. For instance, if I have eDRX mode enabled, I won’t be checking in with the tower all that often; if I call the RPC to disable eDRX, there’s a chance the RPC will time out in our 10 second window. Interestingly, we figured out that the first call will get cached at the tower (after a 10 second timeout) and the second will often return as a failure. Either way, it’s not a reliable way to send a command to a device and confirm that it was received, let alone when it was received.
Instead, I moved the eDRX control to the Golioth Settings Subsystem. This worked great because the change in setting is transmitted to the device, and the device sends a response back to the server once it has been received. This is purpose built for asynchronous operation in low power states. The Console shows the device as “Out of sync” until the server receives the GOLIOTH_SETTINGS_SUCCESS enum back from the device, after which it shows “Synchronized”. Now whenever the device in eDRX mode checks back in with the tower, it will see that an update is available and will do the synchronization.
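A minimal sketch of a Settings callback that toggles eDRX might look like the following. It assumes the Golioth Firmware SDK’s bool settings callback shape and the nRF Connect SDK’s lte_lc_edrx_req(); the on_edrx_setting() name is hypothetical, the #else block is a host-side stand-in so the sketch compiles off-target, and the demo repo’s app_settings.c remains the reference for what was actually run.

```c
#include <stdbool.h>

#ifdef __ZEPHYR__
#include <golioth/settings.h>
#include <modem/lte_lc.h>
#else
/* Host-side stand-ins so this sketch builds off-target; NOT the real SDKs. */
enum golioth_settings_status {
    GOLIOTH_SETTINGS_SUCCESS = 0,
    GOLIOTH_SETTINGS_GENERAL_ERROR = 6,
};

static bool host_edrx_requested; /* records the last request for inspection */

static int lte_lc_edrx_req(bool enable)
{
    host_edrx_requested = enable;
    return 0;
}
#endif

/* Hypothetical callback: runs when the device receives a new value for the
 * eDRX setting. The returned status is what gets reported to the server. */
enum golioth_settings_status on_edrx_setting(bool new_value, void *arg)
{
    (void)arg; /* unused in this sketch */

    if (lte_lc_edrx_req(new_value) != 0) {
        return GOLIOTH_SETTINGS_GENERAL_ERROR; /* modem rejected the request */
    }
    return GOLIOTH_SETTINGS_SUCCESS;
}
```

On target, a callback like this would be registered with something along the lines of golioth_settings_register_bool(settings, "EDRX_ENABLED", on_edrx_setting, NULL), where the key name is made up for this example. The return value is what matters here: the device only reports success after the modem has accepted the request, whenever it next wakes up and syncs.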
Other modes we enabled
In addition to triggering different APIs remotely, I also set up the board to be in a lower power state to start with. This included turning off a good portion of the peripherals that would draw power. I followed Marko’s article for optimizing power on nRF9160 boards.
I had moved the shell output from UART0 to UART1 on the Elixir board (utilizing the MikroBus headers and plugging in a USB to Serial converter) because I was not using the USB to serial chip. I didn’t want to have the 5V from the USB interfering with the measurement, nor did I want to have that chip (CP2102) siphoning power from the rest of the board unknowingly. Even after all the work I did to move over to UART1, it was a pretty large power hog (<500 uA – 1 mA) because it’s always listening for commands. Instead, I relied on the Golioth Logging Subsystem, so I could see what was happening on the cloud. Since we were sending on an infrequent basis, adding in a couple of logs didn’t add much to the overall power when the device woke up from “sleepy mode”. If they did, I could always turn down the logging level using our standard set_log_level RPC in the Reference Design Template. See the demo video above for more detail.
How the measurement works
The Joulescope is a great tool for measuring low current applications. It has a 1 nA resolution, but is accurate to around 30 nA out of the box (way more accuracy than I need, sad to say); it also has super fast range switching, so spikes during RF transmissions are captured as well. You can see different modes activated in the demo that we shot with Matt Liberty at Embedded World North America.
In this setup, we have an off-the-shelf power supply that is simulating the Lithium-Ion battery we normally have plugged in. Then we use the Joulescope to measure the voltage and current as we flow current through the 2 mm JST battery connector on the Elixir board. We’re able to capture peaks and sleep currents of the design in different modes. See the video above for live views.
Power monitoring as a troubleshooting tool
One interesting behavior is the NAT timeout that Dan mentioned last week. I had been playing around with different Kconfig settings, trying longer and longer keep alives and timeouts. With Dan’s help, we narrowed in on the 2 minute mark as a timeout, as explained in “Configuring the Golioth Firmware SDK for Sleepy Devices”. Seeing the device use outsized amounts of current despite us having Connection ID enabled helped us narrow in on different parts of the system. In that case it was the NAT for our MVNO, and the fact that Connection ID wasn’t configured correctly on my project. Once we pushed a change for the Connection ID code, we were able to push the sleepiness of the device even further out. Connection ID also obviates the need to worry about NAT timeout for any particular MVNO, which means the Golioth SDK is helping to standardize offerings from different carriers.
Future Improvements
The cool thing about this kind of activity/demo is that it allows us to isolate and measure the power impact of each action. That means we can assign a “cost” to things like:
- Doing an OTA update (including pushing out various artifacts using Cohorts)
- Sending single log messages
- Sending single Stream messages to push to different services using Pipelines
- Connecting to the tower and the value/cost of Connection ID
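One way to turn a captured event into a “cost” is to convert it to charge and fold it into an average current. The sketch below assumes you have read an event’s average current and duration off a capture; the struct and function names are hypothetical, and it models a device that repeats one event per period and sleeps the rest of the time.

```c
/* One captured event, e.g. a single log message or a tower reconnect. */
struct power_event {
    double avg_current_ua; /* average current during the event, in uA */
    double duration_s;     /* how long the event lasted, in seconds */
};

/* Charge used by the event in microcoulombs (uA * s = uC). */
double event_charge_uc(struct power_event ev)
{
    return ev.avg_current_ua * ev.duration_s;
}

/* Average current in uA when the event repeats every period_s seconds and
 * the device draws sleep_ua otherwise. Assumes period_s > ev.duration_s. */
double duty_cycled_current_ua(double sleep_ua, struct power_event ev,
                              double period_s)
{
    double sleep_s = period_s - ev.duration_s;

    return (sleep_ua * sleep_s + event_charge_uc(ev)) / period_s;
}

/* Idealized battery life in days for a capacity in mAh, ignoring
 * self-discharge, temperature, and regulator losses. */
double battery_life_days(double capacity_mah, double avg_current_ua)
{
    double hours = capacity_mah * 1000.0 / avg_current_ua; /* mAh -> uAh */

    return hours / 24.0;
}
```

With made-up numbers (not measurements from this demo), a 2 second event at 50 mA costs 100,000 uC. Repeating it every 10 minutes on top of a 10 uA sleep floor averages out to roughly 177 uA, which makes it easy to see how much each extra message moves the battery life estimate.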
As I mentioned above, I’d also love to target some of this behavior directly at the nRF9160-DK and other development boards. It can be useful to be able to remotely trigger lower power modes, but it’s often tied directly to the capabilities of boards.
We really enjoyed optimizing for lower and lower power and will be writing more about this topic in the future. If you have suggestions, please let us know on the forum. If you want help getting your IoT device to lower and lower current levels, please get in touch with Golioth Solutions.