Using the Hugging Face Inference API for Device Audio Analysis

Golioth Pipelines works with Hugging Face, as shown in our recent AI launch. This post highlights how to use an audio classification model hosted on Hugging Face to analyze audio recorded on a microcontroller-based device, sent over a secure network connection to Golioth, and routed through Pipelines.

While most commonly known as the place where models and data sets are uploaded and shared, Hugging Face also provides a compute service in the form of its free serverless inference API and production-ready dedicated inference endpoints. Unlike platforms that offer only proprietary models, Hugging Face provides access to over 150,000 open source models via its inference APIs. Private models can also be hosted on Hugging Face, a common use case for Golioth users who have trained models on data collected from their device fleets.

Audio Analysis with Pipelines

Because the Hugging Face inference APIs use HTTP, they are easy to target with the webhook transformer. The structure of the request body will depend on the model being invoked, but for models that operate on media files, such as audio or video, the payload is typically raw binary data.
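
To illustrate the request that the webhook transformer issues on the device's behalf, the following minimal Python sketch posts a local audio clip to the same serverless endpoint. The sample file name and environment variable are assumptions; note that the Hugging Face API expects a Bearer token, so the value stored in the secret referenced below would presumably include the Bearer prefix.

import os

import requests

# Assumed local sample and token; any short audio clip works with this model.
AUDIO_FILE = "sample.wav"
MODEL_URL = "https://api-inference.huggingface.co/models/superb/hubert-large-superb-er"

with open(AUDIO_FILE, "rb") as f:
    audio_bytes = f.read()

# The request body is the raw audio bytes; no JSON wrapping or multipart encoding.
response = requests.post(
    MODEL_URL,
    headers={"Authorization": f"Bearer {os.environ['HUGGING_FACE_TOKEN']}"},
    data=audio_bytes,
)
response.raise_for_status()
print(response.json())  # e.g. [{"score": 0.63, "label": "neu"}, ...]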

In the following pipeline, we target the serverless inference API with an audio sample streamed from a device. In this scenario, we want to perform sentiment analysis on the audio, then pass the results on to Golioth's time-series database, LightDB Stream, so that changes in sentiment can be observed over time. An alternative destination, or multiple destinations, could easily be added.

Click here to use this pipeline in your project on Golioth.

filter:
  path: "/audio"
steps:
  - name: emotion-recognition
    transformer:
      type: webhook
      version: v1
      parameters:
        url: https://api-inference.huggingface.co/models/superb/hubert-large-superb-er
        headers:
          Authorization: $HUGGING_FACE_TOKEN
  - name: embed
    transformer:
      type: embed-in-json
      version: v1
      parameters:
        key: text
  - name: send-lightdb-stream
    destination:
      type: lightdb-stream
      version: v1

Note that while Hugging Face’s serverless inference API is free to use, it is rate-limited and subject to high latency and intermittent failures due to cold starts. For production use cases, dedicated inference endpoints are recommended.
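
When experimenting against the serverless tier directly, a small retry loop helps smooth over cold starts. The sketch below is a rough illustration, assuming the API's typical behavior of returning HTTP 503 while a model loads; the retry count and wait time are arbitrary.

import os
import time

import requests

MODEL_URL = "https://api-inference.huggingface.co/models/superb/hubert-large-superb-er"
HEADERS = {"Authorization": f"Bearer {os.environ['HUGGING_FACE_TOKEN']}"}


def classify_audio(audio_bytes: bytes, retries: int = 5, wait_s: float = 10.0):
    """POST raw audio and retry while the serverless model is cold (HTTP 503)."""
    for _ in range(retries):
        response = requests.post(MODEL_URL, headers=HEADERS, data=audio_bytes)
        if response.status_code == 503:
            # Model is still loading on the serverless tier; back off and retry.
            time.sleep(wait_s)
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError("Model did not become available in time")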

We can pick any supported model on Hugging Face for our audio analysis task. As shown in the URL, the Hubert-Large for Emotion Recognition model is targeted, and audio content streamed to the /audio path is forwarded directly to Hugging Face. An example of how to upload audio to Golioth using an ESP32 can be found here.

Results from the emotion recognition inference look as follows.

[
  {
    "score": 0.6310836672782898,
    "label": "neu"
  },
  {
    "score": 0.2573806643486023,
    "label": "sad"
  },
  {
    "score": 0.09393830597400665,
    "label": "hap"
  },
  {
    "score": 0.017597444355487823,
    "label": "ang"
  }
]
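
Once these records land in LightDB Stream, a downstream consumer can reduce the score list to a single dominant emotion. The following is a minimal sketch that assumes the results have already been fetched and parsed; the mapping from abbreviated labels to full names follows the model's neutral/sad/happy/angry classes.

# Illustrative inference result in the format shown above.
results = [
    {"score": 0.6310836672782898, "label": "neu"},
    {"score": 0.2573806643486023, "label": "sad"},
    {"score": 0.09393830597400665, "label": "hap"},
    {"score": 0.017597444355487823, "label": "ang"},
]

LABELS = {"neu": "neutral", "sad": "sad", "hap": "happy", "ang": "angry"}

# Pick the highest-scoring label as the dominant emotion for this sample.
top = max(results, key=lambda r: r["score"])
print(f"Dominant emotion: {LABELS[top['label']]} ({top['score']:.2%})")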

Expanding Capabilities

Countless models are uploaded to Hugging Face on a daily basis, and the inference API integration with Golioth Pipelines makes it simple to incorporate the latest new functionality into any connected device product. Let us know what models you are using on the Golioth Forum!

