Whisper Transcription

Local speech-to-text using OpenAI Whisper. No API key is needed, and audio stays on your server.

tested

The Story

Voice messages to AI agents should work without external API dependencies. Brian added Whisper after realizing that voice-first workflows break when the agent can't actually hear what you're saying. OpenClaw doesn't transcribe audio by default; Whisper fixes that.

What It Does

Transcribes incoming voice messages to text on your own server using the Whisper CLI, so the agent can read and respond to what you said.

The Problem

OpenClaw doesn't transcribe audio by default. When users send voice messages:

They appear as audio attachments
No transcript is generated
The agent can't read the content

This breaks voice-first workflows. If you want to just talk instead of type, tough luck.

The Solution

Enable OpenAI Whisper for audio transcription in OpenClaw. Whisper is an open-source speech-to-text model that runs locally: no external API calls, no per-minute costs, and no data leaving your server.

How It Works

Voice message is captured in the conversation
Whisper CLI processes the audio locally
Transcript is returned as text
Agent can read and respond to the content

No data leaves your server. No API costs. Just local transcription.
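The steps above can be sketched in Python as a thin wrapper around the Whisper CLI. This is only an illustration under assumptions: the source does not describe how the OpenClaw skill actually invokes Whisper, so the function names, the `transcripts` directory, and the `base` model choice are hypothetical. The sketch assumes the `whisper` executable from the open-source `openai-whisper` package is on your PATH and uses its `--model`, `--output_format`, and `--output_dir` options.

```python
import subprocess
from pathlib import Path


def build_whisper_command(audio_path, output_dir, model="base"):
    """Return the argv list for a local Whisper CLI run that writes a .txt transcript.

    Hypothetical helper for illustration; not part of OpenClaw.
    """
    return [
        "whisper", str(audio_path),
        "--model", model,
        "--output_format", "txt",
        "--output_dir", str(output_dir),
    ]


def transcribe(audio_path, output_dir="transcripts", model="base"):
    """Run Whisper locally and return the transcript text."""
    audio_path = Path(audio_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # check=True raises CalledProcessError if the CLI exits with an error.
    subprocess.run(build_whisper_command(audio_path, output_dir, model), check=True)

    # Assumption: recent openai-whisper releases name the output after the
    # input file's stem (note.ogg -> note.txt). Older releases may differ.
    transcript_file = output_dir / f"{audio_path.stem}.txt"
    return transcript_file.read_text(encoding="utf-8").strip()
```

Calling `transcribe("voice/note.ogg")` would return the transcript as a plain string that an agent could read like any typed message. Everything runs in a local subprocess, so the audio never leaves the machine.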

Setup

Whisper is installed as a skill in DEWER:

openclaw skills install chiptrack

Once installed, voice messages are transcribed automatically when sent to DEWER.
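If transcription does not start after installation, a quick pre-flight check can help narrow things down. The sketch below assumes the skill relies on a `whisper` executable on your PATH (the source only says the Whisper CLI processes audio locally), and it also checks for `ffmpeg`, which the open-source Whisper package uses to decode audio. The helper name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(required=("whisper", "ffmpeg")):
    """Return the names of required executables that are not found on PATH."""
    return [name for name in required if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("whisper and ffmpeg found on PATH")
```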

Benefits

No API costs: uses the local Whisper model
Privacy: audio stays on your server
Reliability: no external service dependencies
Speed: no network round trip to an external transcription service

Use Cases

Enables hands-free interaction with DEWER:

Mobile users who prefer voice
Long-form input (speaking is faster than typing)
Accessibility: helps users who find typing difficult
Capturing ideas while on the go

Ideas for Refinement

Add speaker identification for multi-person audio
Support for longer recordings
Punctuation and formatting improvements
Multiple language support

Last updated: 2026-04-20
