You
"Run db:migrate on production and verify the schema"
OpenClaw BOT
"Migration complete. 14 tables updated, all constraints valid. ✅"
You
"I'm at the gym — what exercise should I do next for chest?"
OpenClaw BOT
"Bench, incline dumbbell press, cable flys. Want me to track your sets? 💪"
You
"Spin up staging and deploy branch feature/auth"
OpenClaw BOT
"Staging is live at staging-auth.fly.dev — branch deployed. 🚀"
You
"Merge PR #47, run the test suite, deploy to prod"
OpenClaw BOT
"Merged, 47/47 tests passing. Production deploy at v2.3.1."
You
"Remember that content idea from last night? Let's flesh it out"
OpenClaw BOT
"Pulled up your notes. The 'AI agents for dentists' thread — want to outline it?"
You
"Pull last 24h analytics and graph the conversion funnel"
OpenClaw BOT
"Funnel report ready. 12.4% conversion, up 2.1% from yesterday. 📊"

OpenVoice

Give your OpenClaw
a voice.

Talk to your OpenClaw from your phone, 24/7.
At the gym, on a walk, wherever — just open Discord and speak.

Get Access — $10
Secured with Stripe · Lifetime access
Launch price ends in 48:00:00 — then $20
Your OpenClaw, with a voice · Your existing API keys · Link to existing Discord · No subscription · Powered by ElevenLabs · Powered by OpenAI Whisper

Your agent. In voice. Right now.

Your OpenClaw sits in a Discord voice channel, ready to cook whenever you are. Say its name and it activates, so you can start working together.

OpenClaw agent in Discord Voice Channel

Simply add your OpenClaw to your voice channel(s)

OpenClaw responding to voice commands in Discord chat

Full transcriptions sent to voice chat channel
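
For the curious, the "say its name" activation could look something like the sketch below. This is not the code from the private repo, just one plausible approach: scan each transcript for the agent's name and treat whatever follows as the command. The function name `extract_command` and the optional "hey" prefix are assumptions made for illustration.

```python
import re
from typing import Optional


def extract_command(transcript: str, agent_name: str) -> Optional[str]:
    """Return the text spoken after the agent's name.

    Returns None when the name was not spoken at all, and an empty
    string when the name was spoken with nothing after it.
    """
    # Match the name as a whole word, case-insensitively, with an
    # optional leading "hey" and any punctuation that trails the name.
    pattern = re.compile(
        r"\b(?:hey[\s,]+)?" + re.escape(agent_name) + r"\b[\s,.:!?-]*",
        re.IGNORECASE,
    )
    match = pattern.search(transcript)
    if match is None:
        return None  # Not addressed to the agent, so ignore it.
    return transcript[match.end():].strip()
```

With the agent name set to "Midir", the transcript "Hey Midir, deploy the staging branch" yields the command "deploy the staging branch", while speech that never mentions the name is ignored.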

Think about it

Your entire OpenClaw.
In your ear.

Your OpenClaw already has access to your files, your GitHub, your APIs, your tools. It can deploy code, check emails, manage servers, write scripts — everything.

Now imagine you don't need to type any of that. You don't need your laptop open. You don't even need to look at a screen. You don't even need your phone's voice-to-text.

Just say "Hey Midir, deploy the staging branch" — and it's done. That's it. Jarvis in your ear, except it's YOUR agent with YOUR permissions.

"Hey, remember that content idea from last night? Let's flesh it out right now."

↑ On a walk. No screen. Just talking.

"Check if the PR passed CI and merge it if it's clean."

↑ Between sets at the gym.

"Read me the last 3 emails and draft a reply to the one from the client."

↑ Driving home. Hands on the wheel.

You in five minutes ↓

Demo

See it in action

Watch your OpenClaw respond to voice commands in real time.

Video coming soon

Setup

Three steps. Five minutes.

01

Pay

$10 during the 48-hour launch window, $20 after that. One-time payment, no subscriptions. Lifetime access to the private repo.

02

Clone

Check your email for the GitHub invite. Accept it, clone the private repo to your machine.

03

Run

Personalize your setup — add your API keys, choose your voice, pick your model. Your OpenClaw agent joins voice in minutes.
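
As a rough idea of what "add your API keys, choose your voice, pick your model" might boil down to, here is a minimal configuration sketch. The setting names (`DISCORD_BOT_TOKEN`, `ELEVENLABS_API_KEY`, `VOICE`, `MODEL`) and the `VoiceConfig` class are hypothetical. The README in the private repo defines the real ones.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical setting names; the actual repo may use different ones.
REQUIRED = ("DISCORD_BOT_TOKEN", "ELEVENLABS_API_KEY")


@dataclass(frozen=True)
class VoiceConfig:
    discord_bot_token: str
    elevenlabs_api_key: str
    voice: str
    model: str


def load_config(env: Mapping[str, str]) -> VoiceConfig:
    """Build a config from a mapping such as os.environ, failing early on gaps."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return VoiceConfig(
        discord_bot_token=env["DISCORD_BOT_TOKEN"],
        elevenlabs_api_key=env["ELEVENLABS_API_KEY"],
        # Optional choices fall back to placeholder defaults.
        voice=env.get("VOICE", "default"),
        model=env.get("MODEL", "default"),
    )
```

Calling `load_config(os.environ)` at startup would surface a missing key immediately instead of partway through a voice session.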

Make it yours

Optimize your setup

Here's how you can power this however YOU want.

💰

Use a cheaper model

Use MiniMax or Kimi K2.5 instead of your main model — $10/mo is all you need. Keep your main session context clean.

💡 Best for: Keeping costs low
🏠

Run Whisper locally

Run Whisper locally for free STT: no API costs and completely private. Speed depends on your hardware. The transcription step itself needs no internet.

💡 Best for: Privacy & zero ongoing costs

Upgrade ElevenLabs

Paid tier for faster TTS, more voices, higher quality. Worth it if you want the best voice experience.

💡 Best for: Premium voice quality
🔑

Main OpenClaw model

Use your existing OpenClaw API keys here too. This can clog your session context, so we suggest a cheap model if you want to keep things separate.

💡 Best for: Simplicity
🪶

Light on tokens

Runs as an isolated subagent — zero context used from your main session. Or swap in Gemini Flash for free casual voice chat (1M+ free tokens, ~1s responses). Requires additional setup.

💡 Subagent = full power, zero bloat. Gemini = free + fast.
⚙️

Fully customizable

Make it as smart or simple as you want. Spend $50/mo or $10/mo — your choice. Nothing new needed.

💡 Best for: Flexibility

FAQ

Common questions

How does it work? +
Your OpenClaw joins a Discord voice channel as a bot. When you speak, Whisper transcribes your voice to text. That text gets sent to your OpenClaw, which processes it like any other message. Then ElevenLabs converts the response back to speech and plays it in the voice channel. You talk, it talks back. Full conversation, hands-free.
Why should I pay $10 for this? +
You don't have to. You could probably vibecode this entire thing yourself. But will you? Ever? Or do you just want access NOW and not have to deal with all that? $10 is cheaper than the 3+ hours you'd spend building, debugging, and wiring it all together. Your call.
What's the cheapest way to run this? +
Use cheaper models like Kimi K2.5 or MiniMax ($10/mo), run Whisper locally (free), and use ElevenLabs free tier. Total cost: ~$10/mo. You can also use your main API key — voice commands don't burn many tokens.
How will this affect my OpenClaw? +
This IS your OpenClaw. It gives it a voice you can speak to. Whisper transcribes what you say, your OpenClaw processes it, and ElevenLabs turns its response into speech. Same agent, new interface.
Do I need to pay for ElevenLabs? +
Nope — the free tier works fine for personal use. You get 10k characters/month which is plenty for testing and light usage. Upgrade only if you want more voices or higher quality.
Can I run Whisper locally? +
Yes. Run it directly on your OpenClaw's hardware. Your voice never leaves your machine, and on capable hardware it can be faster than the API since there's no network round trip. CPU usage depends on the Whisper model size you pick. You can also use the OpenAI Whisper API if your hardware is older.
How hard is the setup? +
Super easy: plug in your API keys and go. Just make an ElevenLabs account, grab your Discord bot token, and run the script. The README has copy-paste commands. Most people are running in under 10 minutes.
What if I use my main API key? +
Totally fine — voice commands are short and won't break the bank. But it will fill your session with voice transcripts and use up context. If you're okay with that, keep everything under one API key or subscription — no need for another one. If you want it clean, use a cheap separate model.
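
To make the flow from "How does it work?" concrete, here is a minimal sketch of one voice turn: speech to text, text to agent, reply to speech. It is an illustration under assumptions, not the shipped implementation. The three stages are passed in as plain functions, so nothing here calls a real Whisper, OpenClaw, or ElevenLabs API, and the names `handle_utterance` and `VoiceTurn` are made up for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    transcript: str     # What the speech-to-text step heard.
    reply_text: str     # What the agent answered.
    reply_audio: bytes  # The answer rendered as speech.


def handle_utterance(
    audio: bytes,
    transcribe: Callable[[bytes], str],  # e.g. local Whisper or the Whisper API
    ask_agent: Callable[[str], str],     # forwards the text to your OpenClaw
    synthesize: Callable[[str], bytes],  # e.g. ElevenLabs text-to-speech
) -> VoiceTurn:
    """Run one speech -> text -> agent -> speech round trip."""
    transcript = transcribe(audio)
    reply_text = ask_agent(transcript)
    reply_audio = synthesize(reply_text)
    return VoiceTurn(transcript, reply_text, reply_audio)
```

Because each stage is just a function argument, swapping local Whisper for the API, or a cheap model for your main one, changes only what you pass in. The returned transcript and reply text are also what a bot would post to the text channel.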