u/ritzynitz

First, thanks to everyone who tried OpenVox after last month's post and reported bugs. Got a bunch of DMs and comments, and fixed several things I'd completely missed. Genuinely helpful, keep it coming.

Now for what's new in 1.4.

OmniVoice

The model lineup now includes OmniVoice, a next-gen model for ultra-realistic, expressive, context-aware speech with voice cloning. The part that surprised me most: it supports 600+ languages. Not just the obvious ones. Hindi, Arabic, Japanese, French, German, Spanish, Portuguese, Korean, Turkish, Ukrainian, Hebrew, Swahili, Tamil, Polish, Dutch, Greek, Swedish, Indonesian, Czech, Bengali... well beyond the usual English/Spanish/French tier that most TTS tools bother with.

If you've ever wanted to generate audio in a language that every cloud tool treats as an afterthought, this might be worth trying.

Current model lineup:

OmniVoice → 600+ languages, expressive, voice cloning
Qwen3 → highest quality English, cloning
Kokoro → fast, great for long-form
Chatterbox → expressive, character-style voices

Also new in 1.4: EPUB support alongside TXT & PDF, so you can turn your ebooks into audio too. All local, no upload anywhere.

Still the same pricing:

Free tier: 5,000 chars/day, 10 Voice Designs, 3 Voice Clones

Pro: $19.99 one-time (no subscription)

App Store: https://apps.apple.com/in/app/openvox-local-voice-ai/id6758789314?mt=12
More Information: https://openvoxai.com/

Happy to answer questions, and if you run into anything broken, drop it here or DM me.

I've been building OpenVox, a local TTS app for Mac that lets you switch between multiple SOTA models depending on what you need. Just launched v1.4 with a new model called OmniVoice and wanted to get feedback from people who actually know TTS.

Model lineup:

OmniVoice (new) → 600+ languages, expressive, context-aware, voice cloning
Qwen3 → best quality for English, great for cloning
Kokoro → fast, handles long-form well
Chatterbox → more expressive, good for character voices

The multi-model approach has been the most useful thing for me personally. No single model wins everything, so being able to switch per use case without juggling different tools or APIs is nice.

OmniVoice language coverage

This is the part I think this sub will appreciate. Most local TTS solutions are effectively English-first with a few extras. OmniVoice covers Hindi, Arabic, Japanese, French, German, Spanish, Portuguese, Korean, Turkish, Ukrainian, Hebrew, Swahili, Tamil, Polish, Dutch, Greek, Swedish, Indonesian, Czech, Bengali and a lot more, 600+ total. Expressive and context-aware across all of them, not just English.

Other features

Voice cloning, voice design (text description to voice), PDF and EPUB to audio, voice conversion on existing files. Everything runs locally on Apple Silicon, no API calls, no usage limits other than the free tier's caps.

Pricing

Free tier: 5,000 chars/day, 10 Voice Designs, 3 Voice Clones

Pro: $19.99 one-time, no subscription

App Store: https://apps.apple.com/in/app/openvox-local-voice-ai/id6758789314?mt=12
More Information: https://openvoxai.com/

Curious what this community thinks about the model choices and whether there are gaps you'd want to see filled.

OpenVox v1.4 just dropped - added a model that speaks 600+ languages locally on Mac

Local TTS on Mac just got a lot more interesting: 600+ languages, voice cloning, no cloud