u/ivan_digital

Offline speech recognition + TTS for Android — no Google APIs, no cloud, no data collection

Built an open-source speech SDK that runs entirely on-device. Zero Google dependencies.

What it replaces:

- Google Speech-to-Text API → Parakeet TDT v3 (114 languages, ~150ms)

- Google Text-to-Speech → Kokoro 82M (natural English voice)

- Google Voice → Full pipeline with barge-in support

What's different:

- No network requests after the one-time model download. Works in airplane mode

- No GMS/Play Services required

- Models stored locally (~1.2 GB, auto-downloaded from Hugging Face on first run)

- Voice activity detection + noise cancellation built in

- Apache 2.0 — fully open source, no telemetry
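For anyone wondering what the built-in voice activity detection step involves, here is a minimal sketch of one common approach: turning per-frame speech probabilities into speech segments with two thresholds and a short silence hangover. It assumes the VAD model emits one probability per audio frame. `VadConfig`, `segment_speech`, and the threshold values are hypothetical names and numbers chosen for illustration; they are not taken from the SDK.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tuning values for illustration; not the SDK's or the model's defaults.
struct VadConfig {
    float start_threshold = 0.5f;        // probability needed to open a segment
    float end_threshold = 0.35f;         // below this, a frame counts as silence
    std::size_t min_silence_frames = 3;  // consecutive silent frames that close a segment
};

// Returns inclusive [first, last] frame indices for each detected speech segment.
inline std::vector<std::pair<std::size_t, std::size_t>>
segment_speech(const std::vector<float>& probs, const VadConfig& cfg = VadConfig{}) {
    std::vector<std::pair<std::size_t, std::size_t>> segments;
    bool in_speech = false;
    std::size_t start = 0, last_voiced = 0, silence = 0;

    for (std::size_t i = 0; i < probs.size(); ++i) {
        if (!in_speech) {
            // Open a segment only on a confident frame.
            if (probs[i] >= cfg.start_threshold) {
                in_speech = true;
                start = i;
                last_voiced = i;
                silence = 0;
            }
        } else if (probs[i] < cfg.end_threshold) {
            // Count silent frames; close the segment once the hangover is used up.
            if (++silence >= cfg.min_silence_frames) {
                segments.emplace_back(start, last_voiced);
                in_speech = false;
            }
        } else {
            // Still voiced: extend the segment and reset the silence counter.
            last_voiced = i;
            silence = 0;
        }
    }
    // Flush a segment that is still open when the audio ends.
    if (in_speech) segments.emplace_back(start, last_voiced);
    return segments;
}
```

The lower end threshold and the silence counter keep a brief dip in probability (a pause between words) from splitting one utterance into two.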

Works on any arm64 Android device. No root needed.

GitHub: https://github.com/soniqo/speech-android

u/ivan_digital — 4 hours ago
Open-source on-device speech SDK — STT (114 languages), TTS, VAD, noise cancellation. No cloud APIs
r/androiddev

We've been building an on-device speech SDK for Android and embedded Linux. Everything runs locally — no data leaves the device.

What it does:

- Speech recognition — Parakeet TDT v3, 114 languages, ~150ms latency

- Text-to-speech — Kokoro 82M, natural English voice

- Voice activity detection — Silero VAD v5

- Noise cancellation — DeepFilterNet3

- Full pipeline: listen → transcribe → speak → listen (barge-in supported)
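The listen → transcribe → speak → listen loop with barge-in can be pictured as a small state machine. The sketch below is only an illustration of that idea, not the SDK's actual code: the `State`, `Event`, and `Transition` types and the `step` function are hypothetical names. The key transition is the one out of `Speaking`: if the VAD reports new user speech during playback, the pipeline returns to listening and flags the TTS output for cancellation.

```cpp
// Hypothetical states and events; illustrative names, not the SDK's API.
enum class State { Listening, Transcribing, Speaking };
enum class Event { SpeechStarted, SpeechEnded, TranscriptReady, PlaybackFinished };

struct Transition {
    State next;
    bool cancel_tts;  // true when ongoing playback should be cut off (barge-in)
};

// Pure transition function: current state + event -> next state.
inline Transition step(State s, Event e) {
    switch (s) {
        case State::Listening:
            // End of user speech hands the captured audio to the recognizer.
            if (e == Event::SpeechEnded) return {State::Transcribing, false};
            break;
        case State::Transcribing:
            // Recognition finished; the reply is synthesized and played.
            if (e == Event::TranscriptReady) return {State::Speaking, false};
            break;
        case State::Speaking:
            // Barge-in: the user starts talking over the TTS output.
            if (e == Event::SpeechStarted) return {State::Listening, true};
            // Normal completion: go back to listening without cancelling anything.
            if (e == Event::PlaybackFinished) return {State::Listening, false};
            break;
    }
    return {s, false};  // Events that do not apply leave the state unchanged.
}
```

Keeping the transition logic as a pure function makes it easy to unit-test separately from the audio threads that would feed it events.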

How it works:

- ONNX Runtime inference (CPU / NNAPI on Snapdragon, Exynos, Tensor)

- C++17 core, thin Kotlin wrapper

- Models auto-download from Hugging Face on first run (~1.2 GB total)

- Apache 2.0

Also has an embedded Linux C API for automotive (Qualcomm SA8295P / Yocto).

GitHub: https://github.com/soniqo/speech-android

Would love feedback, especially on real device performance.