
Open-source on-device speech SDK — STT (114 languages), TTS, VAD, noise cancellation. No cloud APIs
We've been building an on-device speech SDK for Android and embedded Linux. Everything runs locally — no data leaves the device.
What it does:
- Speech recognition — Parakeet TDT v3, 114 languages, ~150ms latency
- Text-to-speech — Kokoro 82M, natural English voice
- Voice activity detection — Silero VAD v5
- Noise cancellation — DeepFilterNet3
- Full pipeline: listen → transcribe → speak → listen (barge-in supported)
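For anyone curious how the listen → transcribe → speak loop with barge-in might hang together, here is a minimal sketch of it as a state machine. It is not taken from the SDK: the state names, event names, and the `step` / `run` helpers are all hypothetical, and it assumes the VAD is what signals speech start and end.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()     # waiting for / capturing user speech
    TRANSCRIBING = auto()  # speech-to-text is running
    SPEAKING = auto()      # text-to-speech audio is playing


class Event(Enum):
    SPEECH_START = auto()      # assumed: VAD reports the user started talking
    SPEECH_END = auto()        # assumed: VAD reports the user stopped talking
    TRANSCRIPT_READY = auto()  # assumed: recognizer produced text
    TTS_DONE = auto()          # assumed: playback finished


# Hypothetical transition table; any (state, event) pair not listed is ignored.
_TRANSITIONS = {
    (State.LISTENING, Event.SPEECH_END): State.TRANSCRIBING,
    (State.TRANSCRIBING, Event.TRANSCRIPT_READY): State.SPEAKING,
    (State.SPEAKING, Event.TTS_DONE): State.LISTENING,
    # Barge-in: the user talks over playback, so drop back to listening.
    (State.SPEAKING, Event.SPEECH_START): State.LISTENING,
}


def step(state, event):
    """Return the next state, or the same state if the event does not apply."""
    return _TRANSITIONS.get((state, event), state)


def run(events, state=State.LISTENING):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

In a real pipeline the barge-in transition would also need to stop audio playback; the sketch only tracks which stage is active.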
How it works:
- ONNX Runtime inference (CPU / NNAPI on Snapdragon, Exynos, Tensor)
- C++17 core, thin Kotlin wrapper
- Models download automatically from Hugging Face (~1.2 GB total)
- Licensed under Apache 2.0
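The CPU / NNAPI choice is essentially an execution-provider preference list in ONNX Runtime. Below is a rough sketch of that selection logic, not the SDK's actual C++ code: `pick_providers` and `PREFERRED` are hypothetical names, while the provider strings are the ones ONNX Runtime uses.

```python
# Hypothetical preference order: try NNAPI first, fall back to CPU.
PREFERRED = ("NnapiExecutionProvider", "CPUExecutionProvider")


def pick_providers(available, preferred=PREFERRED):
    """Keep the preferred providers that are available, in preference order.

    CPU is always appended as a last resort so the list is never empty.
    """
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen
```

With the `onnxruntime` Python package installed, the result could be passed as `onnxruntime.InferenceSession(model_path, providers=pick_providers(onnxruntime.get_available_providers()))`, where `model_path` is a placeholder for a local model file.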
There is also a C API for embedded Linux, aimed at automotive targets (Qualcomm SA8295P / Yocto).
GitHub: https://github.com/soniqo/speech-android
Would love feedback, especially on real-device performance.






