u/ivan_digital

Built an open-source speech SDK that runs entirely on-device. Zero Google dependencies.

What it replaces:

- Google Speech-to-Text API → Parakeet TDT v3 (114 languages, ~150ms)

- Google Text-to-Speech → Kokoro 82M (natural English voice)

- Google Voice → Full pipeline with barge-in support

What's different:

- No network requests after the one-time model download. Works in airplane mode

- No GMS/Play Services required

- Models stored locally (~1.2 GB, auto-downloaded from Hugging Face on first run)

- Voice activity detection + noise cancellation built in

- Apache 2.0 — fully open source, no telemetry

Works on any arm64 Android device. No root needed.

u/ivan_digital — 4 hours ago

▲ 23 r/androiddev+1 crossposts

We've been building an on-device speech SDK for Android and embedded Linux. Everything runs locally — no data leaves the device.

What it does:

- Speech recognition — Parakeet TDT v3, 114 languages, ~150ms latency

- Text-to-speech — Kokoro 82M, natural English voice

- Voice activity detection — Silero VAD v5

- Noise cancellation — DeepFilterNet3

- Full pipeline: listen → transcribe → speak → listen (barge-in supported)
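
To make the barge-in loop above concrete, here is a minimal sketch of how such a pipeline could be modeled as a state machine. This is not the SDK's actual code: the `State`, `Event`, `Transition`, and `step` names are hypothetical and chosen only for illustration. The assumption is that the VAD emits speech start/end events and that a speech start during playback should cut the TTS output and return to listening.

```cpp
// Hypothetical pipeline states (illustrative, not taken from the SDK).
enum class State { Listening, Transcribing, Speaking };

// Hypothetical events: VAD speech boundaries, ASR completion, TTS completion.
enum class Event { SpeechStart, SpeechEnd, TranscriptReady, PlaybackDone };

struct Transition {
    State next;           // state to move to
    bool cancelPlayback;  // true when TTS output should be cut off (barge-in)
};

// Pure transition function: maps (current state, event) to the next state.
// Events that do not apply in the current state leave it unchanged.
Transition step(State s, Event e) {
    switch (s) {
        case State::Listening:
            // The user finished an utterance: hand the audio to the recognizer.
            if (e == Event::SpeechEnd) return {State::Transcribing, false};
            break;
        case State::Transcribing:
            // Text is ready: start speaking the response.
            if (e == Event::TranscriptReady) return {State::Speaking, false};
            break;
        case State::Speaking:
            // Barge-in: the user talks over the TTS, so stop playback and listen.
            if (e == Event::SpeechStart) return {State::Listening, true};
            // Playback finished normally: go back to listening.
            if (e == Event::PlaybackDone) return {State::Listening, false};
            break;
    }
    return {s, false};
}
```

Keeping the transition logic as a pure function makes it easy to unit test on a desktop without any audio hardware; the audio callbacks would only need to translate VAD and playback signals into `Event` values.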

How it works:

- ONNX Runtime inference (CPU / NNAPI on Snapdragon, Exynos, Tensor)

- C++17 core, thin Kotlin wrapper

- Models auto-download from Hugging Face (~1.2 GB total)

- Apache 2.0

There's also a C API for embedded Linux, aimed at automotive (Qualcomm SA8295P / Yocto).

Would love feedback, especially on real device performance.
