Best small local LLMs and libraries for mobile apps?
Hey everyone,
I’m researching small local LLMs for mobile apps and trying to decide which model/runtime stack is worth testing first.
The use case is not general chat. I need basic local text processing: summarization, rewriting, extracting structured fields, and generating JSON- or Markdown-style output.
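To make that concrete, here’s a made-up example of the kind of extraction I mean (field names are purely illustrative):

```
Input:  "Lunch with Dana next Friday at 1pm to go over the Q2 budget."
Output: {"person": "Dana", "when": "next Friday 1pm", "topic": "Q2 budget"}
```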
I’m mostly interested in what is actually practical on iOS and Android.
Models I’m considering:
- Qwen small models (Qwen2.5 0.5B / 1.5B, Qwen3 0.6B)
- Gemma small models (e.g. Gemma 3 1B or Gemma 3n)
- Phi small models (e.g. Phi-3-mini)
- any other mobile-friendly model you would recommend
Libraries/runtimes I’m considering:
- llama.cpp / GGUF
- MLC LLM
- MediaPipe GenAI
- ExecuTorch
- ONNX Runtime
- llama.rn (I know this is a React Native binding, so probably not directly usable from Flutter)
- native wrapper exposed to Flutter via platform channels (rough sketch of what I mean after this list)
- any Flutter-friendly package if it is actually usable
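For the "native wrapper" option, this is roughly what I’m picturing on the Android side. A minimal sketch only: the MethodChannel plumbing is the standard Flutter API, but `LlamaBridge` is a hypothetical placeholder for whatever llama.cpp JNI wrapper I’d end up with, not a real library:

```kotlin
// MainActivity.kt — sketch only. "LlamaBridge" is a hypothetical JNI wrapper
// around llama.cpp; the Flutter plumbing is the standard MethodChannel API.
package com.example.app

import io.flutter.embedding.android.FlutterActivity
import io.flutter.embedding.engine.FlutterEngine
import io.flutter.plugin.common.MethodChannel

class MainActivity : FlutterActivity() {
    override fun configureFlutterEngine(flutterEngine: FlutterEngine) {
        super.configureFlutterEngine(flutterEngine)
        MethodChannel(flutterEngine.dartExecutor.binaryMessenger, "local_llm")
            .setMethodCallHandler { call, result ->
                when (call.method) {
                    "complete" -> {
                        val prompt = call.argument<String>("prompt") ?: ""
                        // Hypothetical native call; a real app would run this
                        // off the UI thread and stream tokens back instead.
                        result.success(LlamaBridge.complete(prompt))
                    }
                    else -> result.notImplemented()
                }
            }
    }
}
```

The Dart side would then just be a `MethodChannel('local_llm').invokeMethod('complete', ...)` call. What I can’t tell is whether one of the existing packages already does this well enough that writing my own bindings is a waste of time.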
My main questions:
- Which small model would you test first for mobile?
- Which runtime/library would you pair it with?
- Is GGUF + llama.cpp still the most practical default choice?
- Are Qwen 0.6B / 1.5B good enough for structured output on-device? (see my grammar-sampling note right after this list)
- Is Gemma or Phi better for this kind of use case?
- What quantization level gives the best balance of size, RAM, speed, and quality? (my back-of-envelope size math is below the constraints list)
- Are there libraries that work well from Flutter, or should I expect to write native bindings?
- What stack would you avoid based on real-world experience?
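One piece of context on the structured-output questions above: llama.cpp supports GBNF grammar-constrained sampling, which can force syntactically valid JSON even out of a tiny model. My plan was to sanity-check quality on desktop first with something like this (the model filename is just an example; the JSON grammar does ship in the repo):

```
./llama-cli -m qwen2.5-0.5b-instruct-q4_k_m.gguf \
  --grammar-file grammars/json.gbnf \
  -p "Extract person and topic as JSON: Lunch with Dana to go over the Q2 budget."
```

My understanding is that the grammar only guarantees the output parses, not that the extracted fields are correct, which is exactly why I’m asking whether the 0.5B–1.5B range is good enough in practice.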
Main constraints:
- iOS and Android
- Flutter app
- Offline/local inference preferred
- Structured output matters more than open-ended chat quality
- Reasonable app size
- Acceptable speed on mid-range devices
- Native integration is okay if needed
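For the quantization question above, my back-of-envelope sizing so far (rough numbers, please correct me):

```
file size ≈ params × bits-per-weight ÷ 8
0.6B @ ~4.8 bpw (Q4_K_M) → 0.6e9 × 4.8 ÷ 8 ≈ 360 MB
1.5B @ ~4.8 bpw (Q4_K_M) → 1.5e9 × 4.8 ÷ 8 ≈ 900 MB
```

Runtime RAM will be higher than the file size once the KV cache and runtime overhead are added, which is why I’m nervous about 1.5B models on mid-range Android phones.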
I’m mainly looking for practical recommendations: model + runtime/library combinations that are worth trying first, and any examples or repos that helped you.
Thanks!