
Hey everyone,
I’m a high school student working on a native Android project to get better at the ecosystem. I’ve been building a Text-to-Speech reader and I’ve run into a few architectural questions I’d love some "pro" eyes on.
I’m using the native TextToSpeech engine and ML Kit for OCR, but I’m curious about the "correct" way to handle a few things I’ve implemented:
- Floating windows: I’m using a `Service` to manage a floating overlay. Is this still the standard for Android 13/14, or is there a more lifecycle-aware way to keep an overlay responsive without killing the background task?
- Document parsing: I’ve implemented `.pdf` and `.docx` parsing locally. I’m curious if my approach of extracting the text before passing it to the TTS queue is efficient, or if I should be streaming it to avoid memory spikes on larger files.
- The TTS queue: Right now I’m just using the standard `QUEUE_ADD` logic, but I’m wondering if I should wrap the engine in a custom manager to handle interruptions better.
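For context on the last point, here's the kind of wrapper I'm imagining (plain Java, engine abstracted away so it's just the queue logic; `TtsQueueManager`, `interruptWith`, and the `Consumer<String>` standing in for `TextToSpeech.speak(..., QUEUE_FLUSH, ...)` are all hypothetical names, not what's in the repo yet):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Sketch of a manager that owns its own queue, so an urgent utterance can
// preempt playback without losing the backlog (something QUEUE_ADD alone
// can't do: QUEUE_FLUSH discards everything queued in the engine).
class TtsQueueManager {
    private final Deque<String> pending = new ArrayDeque<>();
    private final Consumer<String> engineSpeak; // stand-in for tts.speak(text, QUEUE_FLUSH, ...)
    private String current;

    TtsQueueManager(Consumer<String> engineSpeak) {
        this.engineSpeak = engineSpeak;
    }

    /** Behaves like QUEUE_ADD: speaks immediately only if the engine is idle. */
    synchronized void enqueue(String utterance) {
        if (current == null) {
            current = utterance;
            engineSpeak.accept(utterance);
        } else {
            pending.addLast(utterance);
        }
    }

    /** Interrupts like QUEUE_FLUSH, but re-queues the preempted utterance so it resumes after. */
    synchronized void interruptWith(String urgent) {
        if (current != null) pending.addFirst(current);
        current = urgent;
        engineSpeak.accept(urgent);
    }

    /** Call this from UtteranceProgressListener.onDone (or equivalent). */
    synchronized void onUtteranceDone() {
        current = pending.pollFirst();
        if (current != null) engineSpeak.accept(current);
    }
}
```

So after `enqueue("A"); enqueue("B"); interruptWith("X")`, the engine would speak X, then A again, then B. Is this a reasonable direction, or is there a standard pattern for this?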
I’m really trying to move past just making it work and build it right. If anyone has a few minutes to look at my MainActivity or my Service logic and roast my architecture, I’d appreciate it.
Repo: https://github.com/Vishwesh-AIENG/Text-to-Speech-Reader-Android-App
I’m not looking for users or testers, just hoping for advice (and maybe a star) from people who actually do this for a living. Thanks!