u/AdministrativeFlow68

▲ 4 r/tts

Hey r/tts,

I've been working on a local-first audio production tool built on top of IndexTTS2.

It includes a Script Canvas for structuring scenes + emotion detection, a Voice Studio for reusable characters, and a timeline for mixing takes with SFX/ambience.

It's still in early beta and runs in Docker on an NVIDIA GPU. Curious whether anyone here is interested in this kind of workflow tool.

Repo: https://github.com/JaySpiffy/IndexTTS-Workflow-Studio

(Old prototype code is on the legacy-v2 branch)

u/AdministrativeFlow68 — 8 days ago

I’ve been working on my local TTS workflow tool and just released a major update. The repo you may have seen (IndexTTS-Workflow-Studio) now hosts Draft to Take Beta, a local-first AI audio production studio.

What’s new / key features

  • Script Canvas for writing + emotion detection + speaker assignment
  • Built-in timeline for reviewing takes and exporting mixes
  • Voice Studio for reusable voices (OmniVoice)
  • Powered by IndexTTS2 + Qwen sidecar + optional SFX/Music
  • Easy Docker launcher (start.bat on Windows + NVIDIA)

Quick start

  1. Start Docker Desktop, then download the repo as a ZIP
  2. Extract it and run start.bat
  3. Open localhost:3000
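
If the page at localhost:3000 doesn't load right away, the containers may still be starting. Here's a minimal sketch (not part of the repo) that polls the port until something accepts a connection. The helper name `wait_for_port` and the timeout values are my own assumptions; only the port number comes from the steps above.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused) if nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("localhost", 3000)` after running start.bat should return True once the UI is reachable. It only checks that the port is open, not that the app has finished loading its models.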

Full details + requirements here: https://github.com/JaySpiffy/IndexTTS-Workflow-Studio

Old prototype code is preserved on the legacy-v2 branch.

Call to action
Looking for early testers with NVIDIA GPUs (12GB+ VRAM preferred). Feedback on workflow, bug reports, and feature requests are all very welcome!
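
If you're not sure how much VRAM your card has, here's a rough sketch for checking it with `nvidia-smi`, which ships with the NVIDIA driver. The function names and the 12000 MiB default are my own assumptions: `nvidia-smi` reports MiB, and cards sold as 12 GB can report slightly under 12288, so the threshold is deliberately loose.

```python
import subprocess


def parse_vram_mib(output: str) -> list[int]:
    """Parse one integer (MiB) per non-empty line, one line per GPU."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def meets_vram(output: str, min_mib: int = 12000) -> bool:
    """True if at least one GPU reports min_mib or more total memory."""
    return any(v >= min_mib for v in parse_vram_mib(output))


def query_nvidia_smi() -> str:
    """Ask nvidia-smi for total memory per GPU as bare numbers in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a machine with the driver installed, `meets_vram(query_nvidia_smi())` tells you whether any GPU clears the threshold. The parser assumes plain numeric output and will raise if `nvidia-smi` prints something else.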

u/AdministrativeFlow68 — 9 days ago