If you've ever tried starting a faceless channel, you already know the biggest pain point: doing it well is a massive grind.
Right now, creating just one solid video forces you to jump between a messy stack of tools:
- ChatGPT to draft the script.
- Midjourney to generate the images.
- ElevenLabs to generate the voiceover.
- Premiere to painstakingly stitch the audio, images, and captions together.
This whole process can take hours. I wanted to solve the tool-jumping problem, but I completely refused to build a slop machine.
So, I built a platform that combines the entire workflow into one dashboard, centered entirely around Human-in-the-Loop automation. It eliminates the tedious tab-switching so you can focus your time on what actually matters: writing a killer script and curating high-quality images.
Here is how the workflow is broken down:
- Input your source: Drop in a YouTube link or an article to set the foundation.
- Refine the script (Human-in-the-Loop): The AI drafts the initial script, but you get a chat interface to review, revise, and tweak it so it sounds authentic, not robotic.
- Customize your brand: Generic AI art is played out. You can upload a custom avatar to build a real brand identity.
- Generate and curate media: The tool automatically generates your background images and syncs them to the ElevenLabs TTS voiceover. If an image comes out wrong or off-topic, just hit regenerate until it meets your quality standards.
- Export and publish: The platform stitches the audio, visuals, and captions together. Just download the final video and upload it to your platforms.
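
The generate-and-curate step above can be pictured as a simple approval loop. This is only a minimal Python sketch, not the platform's actual code: `Scene`, `split_script`, `curate_images`, and the stub generator and reviewer are all hypothetical names made up for illustration, and the retry cap is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    text: str          # one line of the approved script
    image: str = ""    # identifier of the currently selected image
    attempts: int = 0  # how many images were generated for this scene


def split_script(script: str) -> List[Scene]:
    """Naively split a script into one scene per sentence."""
    return [Scene(text=s.strip()) for s in script.split(".") if s.strip()]


def curate_images(
    scenes: List[Scene],
    generate: Callable[[str, int], str],
    approve: Callable[[Scene], bool],
    max_attempts: int = 3,
) -> List[Scene]:
    """Regenerate each scene's image until the reviewer approves it
    or the (assumed) retry cap is reached."""
    for scene in scenes:
        while scene.attempts < max_attempts:
            scene.attempts += 1
            scene.image = generate(scene.text, scene.attempts)
            if approve(scene):
                break
    return scenes


# Stand-ins for the real image model and the human reviewer.
def fake_generate(prompt: str, attempt: int) -> str:
    return f"image[{prompt}]#v{attempt}"


def reviewer(scene: Scene) -> bool:
    # Pretend the human rejects the first image of any scene about a cat.
    return not ("cat" in scene.text and scene.attempts == 1)


scenes = curate_images(
    split_script("A cat sits. The sun rises."), fake_generate, reviewer
)
for s in scenes:
    print(s.attempts, s.image)
```

The point of the sketch is that generation is automated while approval stays with a person: nothing moves on to export until each scene has an image the reviewer accepted, or the retry cap is hit.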
By combining the tool stack, you get the consistency of an automated pipeline with the quality control of manual editing.
Try it out here: https://shortsgen-web.vercel.app/