I kept losing entire days jumping between Midjourney, Runway, Kling, and roughly eight browser tabs every time I mocked up a video ad. Same prompt copied everywhere. Half my time went to stitching, not creating.
So I tried something different.
A node-based canvas in Higgsfield AI where prompts, images, and video models live as connected nodes on one board. A prompt feeds an image. The image feeds a video. Everything visible at once.
Test brief: a 6-shot perfume ad. One protagonist. Different locations. Same character across all shots.
The setup:
- 2 image generation nodes (character anchor + location anchor)
- 6 video generation nodes for the actual shots
- Each video pulls from the image anchors as reference
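If it helps to see the wiring spelled out, here's a rough sketch of that setup as plain Python data structures. None of this is Higgsfield's actual API; the class names, prompts, and shot descriptions are placeholders I made up just to show how every video node pulls from both image anchors.

```python
from dataclasses import dataclass

# Hypothetical node types: purely illustrative, not any real tool's API.
@dataclass
class ImageNode:
    name: str
    prompt: str

@dataclass
class VideoNode:
    name: str
    prompt: str
    references: list[ImageNode]  # image anchors this shot pulls from

# Two image anchors: one locks the protagonist, one locks the look of the world.
character = ImageNode("character_anchor", "young woman, auburn hair, emerald slip dress")
location = ImageNode("location_anchor", "rain-slicked city street at dusk, warm neon")

# Six video shots, each wired to both anchors so the character stays consistent.
shot_briefs = [
    "she steps out of a cab",
    "close-up, she lifts the bottle",
    "she walks through a hotel lobby",
    "rooftop at dusk, wind in her hair",
    "mirror shot, she applies the perfume",
    "final hero shot, bottle on marble",
]
shots = [
    VideoNode(f"shot_{i + 1}", brief, references=[character, location])
    for i, brief in enumerate(shot_briefs)
]

for shot in shots:
    print(shot.name, "->", [ref.name for ref in shot.references])
```

The point of laying it out as a graph is that the anchors are defined once and referenced six times, instead of pasting the same character description into six separate prompts.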
The whole thing took about an hour.
Character consistency held across all 6 shots without retraining. Running 4 video models in parallel let me pick the best output in 10 minutes flat.
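The parallel part is nothing fancy: the same prompt fanned out to several backends at once, then you review the drafts instead of re-rendering in a loop. A minimal sketch, assuming a hypothetical async generate_clip() helper that wraps whatever video backend you're calling; the model names are just labels, not real endpoints.

```python
import asyncio

async def generate_clip(model: str, prompt: str) -> str:
    # Placeholder: a real version would call the model's API and return
    # a URL or file path to the rendered clip.
    await asyncio.sleep(0.1)
    return f"{model}: draft clip for '{prompt}'"

async def fan_out(prompt: str, models: list[str]) -> list[str]:
    # Fire the same prompt at every model concurrently and collect all drafts,
    # so picking the best output is a review step, not a re-render loop.
    return await asyncio.gather(*(generate_clip(m, prompt) for m in models))

drafts = asyncio.run(fan_out(
    "she steps out of a cab, camera starts mid-motion",
    ["model_a", "model_b", "model_c", "model_d"],
))
for draft in drafts:
    print(draft)
```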
The thing that almost broke it: video models take the reference image literally. The first frame of every clip is basically a clone of the input image. If your anchor shows the character standing under a streetlight, every shot opens with her standing under a streetlight, even when the prompt says "she's already walking."
You have to prompt around it ("camera starts mid-motion") or generate a separate anchor for that beat. It took me three generations to figure this out, and nobody documents it.
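The workaround is cheap once you know it exists. Here's a tiny illustrative helper (not from any real tool) that prepends a motion cue whenever the anchor shows a static pose, so the clip doesn't open on a frozen copy of the reference:

```python
def build_shot_prompt(action: str, static_anchor_pose: bool = True) -> str:
    # If the anchor is a static pose, tell the model to start mid-motion
    # so the first frame isn't a literal clone of the reference image.
    motion_cue = "camera starts mid-motion, subject already moving; " if static_anchor_pose else ""
    return motion_cue + action

print(build_shot_prompt("she's already walking down the street"))
# camera starts mid-motion, subject already moving; she's already walking down the street
```

For beats where the anchor's pose fights the shot entirely, swapping in a beat-specific anchor worked better than fighting it with prompt text.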
It's still early but the goal is simple: stop losing half my day to tab-switching.
Next up: testing how it holds up at 10+ shots.