gemini's image understanding is so far ahead of everything else I've used and I don't see enough people talking about it
I keep seeing posts comparing gemini to chatgpt on text and coding, and I feel like everyone is sleeping on the thing gemini actually does better than anything else: understanding images and visual content.
I do video and design work, and about a month ago I started using gemini for something specific: I'll take a screenshot of a video frame, a design comp, or a visual reference I found online and ask gemini to analyze the composition, color palette, lighting, and mood, and tell me how to recreate or riff on that visual style.
and it's genuinely incredible at this. I showed it a frame from a wes anderson film and asked it to break down exactly what makes it feel like a wes anderson shot, and it identified the specific color relationships, the symmetry, the depth of field choices, and the prop placement in a way that was actually useful to me as someone trying to achieve a similar feel. chatgpt gave me generic film school stuff when I tried the same thing, and claude just described what was in the image without any of the compositional analysis.
where this has become really practical for me is in my actual production workflow: I'll generate visual concepts in midjourney, run style references through magic hour and runway to test different looks in motion, and then when I need to understand why a certain reference image or video frame works, I bring it to gemini because it can articulate the visual principles in a way I can actually apply to my own work.
it's become this weird thing where gemini isn't the tool I use to create anything but it's the tool that makes me better at using every other tool because it helps me see what I'm looking at more precisely.
the other thing it does that I haven't been able to replicate anywhere else is comparing two images and telling me specifically what's different about them compositionally, not just content-wise. I'll show it two versions of the same shot with different color grading, and it'll tell me exactly how the warm tones in version A create intimacy while the cooler tones in version B create distance, and why that's happening technically.
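if anyone wants to run that kind of two-image comparison outside the chat window, something like this should work with the google-genai python sdk. this is just a rough sketch based on the docs, not my actual setup: the model name, file paths, and prompt wording are all placeholders you'd swap for your own.

```python
# rough sketch, assuming the google-genai sdk (pip install google-genai)
# model name, file paths, and prompt text are placeholders, not recommendations
from pathlib import Path

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # or set GOOGLE_API_KEY in your env


def image_part(path: str) -> types.Part:
    # wrap a local image file as an inline part the model can read
    return types.Part.from_bytes(data=Path(path).read_bytes(), mime_type="image/jpeg")


prompt = (
    "these are two grades of the same shot. compare them compositionally, "
    "not just content-wise: color relationships, contrast, where the eye goes, "
    "and what mood each grade creates and why, technically."
)

response = client.models.generate_content(
    model="gemini-2.0-flash",  # placeholder, use whatever current model you prefer
    contents=[prompt, image_part("grade_a.jpg"), image_part("grade_b.jpg")],
)
print(response.text)
```

the same idea covers the single-image breakdowns too, just pass one image part instead of two and change the prompt.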
has anyone else found gemini's visual analysis to be way ahead of the other models, or am I just not prompting chatgpt and claude correctly for this kind of thing?