u/Bulky-Resolution6265

our agency had been running Meta for over a year. ROAS was a stable 2.3x.

we had the kind of relationship where you can't really fire them but also can't brag about them at any dinner party that doesn't involve other people who also pay agencies.

every monthly call sounded identical. "we're testing new angles." "the algorithm needs more time." "let's allocate more to creative."

after month 11 I started suspecting their creative process was three guys in a Slack channel typing "what if we made the headline shorter."

https://preview.redd.it/xe77mq1n7pzg1.png?width=926&format=png&auto=webp&s=d150de909379f31ce3d0d86ede4ad43c328f152b

https://preview.redd.it/obr0ku1n7pzg1.png?width=926&format=png&auto=webp&s=a17f932ade6d0a7ee92b00beb61efb6a980db678

Meanwhile, our reviews tab on Shopify had been quietly collecting dust since 2022. 612 reviews sitting there. Mostly 5-star. Some 3-star.

A few absolute roasts I'd been pretending didn't exist. I'd never opened the dashboard for anything other than embedding the rating widget on PDPs.

Then I had the dumb-but-correct thought that probably saved us $8k/month: my customers have already written better ad copy than my agency. I just need to find it.

The setup (took me 22 minutes)

Exported all 612 reviews from Judge.me as a CSV. You can do the same on Loox, Stamped, or Yotpo. Cleaned it in Excel. Added columns for star rating, product variant, review length in words, and date posted. The length column matters more than people think. I'll come back to that.
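If you'd rather skip Excel, the same cleanup is a few lines of pandas. A rough sketch; the file name and Judge.me column headers here are assumptions, so check them against your actual export:

```python
# minimal cleanup sketch, assuming a Judge.me-style CSV export.
# column names ("Rating", "Body", "Created At") are guesses - rename to match yours.
import pandas as pd

df = pd.read_csv("judgeme_export.csv")
df = df.rename(columns={"Rating": "star_rating", "Body": "review_text"})
df["length_words"] = df["review_text"].fillna("").str.split().str.len()
df["date_posted"] = pd.to_datetime(df["Created At"], errors="coerce")
df.to_csv("reviews_clean.csv", index=False)
```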

https://preview.redd.it/e8yib5ikapzg1.png?width=850&format=png&auto=webp&s=acb9ee05cc32348b8299b6ba0ef16d0b2d55b23a

Then I started feeding them to Claude in batches of ~80. This is where most people screw it up. If you prompt with "find patterns in these reviews," you get summary-level slop.

"Customers liked the product." "Many mentioned good quality." Cool, useless, thanks.

The prompt has to force Claude to extract specific language in five categories (rough script below the list):

  1. Pain language - what was their life like before they bought? Use the customer's exact words, not paraphrased.
  2. Transformation language - what changed after using it? Again, exact phrases. Grammar mistakes preserved.
  3. Objection language - anything that sounded like "I almost didn't buy because..." or "I was worried that..." This is the goldmine.
  4. Unexpected use cases - anyone using the product for something we never marketed it for?
  5. Comparative language - mentions of a competitor, a previous solution, or "I've tried everything else."
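Here's roughly what that looks like wired up with the Anthropic Python SDK. The model id and column names are placeholders, and the prompt is a condensed paraphrase of the five categories, not my exact production prompt:

```python
# sketch of the batch-and-extract loop. model id and CSV column names
# are placeholders - swap in your own.
import anthropic
import pandas as pd

PROMPT = """Extract SPECIFIC customer language from the reviews below, in five categories:
1. Pain language (life before buying), 2. Transformation language (what changed),
3. Objection language ("I almost didn't buy because..."), 4. Unexpected use cases,
5. Comparative language (competitors, previous solutions).
Quote exact words, preserve grammar mistakes, and tag every quote with its row number.
Do NOT summarize. If a category is empty in this batch, say so.

Reviews:
{reviews}"""

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
df = pd.read_csv("reviews_clean.csv")

for start in range(0, len(df), 80):  # batches of ~80
    batch = df.iloc[start:start + 80]
    reviews = "\n".join(f"[row {i}] ({r.star_rating}-star) {r.review_text}"
                        for i, r in batch.iterrows())
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model id
        max_tokens=2000,
        messages=[{"role": "user", "content": PROMPT.format(reviews=reviews)}],
    )
    print(msg.content[0].text)
```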

The non-obvious bit nobody tells you: the gold isn't in 5-star reviews. Five-star reviews mostly say "love it!!" and tell you nothing usable.

The gold is in long 4-star reviews where someone explains their entire journey, and 3-star reviews where they're slightly disappointed and accidentally tell you exactly which promise didn't land.

That's why the length column matters. Sort by word count descending, work through the top 20% first. That's 80% of your insights.
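Same thing in pandas, continuing the cleanup sketch above:

```python
# longest reviews first; the top 20% by word count carries most of the signal
df = df.sort_values("length_words", ascending=False)
top_cut = df.head(int(len(df) * 0.20))
```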

https://preview.redd.it/8kduac6scpzg1.png?width=883&format=png&auto=webp&s=edf6b4e01932f57b988d420ab5d54f2302e3741e

Three things this surfaced that actually changed our ad account

Finding 1: An objection literally nobody on the agency side had addressed.

47 reviews mentioned some variation of "I almost didn't buy because I thought it would be too [specific concern]."

Our existing ads were structured to confirm the concern and then overcome it ("yes it looks heavy, but actually...").

We rewrote the headlines to flip the objection into a benefit instead. New variant ran at a 31% lower CPA than the agency's best ad. [SCREENSHOT: side-by-side ad creative + performance numbers]

Finding 2: A use case we had never sold against.

19 reviews mentioned using the product during a specific seasonal moment. We had been marketing it as an everyday item for two years. Built one ad set around the seasonal angle.

Within three weeks it became our highest ROAS creative in the account. The agency had access to the same reviews. They never opened them.

Finding 3: The phrase that became our hook.

The phrase "I didn't think it would actually [X]" appeared in 31 reviews. Almost word-for-word. We dropped it as the opening line of a UGC-style ad.

Best-performing creative we have ever run. People in the comments started replying "this is exactly what I thought too."

https://preview.redd.it/6dqyvwx7epzg1.png?width=446&format=png&auto=webp&s=c5b5360bad37b6c30c47fe99be9f384363c8ea84

Things to watch out for if you actually try this

Don't feed Claude reviews from multiple SKUs in one batch. The patterns blur into mush. Run each product separately even if it triples your time.

Claude will sometimes "clean up" customer language to sound more polished. Add to your prompt: "preserve exact phrasing including grammatical errors, slang, and unusual word choices." The authenticity lives in the awkward sentences.

Cross-check anything that surprises you. If Claude tells you "47 reviews mentioned X," ask it to quote five of them with row numbers.

Catches hallucination in 10 seconds. Skip this step and you'll write an ad around a phrase that doesn't exist.

100 reviews is the realistic minimum for patterns to emerge. Below that you're looking at individual feedback, not patterns. Above 1,000 you start getting diminishing returns and should batch by date instead.

The whole workflow takes ~30 minutes once you have the prompt chain dialed in. I run it monthly now on the new reviews from that month.

It's basically a free creative strategist who has read every customer testimonial you've ever received and remembers them all.

We didn't renew the agency contract last month. Not because they were bad. Because a $20/month Claude sub plus a CSV export was doing their job better than they were.

If you want the full review-mining system I built...

(the extraction prompts, the tagging template, the 30-min workflow doc, and sample outputs from 4 different shopify brands so you can see what good looks like before you start), let me know in the comments and I'll share the link with you.

u/Bulky-Resolution6265 — 7 days ago

so last year i was running creative for a skincare brand doing about $400k/month and our hook win rate was sitting at a depressing 3 percent.

we were doing the textbook ai ugc playbook. heygen avatar, gpt script, "have you ever wondered why your skin..." cue the entire planet scrolling past.

i was about to fire myself when i noticed something weird in the comments under our competitors' ads. people were typing entire essays about why they hated the product.

genuinely angry, very specific, very long. and i thought wait. THAT is the language i want in my scripts. not the sanitized 5 star marketing speak. the angry, specific, "the pump broke on day 6" energy.

so we scraped everything. 14,000 reviews across our top 4 competitors. trustpilot, amazon, reddit threads, the comment sections of their meta ads. went home for the weekend. came back monday with a hook win rate of 22 percent.

here is what nobody tells you about this play.

1 stars are not where the gold is

i know the title says 1 star. i lied for the hook. sorry. pure 1 stars are mostly shipping complaints, broken packaging, and people who clearly bought the wrong product. useful for logistics, useless for creative.

the actual goldmine is 2 and 3 star reviews. these are people who WANTED to love the product, gave it a real shot, and walked away conflicted. that mixed feeling is the most valuable creative asset on the internet because it mirrors the exact internal state of your target customer the moment before they buy.

read the line "i really wanted this to work but..." and tell me that is not a hook.

scrape competitors, not yourself

your own 1 stars tell you what to fix. your competitors' 1 stars tell you what to sell.

every unhappy customer of your biggest competitor is sitting in your TAM right now, googling alternatives. their complaint is your positioning. if everyone hates that competitor X has a chalky aftertaste, your hook writes itself: "i tried 4 greens powders and almost gave up because of the chalky aftertaste until..."

we pulled 1 to 3 star reviews from the top 3 competitors in our category and built a spreadsheet of every recurring complaint. 70 percent overlap across brands. those overlapping complaints are category-level objections. solving them in your hook is basically free conversion.
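if you want to quantify the overlap instead of eyeballing the spreadsheet, a rough sketch. assumes you've hand-tagged each complaint with a brand and a theme; the file and column names are made up:

```python
# find complaint themes shared across every competitor = category-level objections.
# assumes a hand-tagged csv with columns: brand, theme, quote
import pandas as pd

df = pd.read_csv("competitor_complaints.csv")
themes_by_brand = df.groupby("brand")["theme"].apply(set)
category_objections = set.intersection(*themes_by_brand)
print(sorted(category_objections))
```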

the source hierarchy nobody respects

reddit > trustpilot > amazon > your own shopify reviews.

reddit has the most natural language because nobody is performing for a brand. trustpilot users are angry and specific (anger produces detail). amazon is gamed to hell but the long reviews are usually real. your own shopify reviews are sanitized because customers know the brand is reading them.

search reddit specifically with: site:reddit.com "[product category]" "waste of money", or "almost returned", or "wish i had known". those phrases pull up the exact emotional posture you want your avatar to open with.

stop pasting reviews into chatgpt

this is where 90 percent of people screw it up. they dump 50 reviews into a prompt and say "write me a ugc script." the model averages everything and hands you back a beige paragraph of "many customers report..." which is exactly the ai-feeling slop you were trying to escape.

what actually works:

step 1. cluster complaints into 5 to 7 themes. (texture, smell, expectation mismatch, packaging, results timeline, price, customer service.) do this manually. takes 40 minutes.

step 2. for each theme, extract 10 to 15 verbatim phrases of 3 to 7 words. "smelled like wet cardboard." "stopped working after week 2." "i thought i was getting scammed." build a phrase bank doc (sketch after step 3).

step 3. prompt the model with ONE theme at a time, and require it to use 3 phrases from the phrase bank verbatim. the model can paraphrase the structure but the phrases stay untouched.
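putting steps 2 and 3 together, roughly. the phrases below are the example quotes from this post; the theme buckets are illustrative, not a real tagging:

```python
# phrase bank as a plain dict (step 2), one-theme prompt (step 3).
# phrases are the examples quoted in this post; bucket assignments are illustrative.
phrase_bank = {
    "smell/texture": ["smelled like wet cardboard", "clumped on day 4"],
    "packaging": ["cap doesn't close right", "the pump only works upside down",
                  "the pump broke on day 6"],
    "results timeline": ["stopped working after week 2"],
}

theme = "packaging"
prompt = f"""Write a 30-second UGC-style script around ONE complaint theme: {theme}.
You MUST use these phrases verbatim, untouched: {phrase_bank[theme][:3]}.
Paraphrase the structure all you want. Never touch the phrases."""
```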

this single change took our scripts from "ai but trying" to "wait was that a real person."

the 4 beat skeleton that converts

every script we ship follows the same structure (fill-in template after beat 4).

beat 1, the doubt: open with the complaint as if it is your own. "i almost returned this on day 3 because it smelled like wet cardboard." yes use the verbatim phrase. yes it is technically a competitor's complaint. nobody is fact checking your meta ad.

beat 2, the trigger: why you stuck with it anyway. usually a specific person or moment. "but my sister kept saying give it 2 weeks."

beat 3, the turn: the moment something shifted. specific day, specific observation. "day 11 i caught my reflection and..."

beat 4, the new normal: understated. never "life changing." always specific. "i stopped buying the $80 one."
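stitched together, the skeleton as a fill-in template. a sketch only; the beat lines are the example lines from this post, and you'd swap each one for a verbatim phrase-bank line:

```python
# the 4-beat skeleton as a template. each value should come from your phrase bank.
beats = {
    "doubt":      "i almost returned this on day 3 because it smelled like wet cardboard.",
    "trigger":    "but my sister kept saying give it 2 weeks.",
    "turn":       "day 11 i caught my reflection and...",
    "new_normal": "i stopped buying the $80 one.",
}
script = " ".join(beats.values())
```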

why this annihilates 5 star review mining

5 star reviews use post-conviction language. "amazing!" "obsessed!" "life changing!" useless. these are words people use AFTER they have been sold. your prospect is not sold yet. they are skeptical, tired, and have been burned before.

complaints use sensory, granular, specific language. "clumped on day 4." "cap doesn't close right." "the pump only works upside down." specificity reads as truth. truth converts.

an ai avatar saying "i love this product" gets clocked as fake in 0.8 seconds. an ai avatar saying "the cap doesn't close right and i don't even care anymore because..." gets watched to completion because viewers are now genuinely confused about whether this is even an ad.

two warnings

do not name competitors. rephrase the complaint structurally, not attributively. "the other one i tried" beats "[brand name]" every time, and saves you a cease and desist your lawyer will charge you $4k to ignore.

do not use this play for supplements making health claims. meta will eat your account for breakfast no matter how authentic the script sounds.

u/Bulky-Resolution6265 — 10 days ago

some context: shopify store, low 7 figures, mostly DTC, 3 years old.
we thought we knew our customer. we had personas in notion.

we had "the gym girl," "the busy mom," "the gift buyer." we'd been targeting them for 2 years. decent returns.

two sundays ago I was in the back end pulling an export for our accountant and noticed something dumb: every order has order tags (the little labels that auto-fire from shopify rules and apps), customer notes (what people type in the "anything we should know?" field), and shipping instructions (where they tell the driver where to leave it).

I'd never looked at any of these as a dataset. We just used them for ops.

so out of curiosity I exported 3 years of orders with those fields included. ~28,400 rows.

threw it in claude. asked it: "are there clusters in here we're not seeing?"

here's the whole thing that came back - the export, the mistakes, what worked, and what we changed.

https://preview.redd.it/y3n89ebezwxg1.png?width=1358&format=png&auto=webp&s=f9c4086612555876763074c5aed28898eb1897ef

the 11 fields that moved the needle

this is the thing nobody tells you.

when you go to Shopify → Orders → Export, the default export gives you order ID, date, customer email, line items, total, shipping address, payment status - about 6 fields of basically nothing.

that's what every "shopify analytics" tutorial uses.

The fields where the actual signal lives are buried:

  1. order tags (auto-fires from rules: high-AOV, returning customer, gift, subscription, etc.)
  2. customer tags (manual + automated tagging built up over years)
  3. note from customer (the freeform field at checkout - this is a goldmine)
  4. note attributes (hidden custom fields, often from apps)
  5. shipping instructions (where they tell drivers, full of context)
  6. discount code used (which influencer/campaign/segment they came in through)
  7. landing site (the URL they first hit - SEO post vs. ad vs. direct)
  8. referring site (where they came from)
  9. source name (web vs. POS vs. draft order vs. subscription)
  10. cancel reason (the dropdown they picked when they bailed)
  11. refund line item reasons (why specifically they returned)

default Shopify export hides about half of these.

you have to either build a custom export or use the Bulk Export feature with custom fields. it took me 20 minutes the first time to figure out which checkboxes mattered. Worth every second.
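once you have the export, trimming it down to the signal fields is one pandas pass. a sketch; the header names below loosely mirror shopify's CSV headers but vary by export method, so match them against your actual file:

```python
# keep only the high-signal fields from a shopify order export.
# header names are approximations - check them against your actual CSV.
import pandas as pd

KEEP = ["Tags", "Customer Tags", "Notes", "Note Attributes",
        "Shipping Instructions", "Discount Code", "Landing Site",
        "Referring Site", "Source", "Cancelled Reason", "Refund Reason"]

orders = pd.read_csv("orders_export.csv")
signal = orders[[c for c in KEEP if c in orders.columns]]
signal.to_csv("orders_signal.csv", index=False)
```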

https://preview.redd.it/hvyj6gxk0xxg1.png?width=1400&format=png&auto=webp&s=4d8dbcd25602f203fcbb65f8b5683a7d99965793

mistake you should avoid

I tried to do it in one prompt

first attempt: I dumped all 28,400 rows into Claude and asked "find me hidden customer segments."

it tried. but the output was generic ("here are 4 segments: high-spenders, casual buyers, gift-givers, returners"). useless.

why it failed: with too much data and too vague a prompt, claude defaults to the most common, least insightful frame. you have to constrain it.

here's the 4-prompt chain (end-to-end sketch after prompt 4)

each prompt does one thing. each builds on the last. order matters.

prompt 1 - field-by-field anomaly detection.
i told claude to ignore segmentation entirely and just go field by field, telling me which fields had the most variance, the most unexpected values, and the most unstructured information.
this pre-step is what most people skip. It tells you which fields to actually segment on before you try to segment.

the output flagged customer notes and shipping instructions as the two fields with the most signal - way more than tags or discount codes. we'd have spent 4 hours segmenting on tags and gotten nowhere.

prompt 2 - cluster the unstructured fields.
i asked Claude to read the customer notes (only the customer notes - nothing else) and tell me what the top 15 reasons for buying actually were, in the customers' own words. not categories I gave it. categories it found.

this is where it got weird. around 2,800 of our orders had customer notes that fell into a pattern none of our personas covered. i'll show you what it looked like.

https://preview.redd.it/qiraovhx1xxg1.png?width=1280&format=png&auto=webp&s=17b87eee72c576b6cf3886c91c00eb594915e8e5

prompt 3 - cross-reference the cluster with everything else.
once we had the cluster, i asked Claude to pull the order tags, AOV, repeat rate, discount usage, landing pages, and shipping addresses only for orders inside that cluster, and tell me how this group differed from everyone else.

the differences were absurd:

  • 2.4x higher AOV than our average customer
  • 71% repeat rate inside 12 months (our overall is 34%)
  • almost zero discount code usage (they were full-price buyers)
  • disproportionately landing on one specific blog post we'd written 2 years ago and forgotten about
  • shipping addresses skewed heavily toward different ZIP codes than our main customer base

this was an entire customer segment, hiding in plain sight, that our ads had never spoken to once.

prompt 4 - translate the cluster into operational moves.
last prompt: "given this segment, what would you change about ad creative, email flows, the home page, the product page, and the post-purchase flow?"
claude gave me 14 specific moves. we did 9 of them. (the other 5 felt like overreach - claude will sometimes suggest stuff that's smart but not worth the engineering lift, and you have to filter.)
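for reference, the chain as a sequential loop, with the running conversation preserved so each prompt builds on the last. the prompt wordings are condensed from the descriptions above, and the model id and file name are placeholders:

```python
# the 4-prompt chain, sketched. each turn is appended to the history so
# claude keeps the full context. prompts are condensed paraphrases.
import anthropic

CHAIN = [
    # 1. field-by-field anomaly detection
    "Ignore segmentation. Go field by field: which fields have the most variance, "
    "the most unexpected values, and the most unstructured information?",
    # 2. cluster the unstructured fields
    "Read ONLY the customer notes. What are the top 15 reasons for buying, in the "
    "customers' own words? Categories you find, not ones I give you.",
    # 3. cross-reference the biggest cluster
    "For orders in the biggest cluster, pull order tags, AOV, repeat rate, discount "
    "usage, landing pages, and shipping ZIPs. How does this group differ from everyone else?",
    # 4. translate into operational moves
    "Given this segment, what would you change about ad creative, email flows, the "
    "home page, the product page, and the post-purchase flow?",
]

client = anthropic.Anthropic()
data = open("orders_anonymized.csv").read()  # anonymize first - see below
history = []

for i, prompt in enumerate(CHAIN):
    content = (data + "\n\n" + prompt) if i == 0 else prompt  # data rides with prompt 1
    history.append({"role": "user", "content": content})
    msg = client.messages.create(model="claude-sonnet-4-20250514",  # placeholder id
                                 max_tokens=4000, messages=history)
    history.append({"role": "assistant", "content": msg.content[0].text})
    print(f"--- prompt {i + 1} ---\n{msg.content[0].text}\n")
```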

what we changed

i don't want to overshare on the segment specifics (it's a competitive industry), but the high-level moves:

  • wrote 3 new ad angles speaking to this buyer. different motivation entirely. people buying for someone else have a fundamentally different decision process than people buying for themselves, and we'd been writing every ad as if they were the same person.
  • built a "gift mode" toggle on the PDP with different copy, different reviews surfaced, different shipping promise.
  • added a post-purchase flow specifically for this segment that included a "how to introduce this to someone who's resistant to new things" guide. this one was Claude's idea. it's now our highest-engagement post-purchase email by a mile.
  • reweighted our blog content calendar. that one forgotten post we'd written 2 years ago? turns out it was doing 6-figures of attributed revenue annually with zero promotion. we wrote 3 more in the same vein. they're already our top-3 organic landing pages.

six weeks in, segment-attributed revenue is up about 31% on a base that was already sizable.

the stuff people will ask in comments, answered upfront

doesn't this require a ton of data? No. We had 28,400 orders. I've since done the same exercise on a friend's smaller brand with ~4,000 orders.

worked just as well. the pattern recognition doesn't need scale, it needs unstructured field coverage - meaning customers actually have to be writing notes.
if your store doesn't surface a "note from customer" field at checkout, turn it on right now. you're leaving free signal on the table.

will claude leak my customer data?
not a lawyer, but: I anonymized the export first - stripped names, emails, phone numbers, and full street addresses (kept ZIP). took 5 minutes with Find/Replace. Do this. Don't be a dummy.
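if Find/Replace doesn't cut it (emails and phone numbers buried inside freeform notes won't match a fixed string), a regex pass does. a sketch; the column names are assumptions:

```python
# anonymization pass: drop identity columns, regex-scrub the freeform fields.
# column names are assumptions - match them to your export.
import re
import pandas as pd

df = pd.read_csv("orders_signal.csv")
df = df.drop(columns=["Name", "Email", "Phone", "Shipping Street"], errors="ignore")

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

for col in ("Notes", "Shipping Instructions"):
    if col in df.columns:
        df[col] = (df[col].fillna("")
                   .str.replace(EMAIL, "[email]", regex=True)
                   .str.replace(PHONE, "[phone]", regex=True))

df.to_csv("orders_anonymized.csv", index=False)  # ZIPs stay, identities go
```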

why not just use a CDP?
because every CDP I've used segments on the fields you've told it to care about. the whole point of this exercise was finding the segment we didn't know to ask about. CDPs are great at scaling segments you already know exist. They're useless at discovering new ones.

why Claude over GPT?
long context. I needed to paste a lot of data and reason across all of it at once.
GPT chokes earlier on this kind of file. I tested both. claude held the full set; GPT made me chunk it and lost the cross-field signal between chunks. YMMV but for this specific job, Claude wins.

where do I find the customer note field on Shopify?
settings → checkout → "add a notes field at checkout."
It's off by default. Turn it on. Even if you don't do this whole exercise, you'll get free customer language to mine for ad copy.

I'm still kicking myself about this

this data had been in my Shopify backend for 3 years. three years. we pay for an analytics tool. we have an agency. we have a CRM. none of them surfaced this segment.

because none of them were looking at the unstructured fields - they all live downstream of structured tags and predefined events.

the unstructured fields - customer notes, shipping instructions, refund reasons typed in freeform - are where your customers tell you who they actually are.

most stores never read them. the single best thing you can do this week, even if you don't run this exact playbook, is open a random sample of 200 customer notes from the last 90 days and just read them. You'll find at least one thing that changes how you write ads.
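in pandas, that whole exercise is a few lines (the "Created at" header and "Notes" column name are assumptions - match them to your export):

```python
# pull 200 random customer notes from the last 90 days and just read them
import pandas as pd

df = pd.read_csv("orders_export.csv")
df["Created at"] = pd.to_datetime(df["Created at"], errors="coerce", utc=True)
recent = df[df["Created at"] >= pd.Timestamp.now(tz="UTC") - pd.Timedelta(days=90)]
notes = recent["Notes"].dropna()
for note in notes.sample(min(200, len(notes)), random_state=0):
    print("-", note)
```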

if you want the full resource

the shopify custom export setup with all 11 fields (and the 3 ways shopify hides them), the 4 prompts in their long-form, the anonymization checklist, and 3 worked examples I did across an apparel brand, a wellness brand, and a kitchenware brand (all with the actual ghost segments they found, anonymized) - let me know, i'll share the access.

took me a couple of weekends to put together. sharing it because writing it up was helpful for me, and 28,400 rows of orders is a lot to waste on the accountant.

u/Bulky-Resolution6265 — 16 days ago