u/SilverConsistent9222

claude skills description field is what actually determines if your skill works or not

been using claude skills for a while now and a few things tripped me up that i didn't see mentioned anywhere so putting them here.

the description field is everything. i kept building skills that weren't triggering and every single time it came back to a vague description. claude reads that field to decide whether to load the skill or not. if it's too generic it never fires, if it's too broad it fires when you don't want it to. i spent way more time than i should have tweaking the actual instructions when the real problem was one sentence at the top.

there's also a 200 character limit on that field. roughly two sentences. if you don't know it exists you'll write something longer, it gets cut off silently, and the skill behaves unpredictably.
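
a rough way to catch this before uploading, as a sketch only: the 200 figure is just the limit mentioned above (not verified here), the function names are made up for illustration, and the parser only handles a single-line `description:` value, not multi-line YAML.

```python
# Minimal sketch: check the description length in a SKILL.md frontmatter block.
# LIMIT is the figure quoted in the post, treated here as an assumption.
LIMIT = 200


def read_description(skill_md_text: str) -> str:
    """Return the single-line description value from the frontmatter."""
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("no frontmatter block at the top of the file")
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "description":
            return value.strip().strip("\"'")
    raise ValueError("no description field in frontmatter")


def check_description(skill_md_text: str, limit: int = LIMIT) -> tuple[int, bool]:
    """Return (length, fits_within_limit) for the description."""
    description = read_description(skill_md_text)
    return len(description), len(description) <= limit


# Hypothetical skill file used only as an example.
EXAMPLE = """---
name: invoice-extractor
description: Extract line items and totals from PDF invoices into CSV. Use when the user uploads an invoice or asks for invoice data.
---
instructions go here
"""

if __name__ == "__main__":
    length, ok = check_description(EXAMPLE)
    print(f"description is {length} chars, within limit: {ok}")
```

the example description also shows the shape that tends to trigger reliably: what the skill does, then when to use it.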

a few other things worth knowing:

if your skill isn't triggering after upload, check if code execution is enabled in settings. custom skills need it on. wasted time debugging a perfectly fine skill because of this.

disable-model-invocation in the frontmatter does nothing on the claude.ai web interface. it's claude code only. if you add it thinking it'll stop auto-triggering on the web, it just gets silently ignored.

when zipping the skill, zip the folder, not its contents. a loose SKILL.md at the zip root doesn't work. the folder needs to wrap it.
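
for the zip layout, here's a small sketch that builds the archive so the folder wraps the skill file, then checks the result. it assumes the skill file is named `SKILL.md`; the function names and paths are hypothetical.

```python
# Minimal sketch: zip a skill folder so the folder itself is the top-level entry.
import shutil
import zipfile
from pathlib import Path


def zip_skill_folder(skill_dir: str, out_dir: str) -> Path:
    """Create <out_dir>/<folder>.zip containing <folder>/SKILL.md, not a loose SKILL.md."""
    skill_path = Path(skill_dir).resolve()
    if not (skill_path / "SKILL.md").is_file():
        raise FileNotFoundError(f"no SKILL.md inside {skill_path}")
    base_name = str(Path(out_dir).resolve() / skill_path.name)
    # root_dir is the parent and base_dir is the folder name,
    # so every entry in the archive is prefixed with the folder.
    archive = shutil.make_archive(
        base_name, "zip", root_dir=skill_path.parent, base_dir=skill_path.name
    )
    return Path(archive)


def has_wrapping_folder(zip_path: Path) -> bool:
    """True if SKILL.md sits exactly one folder deep inside the zip."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    return any(n.endswith("/SKILL.md") and n.count("/") == 1 for n in names)
```

usage would look like `zip_skill_folder("skills/my-skill", "dist")`, then `has_wrapping_folder(...)` on the result before uploading.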

and skills vs projects, worth being clear on before you start building. skills load automatically across every conversation. projects are scoped to one ongoing context. people mix these up and then wonder why behavior is inconsistent.

reddit.com
u/SilverConsistent9222 — 4 hours ago

▲ 42 r/Rag

why does everyone skip the chunking part

every RAG tutorial i've seen spends 80% of the time on vector databases and embeddings and then says "chunk your documents" like it's obvious and moves on.

it's not obvious. it's actually the thing that breaks most implementations.

fixed size chunking splits wherever the token limit hits. doesn't care about sentence boundaries, doesn't care if two sentences only make sense together. you end up retrieving half a thought and the model fills in the rest, confidently, which is the whole problem you were trying to solve.
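
to make that concrete, a toy sketch of fixed size chunking. it uses whitespace-separated words as a stand-in for tokens (an assumption, real pipelines count tokenizer tokens) and the sample text is invented.

```python
# Minimal sketch: fixed-size chunking that ignores sentence boundaries.
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split text into chunks of `size` words, cutting wherever the count lands."""
    if size <= 0:
        raise ValueError("size must be positive")
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]


# Invented sample text, three short sentences.
TEXT = (
    "The X200 pump ships with a 12V adapter. "
    "Do not use it with the older X100 base. "
    "The warranty covers two years."
)

if __name__ == "__main__":
    for chunk in fixed_size_chunks(TEXT, size=8):
        print(repr(chunk))
```

with `size=8` the second chunk ends at "X100" and the word "base." lands in the third chunk, glued to the warranty sentence. that's the half-a-thought retrieval problem in miniature.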

sliding window with overlap is what most people actually use in production and it's fine, but the real thing that helped me was just reading what was actually getting retrieved for failed queries instead of assuming the pipeline was working. almost always the chunk was on the right topic but missing the sentence that contained the actual answer.
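
a sliding window version, sketched under the same assumption that words stand in for tokens. the size and overlap values are arbitrary, and the sample text is invented.

```python
# Minimal sketch: sliding-window chunking where consecutive chunks share `overlap` words.
def sliding_window_chunks(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` words, each starting `size - overlap` words later."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    tokens = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):  # this window already reached the end
            break
    return chunks


# Invented sample text, three short sentences.
TEXT = (
    "The X200 pump ships with a 12V adapter. "
    "Do not use it with the older X100 base. "
    "The warranty covers two years."
)

if __name__ == "__main__":
    for chunk in sliding_window_chunks(TEXT, size=8, overlap=3):
        print(repr(chunk))
```

overlap lowers the odds that the answer sentence gets cut in half, but it doesn't remove them, so printing the retrieved chunks for failed queries is still the check that matters.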

the other thing, vector search breaks on exact identifiers. someone asks about a specific model number or product code, semantic search returns "close enough" results. close enough is wrong. hybrid search with BM25 alongside vectors handles this but it never shows up in the intro tutorials so you find out the hard way.
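
a sketch of one way to combine the two. the fusion step here is reciprocal rank fusion, which is my choice for illustration and not something the setup above specifies. the BM25 scorer is a small hand-rolled version, the documents are invented, and the vector ranking is a hard-coded stand-in for whatever your embedding search returns.

```python
# Minimal sketch: BM25 keyword ranking fused with a (hypothetical) vector ranking.
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[int]:
    """Return indices of docs with a positive BM25 score, best first."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n_docs or 1.0
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    ranked = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked if scores[i] > 0]  # keyword search returns matches only


def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Merge ranked lists: each doc scores sum(1 / (k + rank)) over the lists it appears in."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


# Invented documents; the query names one exact model number.
DOCS = [
    "The X200 pump supports a maximum flow of 40 litres per minute.",
    "Pump flow rates depend on pipe diameter and pressure.",
    "The X210 pump supports a maximum flow of 55 litres per minute.",
    "Filter maintenance schedule for all pump models.",
]
QUERY = "X210 specs"
VECTOR_RANKING = [0, 1, 2, 3]  # hypothetical "close enough" semantic result

if __name__ == "__main__":
    keyword_ranking = bm25_rank(QUERY, DOCS)
    print("bm25:", keyword_ranking)
    print("fused:", reciprocal_rank_fusion([VECTOR_RANKING, keyword_ranking]))
```

in this toy case only the X210 chunk matches the keyword side, so fusion lifts it from third to first. with equal weighting that isn't guaranteed in general, so treat the fusion rule as something to tune against your own failed queries.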

and stale indexes. you update a document, don't re-index, and the user gets a confidently wrong answer. it's not a technical problem, it's a pipeline problem, which is probably why nobody writes about it.
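
one way to make that less fragile than a blind schedule, as a sketch: store a content hash per document at index time and only re-index what changed. the function names and the in-memory dicts are hypothetical, a real pipeline would keep the hashes next to the index.

```python
# Minimal sketch: detect stale or removed documents by comparing content hashes.
import hashlib


def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(
    current_docs: dict[str, str], indexed_hashes: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Return (ids to re-index, ids to delete from the index).

    current_docs maps doc id -> current text.
    indexed_hashes maps doc id -> hash recorded when the doc was last indexed.
    """
    to_index = [
        doc_id
        for doc_id, text in current_docs.items()
        if indexed_hashes.get(doc_id) != content_hash(text)  # new or changed
    ]
    to_delete = [doc_id for doc_id in indexed_hashes if doc_id not in current_docs]
    return to_index, to_delete


if __name__ == "__main__":
    indexed = {"faq": content_hash("old answer"), "gone": content_hash("removed page")}
    current = {"faq": "new answer", "pricing": "brand new page"}
    print(plan_reindex(current, indexed))
```

this can still run on a schedule, it just turns each run into a cheap diff instead of a full rebuild, and an unchanged document costs one hash.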

curious what others are doing for re-indexing, currently on a schedule and it works but feels fragile.

reddit.com
u/SilverConsistent9222 — 2 days ago

some things i learned the hard way using claude design

been using claude design for a few weeks now and figured i'd dump some notes here before i forget. nothing groundbreaking, just stuff that took me way too long to figure out on my own.

first thing nobody tells you: do the design system setup BEFORE you build anything. i spent my first session prompting "build me a landing page for X" and got the most generic ai-looking output you can imagine. then i actually uploaded some brand stuff, let it extract tokens, approved them, and suddenly everything after that looked... like a real product? same prompts, totally different result. the docs say this but i skimmed past it like an idiot.

second thing. it eats tokens. like, a lot. it's on a separate weekly budget from regular claude chat and claude code which is nice in theory but if you're regenerating stuff over and over in chat you'll burn through it. the refine controls (inline comments, direct text edits, sliders) use way less than re-prompting. once i started using those for small fixes instead of typing "actually can you make the padding bigger" in chat, my budget lasted way longer. i'm on max 20x and it's mostly fine, on the $20 plan you'll feel it fast.

also re: animations. they're live react components running in the browser, not video files. you can download a standalone html file and upload it to claude2video, which will generate an mp4 video from it.

honest take on where it fits in the landscape since people always ask: it's not killing figma. figma is still better for any real design team workflow, dev mode, multi-person collab. v0 and lovable are still better if you want to skip design entirely and just spin up an mvp with auth and a db. where this thing wins is the loop from "i have an idea" to "working prototype" to "claude code builds the actual app from it". the design system carrying through to the shipped code is the part that's genuinely different.

if you're a solo founder or pm or someone who keeps getting stuck between figma mockups and a real thing you can show people, worth learning. if you have a design team and a real component library already, probably overkill.

it's a research preview btw so half of this might be wrong in two months.

https://preview.redd.it/7ji8hv4nim1h1.jpg?width=1024&format=pjpg&auto=webp&s=b26579431bc04da562602795ef96f1972b7e7dc1

reddit.com
u/SilverConsistent9222 — 3 days ago

A simple breakdown of Claude Cowork vs Chat vs Code (with practical examples)

I came across this visual that explains Claude’s Cowork mode in a very compact way, so I thought I’d share it along with some practical context.

A lot of people still think all AI tools are just “chatbots.” Cowork mode is slightly different.

It works inside a folder you choose on your computer. Instead of answering questions, it performs file-level tasks.

In my walkthrough, I demonstrated three types of use cases that match what this image shows:

  • Organizing a messy folder (grouping and renaming files without deleting anything)
  • Extracting structured data from screenshots into a spreadsheet
  • Combining scattered notes into one structured document

The important distinction, which the image also highlights, is:

Chat → conversation
Cowork → task execution inside a folder
Code → deeper engineering-level control

Cowork isn’t for brainstorming or creative writing. It’s more for repetitive computer work that you already know how to do manually, but don’t want to spend time on.

That said, there are limitations:

  • It can modify files, so vague instructions are risky
  • You should start with test folders
  • You still need to review outputs carefully
  • For production-grade automation, writing proper scripts is more reliable
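
To make the last point concrete, here is a minimal sketch of what a scripted version of the folder-organizing task could look like. It is not what Cowork does internally. The extension-to-folder mapping and the function names are my own assumptions for illustration. The script plans every move first, never overwrites or deletes anything, and the demo runs against a throwaway test folder.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical mapping from file extension to destination subfolder.
CATEGORIES = {
    ".png": "images",
    ".jpg": "images",
    ".pdf": "documents",
    ".txt": "notes",
    ".md": "notes",
}


def plan_moves(folder: Path) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs without touching any file."""
    moves = []
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue  # leave existing subfolders alone
        category = CATEGORIES.get(path.suffix.lower(), "other")
        moves.append((path, folder / category / path.name))
    return moves


def apply_moves(moves: list[tuple[Path, Path]]) -> int:
    """Move files, skipping destinations that already exist. Nothing is deleted."""
    moved = 0
    for src, dst in moves:
        if dst.exists():
            continue  # never overwrite an existing file
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        moved += 1
    return moved


def demo() -> list[str]:
    """Organize a temporary test folder and list the resulting layout."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        for name in ["scan.PDF", "shot.png", "ideas.txt", "data.bin"]:
            (folder / name).write_text("x")
        apply_moves(plan_moves(folder))
        return sorted(
            p.relative_to(folder).as_posix()
            for p in folder.rglob("*")
            if p.is_file()
        )


if __name__ == "__main__":
    print(demo())
```

Printing the result of plan_moves before calling apply_moves gives you a dry run, which is the scripted equivalent of starting with a test folder.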

I don’t see this as a replacement for coding. I see it as a middle layer between casual chat and full engineering workflows.

If you work with a lot of documents, screenshots, PDFs, or messy folders, it’s interesting to experiment with. If your work is already heavily scripted, it may not change much.

Curious how others here are thinking about AI tools that directly operate on local files. Useful productivity layer, or something you’d avoid for now?

I’ll put the detailed walkthrough in the comments for anyone who wants to see the step-by-step demo.

https://preview.redd.it/800gve97511h1.jpg?width=800&format=pjpg&auto=webp&s=ae7832d5eef929faef15ebb996297a5ec30425d2

u/SilverConsistent9222 — 6 days ago

Most RAG apps in production are confidently wrong and nobody talks about this enough

Been working with a few teams integrating RAG into internal tools, support bots, document Q&A, contract search, and I keep running into the same thing nobody warns you about when you're following tutorials.

The basic retrieve-then-generate pipeline looks fine in demos. Clean question, clean doc, clean answer. Then real users show up.

The failure mode that gets me is this: the system pulls chunks from different versions of the same policy document, has no way to know they're from different versions, blends them together, and returns an answer with full confidence. No caveat, no "I'm not sure," nothing. Just fluent and wrong.
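
A rough sketch of how you could at least detect this case, assuming each chunk was indexed with a document ID and a version label. Those metadata fields, the Chunk class and the function name are hypothetical, and many real indexes don't store this, which is the whole problem.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str   # hypothetical: stable ID for the policy document
    version: str  # hypothetical: version label stored at indexing time
    text: str


def mixed_version_docs(chunks: list[Chunk]) -> dict[str, set[str]]:
    """Return the doc_ids whose retrieved chunks span more than one version."""
    versions: dict[str, set[str]] = defaultdict(set)
    for chunk in chunks:
        versions[chunk.doc_id].add(chunk.version)
    return {doc_id: seen for doc_id, seen in versions.items() if len(seen) > 1}


if __name__ == "__main__":
    retrieved = [
        Chunk("leave-policy", "2023", "Employees get 20 days of leave."),
        Chunk("leave-policy", "2024", "Employees get 25 days of leave."),
        Chunk("expense-policy", "2024", "Receipts are required over $50."),
    ]
    print(mixed_version_docs(retrieved))
```

If the result is non-empty, the pipeline can drop the older chunks or flag the answer instead of blending them.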

The deeper issue is that standard RAG has no mechanism for uncertainty. It retrieves, it generates, it moves on, with the same confidence level whether it nailed it or completely fabricated something plausible.

What actually fixes this (at least in the systems I've worked on) isn't swapping out the model. It's the architecture:

A routing layer — decide if retrieval is even necessary before making the call. Some questions don't need it and you're wasting tokens.

Retrieval scoring — evaluate what came back before passing it to the model. If the context scores low, reformulate the query and try again instead of just generating garbage confidently.

A hallucination check — second LLM call that reads both the generated answer and the retrieved docs and checks if every claim is actually traceable. Most teams aren't doing this and it's probably the highest ROI addition you can make.

The retry loop especially helped in our case because users never phrase questions the way your embedding model expects. The system silently reformulates and retries, and the user has no idea it happened.
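
Here is a minimal sketch of how those decision points could fit together. This is not our production code. Every callable is a stand-in you would back with your own retriever and LLM calls, and the names, the 0.5 threshold, the three-attempt cap and the fallback wording are all assumptions for illustration. Declining after the last failed attempt is also my choice here, not something the steps above dictate.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Answer:
    text: str
    grounded: Optional[bool]  # None means no grounding check was run
    attempts: int             # number of retrieval attempts made


def answer_question(
    question: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], list[str]],
    score: Callable[[str, list[str]], float],
    reformulate: Callable[[str], str],
    generate: Callable[[str, list[str]], str],
    is_grounded: Callable[[str, list[str]], bool],
    min_score: float = 0.5,   # assumed threshold
    max_attempts: int = 3,    # assumed retry cap
) -> Answer:
    # Routing layer: skip retrieval when the question doesn't need it.
    if not needs_retrieval(question):
        return Answer(generate(question, []), grounded=None, attempts=0)

    query = question
    docs: list[str] = []
    for attempt in range(1, max_attempts + 1):
        docs = retrieve(query)
        # Retrieval scoring: evaluate the context before generating.
        if score(question, docs) >= min_score:
            break
        # Retry loop: reformulate silently and try again.
        query = reformulate(query)
    else:
        # Every attempt scored low, so decline rather than guess.
        return Answer("I couldn't find a reliable source for that.", False, max_attempts)

    draft = generate(question, docs)
    # Hallucination check: a second call compares the draft to the docs.
    if not is_grounded(draft, docs):
        return Answer("I'm not sure the sources support an answer.", False, attempt)
    return Answer(draft, True, attempt)


def demo() -> Answer:
    """Toy stubs: keyword lookup instead of embeddings, no real LLM."""
    corpus = {"refund window": ["Refunds are accepted within 30 days."]}
    return answer_question(
        "What's the policy on sending stuff back?",
        needs_retrieval=lambda q: "policy" in q.lower(),
        retrieve=lambda q: [d for k, ds in corpus.items() if k in q.lower() for d in ds],
        score=lambda q, docs: 1.0 if docs else 0.0,
        reformulate=lambda q: "refund window",
        generate=lambda q, docs: docs[0] if docs else "general answer",
        is_grounded=lambda ans, docs: any(ans in d for d in docs),
    )


if __name__ == "__main__":
    print(demo())
```

In the toy run the first retrieval finds nothing because the user's phrasing doesn't match the index, the query gets reformulated, and the second attempt succeeds. In a real system is_grounded would be the second LLM call and score would be a reranker or a relevance grader.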

None of this is exotic. It's just a few extra decision points in the pipeline. But if you're running plain RAG in production and wondering why users are losing trust in it, this is almost certainly why.

Curious if anyone else has run into the versioning/context blending issue specifically, that one seems underreported.

u/SilverConsistent9222 — 6 days ago
