u/Ok_Today5649

Six months ago every piece of content our agents produced at whaaat ai sounded like it came from a polite, slightly enthusiastic copywriter. Technically correct, but missing personality, and the tone was identical from client to client. We could swap our brand name for any competitor's and nobody would notice the difference.

The fix took about two hours per person and now runs as a standard service we offer through our agency. Full disclosure: I work on the AI agent team at whaaat ai.

The process

You sit down with Claude (Opus works best for this, extended thinking on) and paste a single prompt that turns it into what I call a "Taste Interviewer." 100 questions across seven categories: core beliefs, writing mechanics, aesthetic crimes (things that make you physically cringe in other people's writing), voice and personality, structural preferences, hard nos, and red flags.

The interviewer prompt has rules that matter. One question at a time. It pushes back on vague answers. If you say "I like to keep it simple," Claude will ask what "simple" means to you specifically, with examples of simple done well and simple done lazily. It flags contradictions from earlier answers. It follows interesting threads instead of marching through categories in order.
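
To give a flavor, here's a minimal sketch of the kind of rules the interviewer prompt encodes. Illustrative only, not our actual prompt:

You are a Taste Interviewer. Extract my writing taste through
100 questions across seven categories. Rules:
- Ask one question at a time and wait for my answer.
- When an answer is vague ("I like to keep it simple"), ask what
  the vague word means to me, with examples of it done well and
  done lazily.
- When an answer contradicts an earlier one, flag it and ask
  which version is true.
- When an answer opens an interesting thread, follow it before
  returning to the category list.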

I dictate my answers instead of typing. Dictation is faster and more honest because you think less before responding. The whole thing takes 90 minutes dictated, closer to two hours typed.

What comes out is a raw document, 15,000 to 20,000 words. Your complete voice, unedited. Some of the questions feel more like a coaching session than a content exercise, so we warn people upfront. That part caught me off guard the first time - lol.

Compression

The final raw interview is way too large to use as context. 20,000 words loaded into every conversation burns tokens fast and costs real money if you're running this across multiple daily sessions.

That's where the second prompt comes in: a "Voice Compiler" that compresses the raw interview into a structured about-me.md file. The target is 2,000 to 4,000 tokens with a hard ceiling at 5,000. The compiler uses a single test for every line: "If this line disappeared, would the AI write, edit, judge or decide differently?" If yes, keep it. If no, cut it.

The output uses XML-style sections: identity context, voice fingerprint, writing laws, hard refusals, taste loves, taste disgusts, phrase bank, signature tells, decision rules and productive contradictions. Plus 3 to 6 examples in bad/good format that teach the AI your patterns.
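
To make that concrete, here's a stripped-down sketch of the file's shape. Section contents are invented for illustration:

<identity_context>Solo founder voice, writes for operators, not executives.</identity_context>
<writing_laws>Short declarative sentences. Numbers over adjectives. No metaphor openings.</writing_laws>
<hard_refusals>Never "game-changer". Never a rhetorical question as a hook.</hard_refusals>
<phrase_bank>"here's the setup", "the part I keep tweaking"</phrase_bank>
<examples>
bad: We're thrilled to announce our game-changing new feature!
good: New feature: invoice extraction at 94% accuracy. Here's the setup.
</examples>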

The key distinction: compression is different from summarization. Summarizing loses nuance. Compressing keeps everything that changes AI behavior and strips everything that just sounds nice about you.

What changed

Before the voice file, our content agents produced output that needed 30 to 40 minutes of editing per piece to sound like the person it was supposed to come from. After embedding the compressed file as standing context, editing dropped to under 10 minutes. Some pieces can be published without any editing.

The part I keep tweaking: the voice file drifts. Your opinions shift, your style evolves and new pet peeves develop. A file from six months ago makes the AI sound like you six months ago. So we've set up a monthly review now: 10 minutes, just reading through and updating what changed. Unfortunately, we still haven't found a clean way to automate that review.

For anyone running Claude specifically: drop the about-me.md into your Cowork folder and it loads automatically in every session. You can also wrap it in a Skill that applies the voice to every writing task without manual setup. Both approaches work, but the Skill route gives you more control over when the voice applies and when it stays quiet.
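
If you go the Skill route, the wrapper is small. A minimal sketch of the SKILL.md, assuming the about-me.md lives in the same folder (names and wording are illustrative):

---
name: brand-voice
description: Applies the compressed voice profile in about-me.md to every writing and editing task.
---

Before drafting or editing any text, read about-me.md in this
folder. Follow its writing laws and hard refusals. When judging
or revising drafts, apply its decision rules and signature tells.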

The full interview prompt and compiler prompt are both about 400 words each. Happy to share the German versions if anyone wants them (we built the original process in German, the English translation works identically). The prompts are the easy part. The hard part is answering 100 questions about yourself without defaulting to the version of yourself you think sounds good.

reddit.com
u/Ok_Today5649 — 7 days ago

Y Combinator dropped their "Requests for Startups" list last week. Sixteen ideas they want to fund this summer. Most coverage focuses on the flashy stuff: drone defense, space manufacturing, semiconductor supply chains.

Three entries buried in the list caught my attention because they describe the architecture we already run in my company. Not as theory. As our daily operating setup. Full disclosure: I work on the AI agent team at whaaat ai, and seeing YC validate the patterns we've been building on for months was a weird kind of relief.

The three that matter

"AI-Native Service Companies" by Gustaf Alströmer argues that the next wave of companies won't sell AI tools. They'll deliver entire service categories powered by AI. His reasoning: the global services market ($6T+) dwarfs the software market ($600B) by a factor of ten. Every service that runs on repetitive, rule-based processes is a target. Insurance brokerage, accounting, compliance, healthcare admin.

We've been running this model for content production. Our AI marketing agents handle the full pipeline: brand voice extraction, multi-channel content generation, distribution. The shift from "use our tool" to "we deliver the output" changed our economics completely. We switched our entire infrastructure from GPT to Claude about six months ago because Claude Code, Cowork and MCP gave us process control that wasn't possible before. Revenue went up. Not because Claude is "better at writing" in some abstract sense, but because Skills and MCP integrations let us build repeatable pipelines that run without manual intervention on every step.

"Software for Agents" by Aaron Epstein is the one I keep coming back to. His central claim: "The next wave of internet users will be AI agents, not humans." He calls it "Making Something Agents Want," riffing on YC's classic motto. Current software is built for human eyes and human fingers. Forms, dashboards, cookie banners, captchas. For an agent trying to use that software programmatically, every one of those is a wall.

This is why MCP matters so much. Our five-agent stack (builder, operator, cockpit, researcher, marketing) communicates entirely through MCP. None of them click buttons. They talk to Gmail, Linear, Todoist, Stripe and GitHub through standardized interfaces. When we set up a new integration, it takes one MCP config and every agent that needs it can use it immediately.
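
For anyone unfamiliar with the mechanics: an MCP server is registered once in a config file and is then available to any agent that speaks the protocol. A hedged sketch of what one entry looks like in a project's .mcp.json (server and package names here are placeholders, not our actual setup):

{
  "mcpServers": {
    "todoist": {
      "command": "npx",
      "args": ["-y", "@example/todoist-mcp-server"],
      "env": { "TODOIST_API_TOKEN": "<your token>" }
    }
  }
}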

"Dynamic Software Interfaces" by Ankit Gupta describes users building their own interfaces with AI coding agents. He asks whether developers will need to ship source code instead of packaged binaries so users can modify things on the fly. This already exists in Claude Cowork as Live Artifacts. I built a morning dashboard in about two minutes: unanswered emails on the left, prioritized tasks on the right, calendar at the top. Persistent, pulls live data on every open. The token cost argument is interesting here too. A dashboard built once costs nearly nothing to maintain. An agent fetching and formatting the same data daily burns tokens every single day.

What I think this means

The pattern across all three: AI moves from feature to infrastructure. From "add AI to your product" to "rebuild the product for a world where agents are the primary users." YC is describing what's already happening at companies that build this way.

The part I'm still figuring out: agent-to-agent communication. Our five agents work well in isolation, each owning its domain. But the handoff between the research agent finding a content opportunity and the marketing agent acting on it is still clunky. We pipe it through a shared context file right now. If anyone has built cleaner agent orchestration across different tool stacks, I'd genuinely like to hear how you structured it.
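
For reference, the shared context file is nothing fancy, roughly this shape (field names are ours and illustrative):

handoff: researcher -> marketing
opportunity: one-line description of the content angle
source: URL or dataset the researcher pulled from
suggested_angle: what the researcher thinks the hook is
status: new | picked_up | published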

reddit.com
u/Ok_Today5649 — 10 days ago

Y Combinator published their Summer 2026 "Requests for Startups" list last week. Sixteen ideas they want to fund. One entry by Aaron Epstein has a line that I think frames the next five years of software: "The next wave of internet users will be AI agents, not humans."

I work on the AI agent team at whaaat ai, and this matched something we've been running into constantly. Every time we connect a new tool to our agent stack, the bottleneck is the same: software that was designed for someone looking at a screen and clicking buttons.

The wall agents hit

Think about what happens when you try to automate something in your business today. Your CRM has a beautiful dashboard with drag-and-drop pipelines. Your project management tool has kanban boards and color-coded labels. Your accounting software has dropdown menus and multi-step wizards. For a human, all of that is helpful. For an AI agent trying to move a lead from one stage to another, create a task or categorize an expense, every single one of those visual elements is irrelevant at best and an obstacle at worst.

The agent needs an API endpoint, a documented data schema and predictable responses. Cookie banners, captchas, session timeouts and confirmation dialogs are walls.

Epstein calls this "Making Something Agents Want," a riff on YC's classic "Make Something People Want." The argument: most software today works poorly for agents because nobody built it with agents in mind.

Where the opportunity sits

This is where it gets interesting for anyone building or thinking about building a business. The global software market is massive. CRMs, HR platforms, accounting tools, compliance systems, invoicing, scheduling, inventory management. Every single category was built for human users. Every single one now needs a version (or a layer) that agents can consume natively.

The companies that bolt on agent support as an afterthought will struggle. An API added to a product designed around a visual workflow always feels like a translation layer. Clunky, incomplete, constantly breaking when the UI team ships changes that nobody told the API team about. The real opportunity is building agent-first from day one, where the API is the product and the human dashboard is one optional interface on top.

We run five agents across our operation. Builder handles code, Operator manages automations, Cockpit is the monitoring layer, Researcher does market and competitor scanning, Marketing handles content. All five communicate through MCP, an open protocol that standardizes how agents talk to tools. When we add a new integration, one config file and every agent that needs access has it immediately.

MCP is gaining traction specifically because of this problem. Instead of building custom integrations for every agent-tool combination, you build one MCP server and any agent that speaks the protocol can use it. For context, we connect to Gmail, Linear, Todoist, Stripe and GitHub this way. Setting up a new connection used to take a developer a day of custom API work. Now it takes a config file and 15 minutes.

Two paths I see

If you already run a software product: look at whether your product has an API that agents can use end-to-end without a human in the loop. Not "we have an API" in the marketing sense, but genuinely: can an agent complete a full workflow through your API alone? If the answer involves "and then the user clicks confirm in the UI," there's a gap.

If you're exploring what to build: pick any software category where the current tools are optimized for human interaction and build the agent-native version. Accounting that agents can query and categorize through. Project management that agents can update without navigating boards. CRM pipelines that agents can move deals through based on rules, not drag-and-drop.

The catch I keep coming back to: agent-first software still needs to be inspectable by humans. The agents do the work, but a founder or operator needs to see what happened, catch mistakes and adjust rules. Building that inspection layer without falling back into "just build a dashboard" is the design challenge I haven't seen anyone solve cleanly yet. Our current approach is Live Artifacts in Cowork, where I describe what I want to see and Claude builds it on the fly, but that only works for the person asking. If anyone has built a good pattern for multi-user visibility into agent-operated systems, I'd like to hear about it.

reddit.com
u/Ok_Today5649 — 10 days ago

Six weeks ago I set up a system where five AI agents handle five different jobs across my entire workflow: engineering, back-office ops, information dashboards, research and marketing. No employees, no freelancers, no SaaS subscriptions stacked on top of each other.

Here's the setup.

The 5-Agent System

Each agent sits on a different layer of the stack. They don't compete with each other; they complement each other.

Agent 1: Builder (Claude Code). Writes code, refactors, ships features. The key is the workspace setup: a CLAUDE.md file that teaches the agent your architecture rules, Skills (markdown files that define repeatable workflows) and MCP integrations that connect it to GitHub, Postgres, Slack and whatever else you use. Without that setup, Claude Code is autocomplete. With it, it's your first engineering hire.
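
To make "architecture rules" concrete, here's a hedged sketch of the kind of entries a CLAUDE.md might carry. The rules themselves are invented for illustration:

# Architecture rules
- All database access goes through the repository layer in src/db/.
  Never query Postgres directly from route handlers.
- New features ship behind a feature flag.

# Conventions
- TypeScript strict mode, no `any`.
- Tests live next to the file they test (*.test.ts).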

Agent 2: Operator (Claude Code Pipelines + n8n). Runs five pipelines that would normally require five people: video repurposing (YouTube link in, 10 platform-specific posts out), lead enrichment (raw company list in, scored leads with personalized openers out), competitive intelligence (weekly URL scans with change detection), invoice extraction (PDFs to structured data at 94-97% accuracy) and a knowledge base agent that turns support tickets into documentation.
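
Our pipelines run through Claude Code and n8n, but to show the shape of a single step, here's a minimal Python sketch of the invoice-extraction call using the Anthropic SDK directly. Model name and field list are placeholders, not our production values:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def extract_invoice(pdf_text: str) -> str:
    # Ask Claude to turn raw invoice text into structured JSON.
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; pick your model
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Extract vendor, invoice_number, date, line_items "
                       "and total from this invoice. Return JSON only.\n\n"
                       + pdf_text,
        }],
    )
    return response.content[0].text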

Agent 3: Cockpit (Live Artifacts in Cowork). This one surprised me the most. It's technically not an agent in the traditional sense. It's a persistent HTML dashboard that pulls live data from your connectors every time you open it. Gmail, calendar, task manager, all on one screen. The reason I went with a dashboard instead of a daily reporting agent: tokens. An agent that fetches and processes data every morning costs real money. A dashboard built once that only pulls data on demand costs almost nothing after the initial build. Took me 2 minutes to set up, saves me 20 minutes every morning.

Agent 4: Researcher (Hermes / OpenClaw with Kimi 2.6). Hermes runs in the cloud on a $5 VPS with built-in cron jobs, parallel subagents and persistent memory. OpenClaw runs locally with access to your Obsidian vault, local PDFs and terminal output. The community trend is to stack both: Hermes for web research, OpenClaw for local context. Best model for research tasks right now is Kimi 2.6, easiest to run via an Ollama subscription or through OpenRouter.

Agent 5: Marketing (Higgsfield / whaaat ai). The gap everyone ignores in the solo founder narrative. You build the product, you run the pipelines, Stripe shows zero. Higgsfield closes this: drop in a link, pick an AI persona, get 500+ ad-ready video cuts per day. The research agent feeds insights in, the marketing agent produces creatives, performance data flows back into the next research cycle.

What I learned after 6 weeks

The system gets better every week because each agent feeds the next one. Research finds opportunities, the operator processes them, the cockpit shows me status, the builder ships what's missing, marketing distributes what's ready.

Starting point if you want to try this: build the cockpit first. It has the lowest setup effort and the fastest payoff. Connect your email, task tool and calendar, describe what you want, done in under 5 minutes.

Full disclosure: I work on the AI agent team at whaaat ai. Happy to share more detail on any of the five agents. What does your current AI workflow stack look like?

reddit.com
u/Ok_Today5649 — 14 days ago
▲ 6 r/whaaat_ai · +1 crosspost

For transparency: I work on the AI agent team at whaaat ai. We build AI marketing agents, but this post is about something I built for myself that has nothing to do with marketing.

Every morning I used to spend 15-20 minutes on the same routine before doing any real work. Open Gmail, scan for urgent messages. Open Todoist, check priorities. Open Google Calendar, see what meetings are coming. Open Slack, see if anything blew up overnight. Open Stripe, check if revenue moved. That's five apps, five logins, five context switches before my actual workday even started.

A few weeks ago I found a way to compress all of that into a single screen that takes 2 minutes to check. Built with no code, no deployment, no monthly subscription.

What I built

A persistent dashboard inside Claude's Cowork environment. It connects to your existing tools (email, task manager, calendar, payment processor) through API connectors and pulls live data every time you open it. One screen, everything visible at a glance.

The setup took under 5 minutes: connect the data sources, describe what you want in plain language, let Claude build the HTML dashboard. After that it's persistent. Opens instantly, refreshes on demand.

Why this works better than most "dashboard" solutions

Most dashboard tools (Notion dashboards, custom Retool builds, Geckoboard) require either ongoing maintenance or a monthly fee. This approach has two advantages: the dashboard is built once and costs nothing to maintain, and it pulls from the same tools you already use without requiring any data migration.

The other thing I noticed: it changed how I start my day. Instead of bouncing between apps and losing focus, I open one thing, see everything, decide what matters and start working. It sounds small. Over a month it adds up to roughly 6-7 hours of reclaimed time.

What it actually shows

My setup has four sections: unanswered emails sorted by date on the left, tasks sorted by priority on the right, calendar timeline across the top and a small Stripe revenue number in the corner. You can customize this to whatever matters for your business.

The key insight that made me choose a static dashboard over a daily AI summary: running an AI agent every morning to compile a report burns through your API budget fast. A dashboard that was built once and only fetches data when you open it is essentially free after the initial build. Same information, fraction of the cost.

If you want to try this

You need Claude with Cowork access and the connectors for your tools (Gmail and Google Calendar are built in, others like Todoist have their own MCP integrations you can add in a minute). Describe your ideal morning dashboard in plain language. Claude builds it. Done.
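
A description along these lines is enough. Wording is illustrative, matching the layout I described above:

Build me a persistent morning dashboard. Left: unanswered Gmail
emails, oldest first. Right: today's Todoist tasks sorted by
priority. Top: Google Calendar timeline for today. Corner: current
Stripe revenue. Pull fresh data every time I open it.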

The question I keep coming back to: how much time does your team spend every day on "checking things" before actually doing things? For us it was close to an hour per person when you add it all up. Now it's under 5 minutes.

Curious if anyone else has found good ways to reduce morning admin time.

Disclaimer: This dashboard setup is unrelated to whaaat ai's product.

reddit.com
u/Ok_Today5649 — 16 days ago
▲ 111 r/whaaat_ai

I tracked my token usage for a week after Opus 4.7 ate through my weekly quota in three days instead of five. That forced me to get serious about token management. After testing every technique I could find, here are the 8 that actually move the needle, sorted by impact.

First the math that makes this urgent: the total token cost of a conversation grows quadratically with its length, because each message you send gets shipped with the entire conversation history. At 500 tokens per exchange on average, message 30 costs roughly 30x more than message 1. One developer tracked his usage and found 98.5% of tokens went to re-reading history. Only 1.5% was actual output.
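
To see the shape of it, here's a quick Python sketch under a simplified model (each exchange adds roughly 500 tokens of history; the exact multiples depend on how you count the replies):

per_exchange = 500  # rough average tokens added per exchange

def input_tokens(n):
    # Tokens shipped as input when you send message n:
    # (n - 1) exchanges of history plus the new message itself.
    return n * per_exchange

print(input_tokens(1))    # 500
print(input_tokens(30))   # 15000 -> roughly 30x message 1
print(sum(input_tokens(n) for n in range(1, 31)))  # 232500 total input tokens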

The biggest savings come from how you structure sessions, not from writing shorter prompts.

Hack 1: The Caveman Prompt (30-50% fewer output tokens)

The fastest single hack. Claude talks a lot by default. Preambles, recaps of your question, polite introductions and lengthy explanations. All of it costs tokens you do not need.

Drop this into your CLAUDE.md:

Reply in the most concise form possible. Skip pleasantries,
preambles, and recaps of my question. No phrases like
"I'd be happy to", "Great question", or "Let me explain".
Drop articles and filler words wherever the meaning stays clear.
Prefer short declarative sentences. If a tool call is needed,
run it first and show only the result. Do not narrate your steps.

In my testing this consistently saves 30 to 50% on conversational responses. Smaller effect on code output since code is already compact, but massive on explanations and analysis.

Hack 2: Edit instead of Follow-Up (exponential context savings)

When Claude misses your intent, the reflex is sending a correction. "No, I meant..." or "That's not what I wanted..." Each of those messages stacks onto the context. Claude re-reads the entire history every turn, including the failed attempts that gave you nothing.

The alternative: go back to your original message, edit it and regenerate. The old exchange gets replaced instead of stacked. In Claude Code the equivalent is starting a fresh session with clean context instead of running one session forever.

Hack 3: Effort Level management (up to 50% fewer tokens per task)

/effort in Claude Code sets the level per session:

Level   When to use                      Token cost
high    Routine work, known patterns     Baseline
xhigh   Daily driver, complex tasks      ~1.5x
max     Hardest architecture decisions   ~2-3x

The mistake most people make: running xhigh as a permanent default because it "sounds better." For most tasks high is enough. Reserve xhigh and max for situations where you genuinely need the extra reasoning depth.

Hack 4: Compact Skill (context reset without loss)

Every 15 to 20 messages your context window becomes a problem. The solution is summarizing and starting fresh with a prompt optimized for a new Claude instance:

Summarize our entire conversation so I can paste it into a
new chat and continue without losing context. Include:
(1) the original goal or problem
(2) key decisions made and why
(3) any code, config, or data we settled on, verbatim,
    in code blocks
(4) open questions and next steps
Use short sections with headings. Skip small talk and
exploratory tangents. Optimize the summary for a future
Claude reading it cold.

The key phrase is "for a future Claude reading it cold." The summary contains exactly the context a fresh instance needs to continue seamlessly. Machine-optimized handoffs, not human-readable summaries.

Hack 5: Code Review Graph (8x-49x fewer tokens on reviews)

When you ask Claude to review code it reads everything it can find. In a larger repo that means thousands of lines unrelated to your change.

Code Review Graph (github.com/tirth8205/code-review-graph) uses Tree-sitter to build a structural map of your code. Claude then reads only the files affected by your change. Benchmarks: roughly 8x fewer tokens on normal reviews, up to 49x on monorepos.

Hack 6: PDF Compression (70-80% fewer input tokens)

Image-heavy or scanned PDFs eat your context window. Run them through a cheaper model first with this prompt, then feed the compressed text to Opus:

Read this document end to end. Output a condensed plaintext
version that preserves:
(1) all factual claims, numbers, dates, and names
(2) every actionable instruction or recommendation
(3) the document's structure as short headings
Drop filler phrases, repeated context, marketing language,
formatting artifacts, and page headers/footers.
Target 20-30% of the original length.
Return only the condensed text, no commentary.

Double benefit: you save tokens on input and Opus processes the cleaner text better because there is less noise in the context.

Hack 7: Batch instead of Split (3x fewer context loads)

Three separate prompts load the context three times. One prompt with three tasks loads it once. Combine related requests into a single message. Bonus: results often improve because Claude sees the full picture and can align the outputs.

Hack 8: Avoid Peak Hours (same tokens, fewer interruptions)

Since March 2026 Anthropic drains your 5-hour rolling window faster during peak. Peak is 5:00 to 11:00 AM Pacific, which is 14:00 to 20:00 CET on weekdays. Your weekly limit stays the same but the distribution changed.

The sweet spot for European users: mornings before 14:00, evenings after 20:00 and weekends. Schedule resource-intensive tasks into those windows.

Start with hacks 1 and 3. Copy the caveman prompt into your CLAUDE.md and set /effort high as your default. Takes 60 seconds and immediately changes how long your quota lasts.

What is everyone running as their default effort level? And has anyone found other techniques that cut token usage significantly?

reddit.com
u/Ok_Today5649 — 22 days ago