u/frank_brsrk

ClaudeAI modbot was taking competitor mentions too seriously. Even called "gemini is a dogshit compared to claude", chill out brother

u/frank_brsrk — 19 hours ago
▲ 4 r/cursor

Python tool your Cursor agent can run to A/B test prompts against a blind judge

u/frank_brsrk — 20 hours ago

Stop guessing if your prompt changes are lifting your agent. Run a blind A/B with a third-party judge.

u/frank_brsrk — 20 hours ago
▲ 4 r/n8n_ai_agents+2 crossposts

Eval workflow for agentic builders: fork any prompt through baseline vs scaffolded agents, blind third-party judge.

u/frank_brsrk — 2 days ago

pre-context injection changes what opus 4.7 notices without changing what it can do. arithmetic catch on the second pass. watch how a text-injection harness improves llm performance.

u/frank_brsrk — 4 days ago