u/FewConcentrate7283

Title Idea: How I used Claude Code + Subagent-Driven Development to ship 2 ML research notebooks in 48 hours

The Project

I’m building the research arm of Parley—AR glasses for real-time two-way conversation between hearing and deaf users. The research question: How much does hand-shape alone carry the signal for isolated-sign recognition vs. temporal information?

The interesting part for this sub isn't the ASL research—it's the workflow. Claude Code did ~95% of the implementation with me acting as architect and reviewer.

The Workflow: Subagent-Driven Development

I used the pattern from obra/superpowers:

  1. Detailed Implementation Plan: A ~2000-line markdown file with tasks broken into bite-sized steps, including exact code snippets.
  2. Fresh Subagents: I dispatched one fresh subagent per task. No session inheritance—every task starts with a clean slate.
  3. Two-Stage Review:
     • Spec-compliance subagent verifies the diff against the plan.
     • Code-quality subagent runs a second pass for best practices.
  4. Parallel Execution: I ran 30 tasks across ~22 dispatches, batching 3 at a time where safe.

Model Selection

  • Haiku: Mechanical code (scaffolding, simple functions, test files).
  • Sonnet: Implementations requiring judgment (architecture, bug fixes) and final-pass reviews.

3 Bugs the Review Loop Caught (That I Would've Missed)

  1. The MediaPipe Trap: My hand_feature_vector function was silently dropping the right hand. It assumed hand landmarks were contiguous, but MediaPipe places Pose (33 landmarks) between Left and Right hands. A subagent flagged that the slice was grabbing pose data instead of the right hand before I wasted hours on training.
  2. The Early-Stop Crash: aggregate_over_seeds() crashed on non-numeric keys ("early_stop") after 2 hours of training. A subagent wrote a standalone recovery script to re-aggregate from on-disk artifacts, saving a 3-hour retrain.
  3. Non-Deterministic Kaggle Paths: Different notebooks mounted datasets at different nested levels. After five failed pushes, a subagent added diagnostic os.walk() logic to make path detection robust.
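To make bug #1 concrete, here's a minimal sketch of the layout fix, assuming the Google ASL Signs per-frame row ordering (face 468, left hand 21, pose 33, right hand 21 — consistent with "Pose sits between the hands" above). The constant names and the `hand_feature_vector` signature are illustrative, not the original notebook's code:

```python
import numpy as np

# Illustrative per-frame landmark layout (Google ASL Signs ordering):
FACE, LEFT_HAND, POSE, RIGHT_HAND = 468, 21, 33, 21
ROWS_PER_FRAME = FACE + LEFT_HAND + POSE + RIGHT_HAND  # 543

LEFT_SLICE = slice(FACE, FACE + LEFT_HAND)  # rows 468..488
# BUG: assuming the hands are contiguous grabs pose rows instead:
WRONG_RIGHT_SLICE = slice(FACE + LEFT_HAND, FACE + 2 * LEFT_HAND)  # 489..509 = pose!
# FIX: skip the 33 pose rows that sit between the two hands:
RIGHT_SLICE = slice(FACE + LEFT_HAND + POSE, ROWS_PER_FRAME)  # rows 522..542

def hand_feature_vector(frame):
    """Concatenate left- and right-hand landmarks for one (543, 3) frame."""
    return np.concatenate([frame[LEFT_SLICE], frame[RIGHT_SLICE]]).ravel()
```

The wrong slice has the right *shape* (21 rows), which is exactly why the bug was silent: nothing crashes, the model just trains on pose data where the right hand should be.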
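For bug #2, the defensive pattern is to skip non-numeric entries before averaging. The original `aggregate_over_seeds()` isn't shown in the post, so this is a sketch under the assumption that it receives one metrics dict per seed:

```python
import numpy as np

def aggregate_over_seeds(per_seed_metrics):
    """Mean/std across seeds, skipping non-numeric entries like 'early_stop'.

    `per_seed_metrics` is a list of dicts, one per seed (shape assumed for
    illustration).
    """
    aggregated = {}
    for key in per_seed_metrics[0]:
        values = [m[key] for m in per_seed_metrics]
        # Guard: keys like 'early_stop' hold strings/None and crash np.mean.
        if not all(isinstance(v, (int, float)) and not isinstance(v, bool)
                   for v in values):
            continue
        aggregated[key] = {"mean": float(np.mean(values)),
                           "std": float(np.std(values))}
    return aggregated
```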
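And for bug #3, the robust-path idea is to walk the Kaggle input mount until a known marker file turns up instead of hard-coding a nesting depth. The marker filename and function name here are illustrative:

```python
import os

def find_dataset_root(marker="train.csv", base="/kaggle/input"):
    """Return the directory under `base` that contains `marker`.

    Kaggle can mount an attached dataset one or more directories deep
    depending on how it was packaged, so hard-coded paths break across
    notebooks.
    """
    for dirpath, _dirnames, filenames in os.walk(base):
        if marker in filenames:
            return dirpath
    raise FileNotFoundError(f"{marker!r} not found anywhere under {base}")
```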

The Results (Shipped on Kaggle)

  • Notebook 00 — ISLR EDA: Proves that published ASL accuracy is often inflated by identity leakage. Honest signer-holdout accuracy is ~half of what's usually reported.
  • Notebook 01 — Hand-Shape Baseline: MLP on hand-shape features alone (31.5%) vs. Temporal 1D-Conv (36.4%). Temporal modeling adds only 4.9 pp, so hand shape carries most of the signal for isolated signs.

Lessons Learned

  • What Worked: Fresh context prevents "hallucination drift." Plans written like spec docs (not TODOs) mean subagents don't have to "invent" logic.
  • What I'd Change: I was too granular on notebook sections—one subagent could have handled 10 boilerplate cells. I also need a visual dashboard; tracking 30 tasks via TodoWrite got chaotic.

The "Why"

Current ASL AI claims ~83% accuracy, but honest evaluation shows ~36%. That 47-point gap is what happens when these products hit the real world. My goal is to publish the honest numbers to build a foundation for Phase 4: a custom, deaf-community co-designed dataset.

Happy to answer questions about the Claude Code workflow, subagent prompts, or the ML side!

reddit.com
u/FewConcentrate7283 — 9 hours ago
▲ 0 r/computervision+1 crossposts

reddit.com
u/FewConcentrate7283 — 8 hours ago
▲ 4 r/kaggle

EDA of Google's ISLR dataset — why the Kaggle-winning ~83% accuracy number hides signer leakage

I’ve been writing a slow-release research arc on ASL recognition, and before any modeling, I wanted to actually look at Google’s Isolated Sign Language Recognition dataset the way it should’ve been looked at before every Kaggle winner reported 83% accuracy on it.

Notebook 00 of a nine-phase project: What does the Google ASL Signs data actually look like?

https://www.kaggle.com/code/truepathventures/parley-notebook-00-islr-eda

The sharp opinion, drawn from the EDA itself:

The Kaggle-default random 80/10/10 split — which every public winning solution used — puts the same signer’s clips in train, val, and test. That’s measuring how well the model memorizes each signer’s specific missing-landmark pattern, not how well it generalizes. Three numerical reasons:

  1. Missing-landmark patterns are structural per-sign, not random. The sign × landmark-type heatmap shows clear one-hand-missing signatures for bilateral-handshape signs and face-adjacent signs. Fork the notebook and scroll to §3.

  2. Median clip length varies 2×+ across the 21 signers. Fixed-length padding normalizes away signer-specific timing the model won’t see at inference.

  3. Per-signer coverage of signs is high but not uniform. Leave-one-signer-out evaluation is feasible — the coverage histogram in §6 is how we know.

Recommended split: signer-holdout — 17 train / 2 val / 2 test. Notebook 01 (next month) quantifies the accuracy gap against random-split, with error bars across 3+ seeds.
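A signer-holdout split is simple to implement once you have a signer id per clip — partition the *signers*, not the clips. A minimal numpy-only sketch (the notebook's actual split code isn't shown here; function and argument names are illustrative):

```python
import numpy as np

def signer_holdout_split(signer_ids, n_val=2, n_test=2, seed=0):
    """Assign each clip to train/val/test so no signer straddles splits.

    `signer_ids` holds one signer id per clip. With 21 signers and the
    defaults this yields the 17 train / 2 val / 2 test split above.
    """
    rng = np.random.default_rng(seed)
    signers = rng.permutation(np.unique(signer_ids))
    val_s = signers[:n_val]
    test_s = signers[n_val:n_val + n_test]
    split = np.full(len(signer_ids), "train", dtype=object)
    split[np.isin(signer_ids, val_s)] = "val"
    split[np.isin(signer_ids, test_s)] = "test"
    return split
```

The key invariant to assert in any leakage audit: for every signer, the set of splits their clips land in has size one.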

This is notebook 1 of 9. Not a competition entry — a slow-release research project. Feedback welcome, especially from anyone who’s worked with ISLR before or runs signer-holdout evaluation in their own sign-language ML work.

reddit.com
u/FewConcentrate7283 — 1 day ago

Hardware + Computer Vision: Is a patent actually worth it in the AI era?

Hey all — looking for real talk from people who've actually been through this.

I'm building a hardware + computer vision product (sports tech, AR scoring system). It's novel enough that my team thinks we should patent it. I'm staring down a provisional filing and having second thoughts.

The case FOR filing:

  • First-to-file: US is first-to-file, so waiting = risk.
  • Public Demos: We're about to demo publicly (beta venue + investor pitches), which starts the 1-year disclosure clock.
  • Investors: They supposedly care about IP moats.

The case AGAINST (what's nagging me):

  • AI Speed: AI is accelerating everything. By the time a patent grants in 3-5 years, will the tech even look the same?
  • Enforcement: Enforcing a patent as a small company against a well-funded competitor sounds like a nightmare.
  • Invalidation: A lot of "software patents" get invalidated anyway.
  • Trade Secrets: Speed to market might matter more than paper rights.
  • Cost: $60 for a provisional is cheap, but the non-provisional is $10k+ and that's real money pre-revenue.

Questions for you:

  1. If you've filed — did it actually protect you, or was it mostly theater for investors?
  2. Did any investor actually dig into your patent claims, or just check the "patent pending" box?
  3. Anyone regret NOT filing? Got cloned and couldn't do anything about it?
  4. For CV / AI-heavy inventions specifically — is the patent worth the public disclosure, or are you better off keeping it a trade secret and moving fast?
  5. DIY provisional vs. attorney-drafted — did it matter in the end?

Not looking for lawyers to pitch me (please). Just want to hear from builders who've been through it.

Thanks 🙏

reddit.com
u/FewConcentrate7283 — 4 days ago