u/Far_Pangolin_7657

AI agents feel slow because they work step-by-step — AgentWing makes them run in parallel

I’m building AgentWing to make AI agents feel near-instant.

Most agents today still work like one person doing everything step by step:

plan → act → wait → observe → act again → verify → finish

That creates a lot of waiting.

AgentWing adds a Director Agent that splits one task into multiple parallel workers.

Example:

Task: “Research a market and prepare a competitor summary.”

Instead of one agent doing everything sequentially:

- worker A finds competitors

- worker B analyzes pricing

- worker C checks positioning

- worker D gathers user complaints

- verifier combines the final result
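A minimal sketch of what that fan-out could look like, assuming each worker is an async call. The names `worker` and `director` are illustrative only, not AgentWing's actual API, and the sleep stands in for real model or tool latency:

```python
import asyncio


async def worker(name: str, subtask: str, delay: float) -> tuple[str, str]:
    # Stand-in for a real agent call; the sleep simulates model/tool latency.
    await asyncio.sleep(delay)
    return name, f"result for {subtask!r}"


async def director(subtasks: dict[str, str]) -> dict[str, str]:
    # Fan out: start every worker at once instead of one after another.
    jobs = [worker(name, task, delay=0.1) for name, task in subtasks.items()]
    results = await asyncio.gather(*jobs)
    # Fan in: a real verifier would check and merge; here we only collect.
    return dict(results)


SUBTASKS = {
    "A": "find competitors",
    "B": "analyze pricing",
    "C": "check positioning",
    "D": "gather user complaints",
}

if __name__ == "__main__":
    summary = asyncio.run(director(SUBTASKS))
    for name, result in summary.items():
        print(name, result)
```

With four workers that each take about 0.1 s, the whole batch takes roughly as long as the slowest worker rather than the sum of all four. That only helps when the subtasks do not depend on each other.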

For enterprise-level workflows, each worker can run inside its own isolated VM/sandbox with separate files, tools, permissions, and execution boundaries.

There are two modes:

**Split Mode**

Different workers handle different parts of one task.

**Race Mode**

Duplicate workers try the same subtask, and the best verified result wins.
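A rough sketch of Race Mode under one simplifying assumption: "best" is taken to mean "first result that passes verification". The `attempt` and `verify` functions are hypothetical stand-ins, and worker 1 is rigged to return an empty answer so the verifier has something to reject:

```python
import asyncio
from typing import Optional


async def attempt(worker_id: int, subtask: str) -> str:
    # Stand-in for one worker. Worker 1 is fastest but returns a bad answer.
    await asyncio.sleep(0.05 * worker_id)
    return "" if worker_id == 1 else f"worker {worker_id}: answer to {subtask!r}"


def verify(result: str) -> bool:
    # Hypothetical verifier; a real one might run tests or a judge model.
    return bool(result.strip())


async def race(subtask: str, n_workers: int = 3) -> Optional[str]:
    tasks = [asyncio.create_task(attempt(i, subtask)) for i in range(1, n_workers + 1)]
    try:
        for finished in asyncio.as_completed(tasks):
            result = await finished
            if verify(result):
                return result  # first verified result wins
        return None  # every worker failed verification
    finally:
        for task in tasks:
            task.cancel()  # stop the losers; a no-op for finished tasks
```

Here `asyncio.run(race("analyze pricing"))` skips worker 1's empty answer and returns worker 2's. Ranking several verified results by quality instead of arrival order would need the loop to wait for more than one finisher.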

The point is not “more agents for hype.”

The point is this:

AI agents feel slow because they execute too much work sequentially.

AgentWing makes them work like a coordinated team, so tasks can finish faster and feel close to instant when the work can be parallelized.

Would love thoughts on which agent workflows would benefit most:

research, coding, browser automation, customer support, operations, or something else?

u/Far_Pangolin_7657 — 10 hours ago

i looked at 15+ agent tools and the execution layer still feels broken

spent the last few days looking at agent tools, sandbox providers, browser-agent infra, MCP/connectors, and guardrail products.

just one question:

“what happens before an AI agent touches a real tool?”

not after it fails.

not in a dashboard later.

not during evals.

right before execution.

here’s what i found:

6 out of 15+ were mostly focused on giving agents a place to run — sandboxes, containers, browsers, cloud desktops, code execution environments.

4 were focused on connecting agents to tools — MCP, APIs, auth, app integrations, workflows.

3 were mostly about testing, evals, observability, or reviewing what happened after the run.

a few had approvals or guardrails, but usually tied to one specific workflow, tool, or environment.

the deeper problem:

agents are moving from “generate text” to “do actions”.

they can read files, run shell commands, call APIs, use browsers, send emails, touch databases, trigger workflows, edit repos, or prepare deploys.

but the control layer still feels messy.

a sandbox answers:

“where can this run safely?”

but it doesn’t fully answer:

“should this action run at all?”

that’s the part i’m building around with AgentWing.

the idea is not to build yet another sandbox.

there are already good ones: Cloudflare Sandboxes, E2B, Daytona, Modal, Browserbase, AWS-style runtimes, and internal company sandboxes.

AgentWing sits above them.

the mental model is something like 2-phase commit for AI agents:

  1. agent proposes an action

  2. control layer checks policy/risk

  3. risky actions route to sandbox, approval, or restore point

  4. structured feedback goes back to the agent

  5. every action gets an audit receipt

so instead of:

agent proposes → tool executes → hope nothing bad happens

it becomes:

agent proposes → AgentWing checks → sandbox/approve/block/restore → agent replans → receipt saved
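to make that flow concrete, here's a minimal sketch of the propose → check → receipt path. everything in it is an assumption for illustration: the `Action` shape, the four decisions, and the hardcoded rules are not AgentWing's real policy format.

```python
import json
import time
from dataclasses import asdict, dataclass
from enum import Enum


class Decision(str, Enum):
    ALLOW = "allow"
    SANDBOX = "sandbox"
    APPROVE = "approve"
    BLOCK = "block"


@dataclass(frozen=True)
class Action:
    tool: str    # e.g. "shell", "file_read", "deploy"
    target: str  # command, path, or environment name


def check_policy(action: Action) -> Decision:
    # Hypothetical rules, checked in order; the first match decides.
    if action.tool == "shell" and "rm -rf" in action.target:
        return Decision.BLOCK
    if action.tool == "deploy":
        return Decision.APPROVE
    if action.tool == "shell":
        return Decision.SANDBOX
    return Decision.ALLOW


def gate(action: Action, receipts: list[str]) -> dict:
    decision = check_policy(action)
    # Every proposed action gets a receipt, whatever the decision was.
    receipts.append(json.dumps({
        "ts": time.time(),
        "action": asdict(action),
        "decision": decision.value,
    }))
    # Structured feedback the agent can use to replan.
    return {"decision": decision.value, "replan": decision is Decision.BLOCK}
```

the caller, not the agent, decides what to do with the returned decision: run the tool directly, hand it to whichever sandbox is plugged in, or park it until someone approves.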

the thing i’m trying to figure out now:

should this be mainly a BYO sandbox layer, where teams plug in their existing E2B/Cloudflare/Daytona/internal sandbox?

or should AgentWing also provide a managed sandbox for teams that don’t have one?

curious how others are thinking about this.

if you’re building agents that touch tools, APIs, browsers, files, databases, Slack, GitHub, payments, cloud infra, or deploys:

where do you draw the line between “safe to run automatically” and “must be sandboxed, approved, or blocked first”?

early Runtime Lab here if anyone wants to test/roast the flow:

https://agentwing.gpmai.dev

u/Far_Pangolin_7657 — 12 hours ago

Are we seriously letting AI agents touch files, shells, APIs, and deploys without a control layer?

I’ve been thinking about this while building an early runtime control layer for AI agents.

A lot of agent demos today look impressive: the agent reads files, edits code, runs shell commands, calls APIs, pushes branches, maybe even prepares deploys.

But the scary part is not the planning.

The scary part is execution.

Once an agent can touch real systems, a normal sandbox alone doesn’t feel like enough. You still need a layer that decides:

- is this action allowed?

- should this command run in a sandbox first?

- do we need a restore point before a file write?

- should a human approve this before it hits an external API or deploy?

- what feedback should go back to the agent if the action is blocked?

- how do we audit what happened later?

The mental model I’m testing is something like “2-phase commit for AI agents”:

  1. Agent proposes an action

  2. Control layer checks policy / risk

  3. Risky actions get sandbox replay, restore points, or approval

  4. Agent gets structured feedback and replans

  5. Every action gets an audit receipt
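As one illustration of step 3, here is a small sketch of a restore point around a single file write. It assumes the "restore point" is just a copy of the file taken before the write, and `accept` is a hypothetical callback standing in for whatever check or approval decides whether the change stays:

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def write_with_restore_point(
    path: Path, new_text: str, accept: Callable[[str], bool]
) -> bool:
    """Write new_text to path; roll back if accept() rejects the result."""
    snapshot_dir = Path(tempfile.mkdtemp(prefix="restore-point-"))
    snapshot = snapshot_dir / path.name
    existed = path.exists()
    try:
        if existed:
            shutil.copy2(path, snapshot)  # take the restore point
        path.write_text(new_text)
        if accept(path.read_text()):
            return True
        # Rejected: put the original back, or remove a file we created.
        if existed:
            shutil.copy2(snapshot, path)
        else:
            path.unlink()
        return False
    finally:
        shutil.rmtree(snapshot_dir, ignore_errors=True)
```

A real system would need more than this for directories, databases, or external APIs, where a copy-and-restore is not available and approval has to happen before the action instead.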

I’m calling the prototype AgentWing.

The goal is not to replace sandboxes like E2B, Daytona, Modal, Browserbase, Cloudflare, or custom microVM environments. The goal is to sit above them and control what enters execution.

Sandbox providers isolate execution. A control layer decides what should be allowed, blocked, replayed, restored, approved, or returned back to the agent as feedback.

I put a working Runtime Lab here for context if anyone wants to test/roast the flow:

https://agentwing.gpmai.dev

Curious how others are thinking about this.

If you’re building or running agents that can touch files, shells, APIs, browsers, databases, or deploys:

Where do you draw the line between “safe to execute automatically” and “must be sandboxed / approved first”?

u/Far_Pangolin_7657 — 1 day ago

AI agents are getting powerful. I’m building AgentWing, a research-grade control layer for their actions

For the deeper safety layer, I’m exploring sandbox-and-replay using stronger isolation than normal Docker-style sandboxing — microVM / Firecracker / gVisor-style environments where risky agent actions can be tested first before being allowed to affect the real project.

The idea is not only to monitor agents, but to make their actions observable, policy-controlled, reversible, and eventually safely executable in isolated environments.

Share your opinion here.

u/Far_Pangolin_7657 — 4 days ago

I’m building a UE5 MetaHuman (realistic digital human) AI Companion that adapts conversation into gestures, body actions, and voice-ready replies

Hey everyone 👋

I’m building **Companion AI**, a UE5 + MetaHuman based embodied AI system where conversation becomes body language, actions, and presence.

Instead of opening a normal chat window, the user sees a realistic MetaHuman companion in a room. The character can respond through text, voice-ready replies, gestures, emotion, and body actions — with the long-term goal of feeling like a digital person sharing the space with you.

GitHub:

https://github.com/12ziyad/companionai

Current / planned system includes:

- Unreal Engine 5 + MetaHuman

- AI conversation pipeline

- Cloudflare Worker backend

- LLM API integration

- contextual gesture and action system

- animation/action director

- speech-bubble style replies

- voice-ready architecture

- future dancing, singing, expressive actions, and agentic workflow support

The main architecture idea is simple:

- AI understands the conversation and meaning

- local behavior logic handles cooldowns, repetition, state, and action safety

- Unreal executes the final behavior through MetaHuman animations

This avoids hardcoded behavior like `if user says hi → wave`.

Instead, the companion should choose actions that fit the moment — greeting, listening, thinking, laughing, celebrating, comforting, or staying still when silence feels more natural.
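To give a feel for the local behavior layer, here is a rough Python sketch of a gate that filters the AI's suggested actions by cooldown and recent repetition. The class name, the cooldown values, and the idea that the AI returns a ranked list of action names are all assumptions for illustration; the real logic lives on the Unreal side and may look different:

```python
from collections import deque
from typing import Optional


class BehaviorGate:
    def __init__(self, cooldowns: dict[str, float], history_size: int = 3):
        self.cooldowns = cooldowns  # seconds each known action must rest
        self.last_played: dict[str, float] = {}
        self.recent: deque = deque(maxlen=history_size)

    def choose(self, suggestions: list[str], now: float) -> Optional[str]:
        """Return the first suggested action that is allowed right now."""
        for action in suggestions:
            if action not in self.cooldowns:
                continue  # unknown action: treat as unsafe
            if action in self.recent:
                continue  # avoid visible repetition
            last = self.last_played.get(action)
            if last is not None and now - last < self.cooldowns[action]:
                continue  # still cooling down
            self.last_played[action] = now
            self.recent.append(action)
            return action
        return None  # nothing fits: stay still
```

Returning `None` is the "staying still" case: if every suggestion is unknown, recently used, or cooling down, the character does nothing rather than repeating itself.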

I’m also accepting early users/testers. I can offer a free full preview in exchange for honest feedback, setup feedback, bugs, and ideas.

For early access, questions, or collaboration:

founder@gpmai.dev

ziyad@gpmai.dev

You can also ask setup questions here.

If this looks useful, a GitHub star would genuinely help the project reach more builders ⭐

u/Far_Pangolin_7657 — 4 days ago

i asked 23 companies how they actually test their AI agents before shipping. the answers genuinely scared me.

spent the last 3 weeks DMing CS leads, ops managers, and PMs at companies running AI agents in production. just one question: "how do you know your agent works before it goes live?"

here's what i found:

17 out of 23 said some version of "we just ship it and watch slack for complaints"

4 used a spreadsheet with manual test cases they run "when they remember"

only 2 had real evals — and both were companies with ML engineers

these aren't tiny companies. one was a 200-person scale-up. another was a YC company in their current batch.

the deeper problem: every eval tool out there (Braintrust, LangSmith, Galileo, even LangWatch, which is the closest to no-code) still assumes you write Python or YAML. the people actually deploying agents aren't engineers. so they don't test. so agents break in prod. and Gartner predicts over 40% of agentic AI projects will be canceled by the end of 2027.

i'm building research-grade evals (adversarial test generation, LLM-as-judge with proper rubrics, regression tracking — same techniques anthropic and openai use internally) but you write the rules in plain english. "never recommend a competitor." "always escalate if customer says lawyer." that's the whole UX.
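a tiny sketch of the shape this could take, with rules as plain strings and regression tracking as a diff of pass rates. big caveat: `toy_judge` is a keyword check standing in for the LLM judge, and "competitorco" is a made-up name. none of this is the actual product code.

```python
from typing import Callable

Judge = Callable[[str, str], bool]  # (rule, agent_reply) -> did the reply pass?


def run_suite(rules: list[str], replies: list[str], judge: Judge) -> dict[str, float]:
    # Pass rate per rule across all sampled agent replies.
    return {
        rule: sum(judge(rule, reply) for reply in replies) / len(replies)
        for rule in rules
    }


def regressions(
    baseline: dict[str, float], current: dict[str, float], tolerance: float = 0.0
) -> list[str]:
    # Rules whose pass rate dropped compared to the last accepted run.
    return [
        rule for rule, rate in current.items()
        if rate < baseline.get(rule, 0.0) - tolerance
    ]


def toy_judge(rule: str, reply: str) -> bool:
    # Placeholder for a model call; it only understands one rule.
    if rule == "never recommend a competitor":
        return "competitorco" not in reply.lower()
    return True
```

swapping `toy_judge` for a real model call with a rubric keeps the rest unchanged, which is the point: the person writing rules only ever touches the strings.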

currently in MVP with 2 design partners. learning fast.

couple things i'm curious about:

if your team ships agents, am i wrong about the testing gap? what's your actual workflow?

what's the dumbest thing your agent has done in production?

if you've used LangWatch, does it actually work for non-eng teams or is the no-code promise oversold?

not linking anything. just yapping. dm if it resonates.

u/Far_Pangolin_7657 — 4 days ago

What if users could warn others or share their opinion on every website they visit? I’m building PopGuard

I’m building PopGuard — a community-powered tool where people can post reviews, warnings, and helpful notes about websites.

The idea is simple: if someone visits a website and notices something suspicious, useful, fake, unsafe, or trustworthy, they can post a short review about it.

For example:

“This site asks for payment too early.”

“This looks like a fake login page.”

“I used this site and it was safe.”

“Be careful, it redirects to another page.”

“Don’t click Allow notifications here.”

Then when another person visits the same website, they can see community reviews, ratings, and warnings before interacting with it.
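A minimal sketch of how reviews could be grouped so that visitors to the same site see the same notes, assuming reviews are keyed by hostname and held in memory. The function names and the 1 to 5 rating scale are illustrative assumptions, not PopGuard's actual design:

```python
from collections import defaultdict
from statistics import mean
from urllib.parse import urlsplit

# hostname -> list of (rating, note)
reviews: dict[str, list[tuple[int, str]]] = defaultdict(list)


def site_key(url: str) -> str:
    # Group every page of a site under one lowercase hostname.
    host = urlsplit(url).hostname or ""
    return host.removeprefix("www.")


def add_review(url: str, rating: int, note: str) -> None:
    reviews[site_key(url)].append((rating, note))


def site_summary(url: str):
    entries = reviews.get(site_key(url), [])
    if not entries:
        return None, []
    return mean(rating for rating, _ in entries), [note for _, note in entries]
```

Keying by hostname is a deliberate simplification: a warning about one page shows up on the whole site, which is usually what a visitor wants but can be unfair to large platforms that host many unrelated pages.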

PopGuard is not only about blocking scams — it’s about letting users help each other understand whether a website is safe, useful, suspicious, or risky.

The goal is simple: community reviews and guidance for websites, so people don’t get trapped alone.

Would you trust a tool where users can review and warn others about websites in real time?

Contact me for the GitHub link: founder@gpmai.dev / ziyad@gpmai.dev

u/Far_Pangolin_7657 — 5 days ago

I built GPMai - a 100+ model AI workspace with semantic memory and screen-aware assistance

Hey everyone,

I built GPMai — a universal AI workspace that brings 100+ text, image, video, and audio models into one place, so you're not jumping between separate AI apps. It pulls together models, files, tools, workflows, screen context, and memory in a single environment.

The piece I'm most excited about is the memory engine. Instead of just dumping chat history into a vector store, GPMai builds a structured memory graph from your conversations — extracting key facts, linking them as nodes and relationships, and surfacing the relevant context in future chats. The goal is simple: project context that actually persists across sessions instead of resetting every time you open a new chat.
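To illustrate the idea (this is not the actual GPMai implementation), here is a toy sketch of a memory graph. It assumes facts have already been extracted into subject/relation/object triples, and it recalls them with a naive substring match on node names rather than embeddings:

```python
from collections import defaultdict


class MemoryGraph:
    def __init__(self) -> None:
        # node name (lowercased) -> triples that mention the node
        self.by_node: dict[str, set[tuple[str, str, str]]] = defaultdict(set)

    def add_fact(self, subject: str, relation: str, obj: str) -> None:
        triple = (subject, relation, obj)
        self.by_node[subject.lower()].add(triple)
        self.by_node[obj.lower()].add(triple)

    def recall(self, message: str) -> list[str]:
        """Return facts linked to any node named in the message."""
        text = message.lower()
        hits: set[tuple[str, str, str]] = set()
        for node, triples in self.by_node.items():
            if node in text:
                hits |= triples
        return sorted(" ".join(triple) for triple in hits)


graph = MemoryGraph()
graph.add_fact("project atlas", "uses", "postgres")
graph.add_fact("project atlas", "deadline", "june")
graph.add_fact("billing service", "uses", "stripe")
```

Asking about "Project Atlas" in a later chat pulls back both Atlas facts and leaves the billing fact out, which is the difference from replaying raw chat history: only facts connected to what the message mentions come back.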

A couple of other things worth mentioning:

Screen-aware assistance — the AI can see what's on your screen and help with whatever app or doc you're working in.

Point-based usage — transparent per-model costs so you don't get surprise bills from heavy models.

Website: https://gpmai.dev / GitHub and memory architecture: https://github.com/12ziyad/GPMai

[Apache-2.0 license]

I’m also opening a few limited early-access slots for people who want hosted GPMai access or want to test it on real AI workflows.

For early access, setup questions, or enquiries, email me at: ziyad@gpmai.dev / founder@gpmai.dev

Would love honest feedback — does this feel genuinely useful, or does it come across as another AI wrapper? And if you like the direction, a GitHub star would genuinely help.

u/Far_Pangolin_7657 — 5 days ago

AI agents are getting powerful. I’m building AgentWing — a research-grade control layer for their actions.

For the deeper safety layer, I’m exploring sandbox-and-replay using stronger isolation than normal Docker-style sandboxing — microVM / Firecracker / gVisor-style environments where risky agent actions can be tested first before being allowed to affect the real project.

The idea is not only to monitor agents, but to make their actions observable, policy-controlled, reversible, and eventually safely executable in isolated environments.

u/Far_Pangolin_7657 — 7 days ago

If this gets 10 useful comments, I’ll start building this AI agent

I’m thinking of building an AI coding agent and I want honest feedback before I spend time on it.

The idea is simple:

An agent that helps merge selected modules from multiple GitHub repos into your existing project.

Example:

You already have a React/TypeScript project.

You find:

- a nice dashboard UI in one repo

- a toast notification system in another repo

- an auth flow in another repo

- useful utility functions in another repo

Normally, bringing all that into your project becomes painful because imports break, dependencies conflict, folder structures don’t match, styling is different, and the build starts failing.

The agent would:

  1. scan your existing project

  2. scan the source repos

  3. find the useful modules

  4. create a merge plan

  5. adapt the code to your project structure

  6. run the build

  7. fix errors with a safe retry limit

  8. show a diff before anything is accepted
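Steps 6 and 7 could be sketched roughly like this. Nothing is built yet, so `run_build` and `apply_fix` are hypothetical callbacks, and `FakeProject` only simulates a build that fails until its broken imports are fixed one at a time:

```python
from typing import Callable


def build_with_retries(
    run_build: Callable[[], tuple[bool, str]],
    apply_fix: Callable[[str], None],
    max_retries: int = 3,
) -> tuple[bool, int]:
    """Run the build; on failure, attempt a fix and rebuild, up to max_retries."""
    fixes_applied = 0
    ok, log = run_build()
    while not ok and fixes_applied < max_retries:
        apply_fix(log)  # the agent would read the build log and patch the code
        fixes_applied += 1
        ok, log = run_build()
    return ok, fixes_applied


class FakeProject:
    """Simulated project: each fix resolves one broken import."""

    def __init__(self, broken_imports: int) -> None:
        self.broken = broken_imports

    def build(self) -> tuple[bool, str]:
        return self.broken == 0, f"{self.broken} unresolved imports"

    def fix(self, log: str) -> None:
        self.broken = max(0, self.broken - 1)
```

The retry limit is what makes this safe to run unattended: a project with more problems than the limit comes back as a failed build with the last log, instead of the agent rewriting code indefinitely.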

I don’t mean blindly merging full repos. That would probably be chaos.

I mean safely combining selected modules/components/hooks/utilities into one existing codebase.

I’m thinking of starting with React + TypeScript only.

Would you use an agent like this?

What would make you immediately not trust it?

If this gets 10 useful comments, I’ll start building a small MVP and share progress.

u/Far_Pangolin_7657 — 7 days ago