r/Agent_AI

Here’s the world’s first AI-native agentic operating system
▲ 18 r/Agent_AI+6 crossposts

I was the creator of VIB OS, the world’s first vibecoded operating system.

finally pushed TensorAgent OS public today after way too many late nights, so here it is. so many people from this community were asking me for the release. It’s going to help everyone speed up their workflow. this is the beginning of a new era in AI

the short version: the AI agent IS the shell. not a chatbot widget floating over your taskbar, the agent is literally the interface. you talk to it, it talks back, it runs things, drives the browser, controls your hardware. that’s the whole idea.

It’s built on top of the Openwhale AI engine.

easiest way to try it is the prebuilt UTM bundle on apple silicon, just double click and boot. QEMU works too. default login is ainux / ainux.

real talk on where its at:

x86_64 doesn’t boot cleanly yet, ARM64 only right now (UTM/QEMU on mac)

QML shell crashes on resize sometimes, known issue

agents occasionally hang on tool calls

cloud-init can get stuck on first boot, give it like 10 min

no installer, boots live

it’s a research prototype, not something you should put on your main machine. but if you wanna hack on an actual AI-first OS and don’t mind the occasional segfault, come break stuff and file issues. PRs are especially welcome on the x86 boot pipeline and new skills.

repo: https://github.com/viralcode/tensoragentos

u/IngenuityFlimsy1206 — 1 day ago
▲ 4 r/Agent_AI+1 crossposts

What do you think about agent orchestration using Skills?

Hi everyone,

This is my first post here, so apologies if I broke any rules!
That said, here's the discussion I wanted to start: I am currently working on a POC project at my company that aims to explore the feasibility of agent orchestration using Skills.

The idea here is that discovering all sub-agents using MCP (already done) eats up a lot of context (as you know) since these are loaded at start and are always part of the context.

Hence we thought about Skills as a way to perform "universal" discovery of individual agents (which would also apply to agents created in the future) and as a way to "lazy load" an individual agent's tools only when needed (for example, when it is called through a Skill).
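To make the "lazy load" idea concrete, here is a minimal sketch in plain Python. It is not tied to the Microsoft framework or to any MCP client library. The `Skill` and `SkillRegistry` classes and the `load_tools` callback are hypothetical names used only for illustration. The assumption is that only a one-line description per agent sits in the orchestrator's context, and the full tool list is fetched the first time a skill is activated.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Skill:
    """Lightweight entry that is always in context: a name and a short description."""
    name: str
    description: str
    # Hypothetical loader that fetches the agent's full tool list on demand
    # (for example by querying its MCP server). Not called until the skill is used.
    load_tools: Callable[[], list]
    _tools: Optional[list] = field(default=None, repr=False)

    def tools(self) -> list:
        # Load once, then reuse the cached tool list.
        if self._tools is None:
            self._tools = self.load_tools()
        return self._tools


class SkillRegistry:
    def __init__(self) -> None:
        self._skills = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def catalog(self) -> str:
        """Text placed in the orchestrator's context at start: one short line per skill."""
        return "\n".join(f"{s.name}: {s.description}" for s in self._skills.values())

    def activate(self, name: str) -> list:
        """Called only when the orchestrator picks a skill; tools are loaded lazily."""
        return self._skills[name].tools()
```

With this shape, a newly created agent only needs a `register` call to become discoverable, and its tool definitions cost no context until the orchestrator actually selects it.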

The end goal would be to build a product that could leverage multiple existing agents (already running at scale and exposing MCP servers/tools) to answer a user request that cannot be handled by a single agent but requires back-and-forth between several of them.

The only constraint here is that the exploration is done using Microsoft Agentic Framework, even though we all know Skills are a language-agnostic concept.

Anyway, I am looking for ideas, suggestions, or anything that can spark a discussion or brainstorming on my side. I've already managed to create skills and chain-call them for a simple multi-agent purpose (just a simple textual case, not really the agents I mentioned above).
Thank you!

u/maher_bk — 19 hours ago
▲ 4 r/Agent_AI+1 crossposts

Looking for feedback on how to position my offering, and on the Claude Design visualization I launched yesterday

I am looking for advice and feedback on how to combine these three pages into a single coherent narrative that will appeal to CTOs, CDOs, and founders for the agents + agentic offering we have launched.

In theory, our offering is pretty straightforward: we built a lot of custom agents for our enterprise clients. I am not sure the message is coming across cleanly in these pages:

  1. We generalized those agents and now offer them out of the box for smaller clients, new clients, and individuals who want to use them

  2. We launched a new do-it-yourself agent builder for enterprise clients who want to build their own custom agents

  3. We offer consulting services to help enterprise clients build their own custom agents

  4. This week we launched composability for combining those agents into a single agentic flow.

I want to combine these three pages into a single coherent narrative that will appeal to CTOs, CDOs, and founders for the agents + agentic offering we have launched.

I currently have 3 pages that describe this offering.

  1. Our overview page that explains our out-of-the-box agents and our agent builder. I designed this page entirely in Figma, and I personally like it a lot. However, I would love to get other people's thoughts on it.

{See comment below for link}

  2. We have our internal agents page that exists within our app. Here I had more room to try different things, as it was in the app as a standalone page. In many ways it is very strong and has a lot of good ideas, but it also doesn't fit the overall style of the website.

{See comment below for link}

  3. I built a really interesting visualization for the composability feature using Claude Design (just launched the mobile version of it today), which I think could be a really good hook to draw people in. However, I posted it as is, so there is a little disconnect between its style and the rest of the site.

{I would love to know what the community thinks of it. I should have been clear about this yesterday, but mobile for that page was not ready. It is ready today. So if you checked it yesterday on mobile and saw a broken page, I have updated that as of today.}

{See comment below for link}

Anyway, any thoughts or feedback anyone has would be greatly appreciated. And yes, as I mentioned to some people yesterday, we are still streamlining our exact offering around agents and how to position it. My sense is that most people on this page have gone through these challenges. Would love your feedback: reply to this post, DM, whatever works best for you. Obviously, I would be more than willing to give free access to the agents to anyone who wants to try them out if they could help me with this.

u/Ok_Technician_4634 — 13 hours ago
▲ 5 r/Agent_AI+4 crossposts

I created a personal health agent that works with your Apple Health

I built HiMe, a personal health agent that ingests your wearable data in real time and proactively delivers insights around the clock. You can interact with your agent in an OpenClaw-style experience via Telegram or Lark.

The system is supported by a local AI agent server, along with an iOS companion app and an Apple Watch app, enabling seamless real-time syncing of wearable data. I also created a pixel-art cat, HiMeow, which acts as your personal health digital twin within the iOS app. When you’re tired, HiMeow appears sleepy; when you’re well-rested, it becomes lively and energetic.

u/thinkwee2767isused — 11 hours ago

Alex Gerko, The Billionaire Who Built AI-Powered Trading Machine

Alex Gerko's trading firm XTX Markets has become one of the most profitable players in algorithmic trading by making an early, all-in bet on AI — and is now spending over $1 billion to cement that advantage with its own data centre infrastructure.

Key Details:

  • Gerko, a 46-year-old British-Russian mathematician with a PhD, founded XTX in London in 2015 after early stints at Deutsche Bank; he owns at least 75% of the firm and has an estimated net worth of $12 billion.
  • XTX built its AI trading system years before ChatGPT, using deep learning models to predict price movements across stocks, bonds, FX, derivatives, crypto, and recently electricity — trading an average of $250 billion daily.
  • The firm's UK revenue rose 44% to $5.3 billion in 2024, with profit up 33% to $2.3 billion, achieved with only ~250 employees and no outside investors.
  • Rather than renting computing capacity, XTX is self-building a $1 billion+ data centre complex in Finland — the first of five planned sites — to ensure cheap, independent AI model training as global demand for compute surges.
  • The firm has amassed 25,000 Nvidia AI chips; its Finland site spans three football fields, with a second facility largely built and due online in 2027.
  • Gerko has long rejected high-frequency trading's speed race, instead lobbying for speed bumps to disadvantage ultrafast traders — an unusually public stance in a secretive industry.
  • Rivals are also spending heavily: Jane Street recently signed a $6 billion deal with CoreWeave for AI infrastructure, signalling an industry-wide arms race around compute power rather than connection speed.

Why It Matters: XTX's infrastructure bet reflects a broader shift in quantitative finance — the competitive edge is no longer who can trade fastest, but whose AI models are most accurate, and who owns enough compute to keep improving them.

u/Money-Ranger-6520 — 16 hours ago

ChatGPT Images 2.0 is remarkable for creating infographics

Prompt here was "Animal infographic of a legendary beast that does not exist."

u/Money-Ranger-6520 — 17 hours ago
▲ 7 r/Agent_AI+2 crossposts

FEEDBACK REQUEST: Claude Design: Extremely impressed with how it built a visualization of our multi-agent orchestration, but want to get other people's feedback

I rebuilt a visualization from our multi-agent orchestration page using Claude Design, and decided to launch it as is, without doing a massive amount of rework. This is the first time I have been able to post something directly from any design LLM without doing additional work.

https://www.datagol.ai/multi-agent-orchestration

I am really curious what people think of this. I want honest feedback: if you think it sucks, tell me. Is it too much detail, or not enough? I tried to replicate what our actual multi-agent flow looks like, so let me know if you think it works.

What I did: Instead of manually laying out every element, I provided:

  • the core prompt and specification generated from the agent
  • the dataset behind the visualization
  • the intended plan our internal agent came up with.  
  • The key element was it was able to use its own internal agents to answer the question and use the plan, which was extremely cool to see

Claude handled the layout logic and visual structure from there.

Curious what others think, especially those experimenting with Claude Design:

  • Does the visualization feel structurally clear?
  • Does the flow of agents make sense at first glance?
  • Where does it feel over-specified or under-explained?
u/Ok_Technician_4634 — 1 day ago
▲ 1 r/Agent_AI+1 crossposts

Gaming laptop and ai

I finally got a good laptop to start making money. I have never had a laptop worth over a couple hundred and lately I waste so much time just waiting for it to load. This was $1400 and I am ready to pursue my goals now that it’s possible. Any advice going forward regarding keeping this laptop running in mint condition?! Should I buy Norton or McAfee?? Thank you!!

u/Recent_Historian_387 — 19 hours ago

From Silent Failures to 97% Faithfulness, Built Agentic Multilingual RAG — RAGAS Eval + LangGraph

Over the last 2 months, I built SmartDocs by doing something most teams avoid because it's painful, slow, and breaks everything you've already built.

Standard RAG pipelines fail on real Indian documents in specific, reproducible ways. The failures are silent and the system returns fluent answers grounded in weak retrieval.

This post documents the failure modes, the architectural decisions used to address them, and measured RAGAS results on a Hindi ↔ English pipeline.

✓ Measured results (RAGAS evaluation):

| Metric | Result |
| --- | --- |
| Hindi Faithfulness | 97%+ |
| English Faithfulness | 90%+ |
| Hindi Answer Relevancy | 90%+ |
| Context Precision | 98%+ |
| Faithfulness Ratio (Hi/En) | 0.97 |
| Hallucination Rate | <5% |
| P95 Retrieval Latency | <12s |
| Language Accuracy | 95%+ |

✓ Failure taxonomy:

  1. Language detection breaks on short queries. Statistical models misclassify “transformer kya hai” before retrieval begins. Fix: deterministic script + lexicon routing using Unicode ranges.

  2. BM25 fails completely on Devanagari. Tokenizers fragment Hindi text → zero retrieval coverage. Fix: Indic-aware tokenization aligned with Unicode script blocks.

  3. Dense retrieval degrades on code-mixed text. Mixed Hindi-English sentences fall outside the embedding distribution. Fix: hybrid dense + sparse retrieval fused via RRF (k=60).

  4. Exact-match blindspot in embeddings. GSTINs, section codes, and numeric thresholds are not represented semantically. Fix: BM25 handles lexical matches, reranked with dense outputs.

  5. PDF extraction noise. ZWJ/ZWNJ and Unicode variants create invisible mismatches. Fix: NFKC normalization during ingestion.
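As a rough illustration of the ingestion normalization and the deterministic routing fixes above, here is a minimal standard-library sketch. It is not the SmartDocs code. The function names, the tiny romanized-Hindi lexicon, and the `hi` / `hi-en` / `en` labels are assumptions made for the example. Note that NFKC on its own leaves ZWJ/ZWNJ in place, so the sketch strips them explicitly, which is also an assumption about how one might handle them.

```python
import unicodedata

# Zero-width non-joiner / joiner. NFKC does not remove these, so they are
# stripped explicitly here (an assumption, not a detail taken from the post).
# Stripping can change how some conjuncts render, so apply it to both the
# indexed text and the query, and only for matching purposes.
ZERO_WIDTH = {"\u200c", "\u200d"}

# Tiny, purely illustrative lexicon of romanized Hindi words.
ROMANIZED_HINDI = {"kya", "hai", "kaise", "kyun", "nahi", "mein"}


def normalize(text: str) -> str:
    """NFKC-normalize and drop ZWJ/ZWNJ so visually identical strings compare equal."""
    text = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


def route_language(query: str) -> str:
    """Return 'hi', 'hi-en' (code-mixed), or 'en' from script ranges plus a lexicon."""
    text = normalize(query)
    # Devanagari block is U+0900 to U+097F.
    has_devanagari = any("\u0900" <= ch <= "\u097f" for ch in text)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in text)
    tokens = {tok.lower() for tok in text.split()}
    if has_devanagari and has_latin:
        return "hi-en"
    if has_devanagari:
        return "hi"
    if tokens & ROMANIZED_HINDI:
        return "hi-en"  # Latin script, but Hindi vocabulary
    return "en"
```

Because the decision depends only on code points and a word list, a three-word query such as “transformer kya hai” is routed the same way every time, with no statistical classifier involved.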

✓ Full Pipeline:

Ingestion → Indic preprocessing → script-aware chunking → embedding

Query → deterministic routing → multi-query expansion

Retrieval → hybrid (E5 + BM25) → RRF → reranking

Reasoning → LangGraph state machine

Validation → faithfulness + language checks + retries
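The RRF step in the retrieval stage can be sketched in a few lines. This is a generic Reciprocal Rank Fusion implementation under the k=60 setting mentioned above, not the repository's code. The function name `rrf_fuse` and the document ids are made up for the example. Each document scores 1 / (k + rank) in every ranked list it appears in, and the scores are summed.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first (for example one list
    from dense retrieval and one from BM25).
    Returns (doc_id, score) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Ties are broken by doc id so the output order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_hits = ["d1", "d2", "d3"]   # hypothetical dense (embedding) ranking
bm25_hits = ["d3", "d1", "d4"]    # hypothetical BM25 ranking
fused = rrf_fuse([dense_hits, bm25_hits])
print([doc_id for doc_id, _ in fused])
```

RRF only uses ranks, so the dense and BM25 scores never need to be put on a common scale, which is one reason it is a common choice for hybrid retrieval.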

Runs locally on RTX hardware.

This repository is structured as a reusable pipeline, not a demo.

If you’re working on multilingual retrieval, legal/financial RAG, or code-mixed language systems, this can serve as a base layer:

- fork and test on your own data

- modify retrieval or embedding strategies

- replace components and benchmark against this setup

Serious feedback from people building similar systems, especially around retrieval, embedding alignment, and evaluation, would be valuable to push this further.

u/Agent-Orchestrator — 11 hours ago
▲ 4 r/Agent_AI+1 crossposts

Where do you think the future of agents is going?

It feels increasingly clear that we want agents to be autonomous, continuously running, and cheap enough to use all the time.

Do you think that future is mostly local agents running 24/7 on personal devices, or mostly cloud-based agents?

And has anyone here actually run agents continuously for days or weeks? Curious to hear real-world experiences: cost, reliability, limitations, and whether it was actually useful.

u/Fragrant-Drummer-472 — 2 days ago

As a non-tech person, what are the most helpful AI agents you've found?

Hey all, I'm new to this journey and non-technical, but I want to adopt new tools to get more things done this year. It can be in any area (email marketing, lead outreach, ad making...) as long as it truly delivers results. It would be great if you could share how you set them up and use them.

For context, here's what I'm using so far:

  • Claude: my LLM for drafting, deep research, and writing.
  • Gemini: I use it mostly for content ideas and creating images, but with OpenAI image 2, things may change
  • Exa, Clay, Manus: I use them to find and enrich leads quicker
  • Granola: I use this to take meeting notes
  • Saner: I use it to manage notes, tasks, and calendar

What's the most helpful and easiest AI you've used so far?

u/FreshFo — 2 days ago
🔥 Hot ▲ 273 r/Agent_AI

How to set up Claude Skills completely in 1 hour:

Just found this on X, and decided to share it here. Credit: Ruben Hassid

u/Money-Ranger-6520 — 3 days ago

Why are we so paranoid about AI agent security, but totally fine with traditional software?

I’ve noticed a weird double standard lately. People act like giving an AI agent API access is a system-ending security risk, but they’ll happily install a random productivity Chrome extension that has permission to read and change all their site data. We’ve been giving random SaaS integrations full access to our Gmail and Slack for years, yet the moment an AI is involved, everyone starts panicking.

I get that the fear comes from the autonomy factor, the idea of an AI making decisions while you’re asleep feels jarring. But realistically, a poorly coded script or plugin is just as capable of nuking your database. I get that this is anecdotal, but my agency has been moving our workflow over to a dedicated SEO agent setup using QuickCreator, and it’s been smooth sailing so far. At the end of the day, I feel like you have to treat an agent exactly like you’d treat any other enterprise software. Do your due diligence and set preventative

I guess my question is, what makes security risks in AI agents that much different from the typical software we use? Why are people so panicky about it?

u/LogWest5630 — 2 days ago
▲ 1 r/Agent_AI+1 crossposts

I built a tool to compare AI models side-by-side — would love honest feedback

Hey everyone,

I’ve been using a lot of AI tools (ChatGPT, Claude, etc.) and kept running into the same issue:

I never knew which one would actually give the best answer.

So I’d end up opening multiple tabs, pasting the same prompt over and over, and comparing manually. It worked… but it got pretty repetitive.

So I built something simple for myself.

It lets you enter one prompt and see responses from 40+ AI models side-by-side.

What surprised me most is how different the answers can be depending on what you’re asking. For things like:

  • writing
  • coding
  • explanations

…the “best” AI isn’t always the same.

I recently opened it up for others to use (it’s $10/month mainly to cover costs), but honestly I’m not here to hard sell anything.

I’m more interested in learning:

  • Do you actually compare AI tools, or just stick to one?
  • Would something like this save you time, or not really?
  • What would you want to see in a tool like this?

If anyone’s curious, I can share the link — but I’d genuinely appreciate feedback more than anything.

Thanks 🙏

u/Frosty_Conclusion100 — 2 days ago
🔥 Hot ▲ 78 r/Agent_AI+1 crossposts

I am extremely confident that GPT-5.4 has been intentionally throttled in the last few days

Over the last 48 hours I've noticed a substantial decline in functional ability in the model. It fails to use the basic tools that I've built for it in my repo that it previously used to one-shot new problems and features in under 20 minutes. Yesterday, and all through today it has repeatedly fumbled basic tasks and failed to understand how to use basic tools even when given highly explicit directions in the prompt (far more explicit than I needed to give it before).

What makes me go from "suspicious" to "confident that it's been throttled" is that it's not taking the time to remember or understand assignments. It defaults to guessing blindly even when told not to. It's been getting stonewalled by medium-difficulty problems.

There are entire classes of already solved/implemented features and bug fixes in my code base that it fails to utilize in its problem-solving, even when pointed at them. The sudden drop in quality is large enough to be absolutely undeniable from my perspective.

u/shockwave6969 — 4 days ago
▲ 16 r/Agent_AI+5 crossposts

I created an awesome list on how to train an LLM agent

Introducing AgentsMeetRL, a GitHub awesome list repo.

Not just prompting, but actually using reinforcement learning to train agentic LLMs.

273 repos across 16 categories. 327.8k total stars. To my knowledge, this is the first awesome list focused on RL for LLM agents, and it’s been actively maintained for a year.

It spans everything from base frameworks to specialised agents, covering memory, self evolution, and environment design. Each entry includes the paper, GitHub repo, affiliation, star count, and key technical choices such as scaffold design, RL algorithm, reward type, and agent behaviour mode.

PRs and issues are very welcome if something’s missing or could be improved.

u/thinkwee2767isused — 3 days ago
▲ 10 r/Agent_AI+1 crossposts

I stopped using screeners and just talk to Claude now. My watchlist process has never been faster.

For the past few weeks I've been doing my pre-market prep almost entirely through conversation. I open Claude, describe what I'm looking for — something like "show me large caps with RSI under 35 that are still above their 200 SMA and have increasing volume" — and it just… does it. Gives me names, explains the setups, flags the ones with weird fundamentals or recent news.
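For anyone wondering what that example screen means arithmetically, here is a rough standard-library sketch. It is not how my setup computes anything. The function names, the $10B large-cap cutoff, the 5-day volume comparison, and the simple-average RSI (rather than Wilder's smoothed version) are all assumptions picked for illustration.

```python
def sma(values, window):
    """Simple moving average of the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough data for the requested window")
    return sum(values[-window:]) / window


def rsi(closes, period=14):
    """RSI over the last `period` price changes, using simple averages."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if gains == 0 and losses == 0:
        return 50.0   # flat prices: treat as neutral
    if losses == 0:
        return 100.0  # only gains in the window
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)


def passes_screen(closes, volumes, market_cap, *, min_cap=10e9,
                  rsi_max=35.0, sma_window=200, vol_window=5):
    """Large cap, RSI under the cutoff, last close above the SMA, rising volume."""
    recent_vol = sum(volumes[-vol_window:]) / vol_window
    prior_vol = sum(volumes[-2 * vol_window:-vol_window]) / vol_window
    return (
        market_cap >= min_cap
        and rsi(closes) < rsi_max
        and closes[-1] > sma(closes, sma_window)
        and recent_vol > prior_vol
    )
```

A stock in a long uptrend that has just had a sharp pullback on rising volume is the kind of name this filter lets through.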

I have an MCP set up that feeds Claude real-time market data — price, RSI, MACD, Bollinger Bands, ATR, support/resistance levels, candlestick patterns, short interest, fundamentals, the whole thing. So it's not hallucinating numbers, it's actually pulling live data and reasoning over it.

I'm still doing my own analysis and making my own calls — Claude is not placing trades for me, let's be clear about that. But having a research assistant that actually understands what I'm asking and has access to real data? It's like having a junior analyst who never sleeps.

Curious if anyone else here is using AI as part of their trading workflow. What does your setup look like? Would love to compare notes.

u/Outrageous_Fall_1727 — 3 days ago
▲ 3 r/Agent_AI+1 crossposts

I am starting my journey with AI agents. As a developer, what tools should I use?

Can I build an agent, or should I use existing tools to help with my coding? What agents are available for that? And if I ever decide to build one myself, is it worth it, or should I just use existing tools?

u/AccomplishedPath7634 — 3 days ago