r/LangChain

The 4-line function that fixed my agent's wrong answers (conditional edge in LangGraph)
▲ 11 r/LangChain+2 crossposts

My ReAct agent gave wrong answers for a week. It would call a tool, get a result, and immediately answer without checking if the result made sense.

The fix was a conditional edge — 4 lines:

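    from langgraph.graph import END

    def conditional_edge(state: MessageState):
        # MessageState: the graph's state schema (not shown here)
        last_message = state["messages"][-1]
        # The model asked for a tool: run it, then hand the result back to the model
        if last_message.tool_calls:
            return "tool"
        return END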

Without it: LLM → tool → answer (one shot, no self-correction)

With it: LLM → tool → check → loop back if needed → answer
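To make the loop concrete, here is a minimal sketch of the same routing idea in plain Python, with no LangGraph dependency. `Message`, `fake_llm`, `fake_tool`, and `run` are hypothetical stand-ins invented for illustration, and `END` is just a local sentinel here. The real repo wires the function into a graph instead of a `for` loop.

```python
from dataclasses import dataclass, field

END = "__end__"  # local sentinel for this sketch only


@dataclass
class Message:
    content: str
    tool_calls: list = field(default_factory=list)


def conditional_edge(state):
    # Same decision as in the post: keep looping while the model requests tools
    last_message = state["messages"][-1]
    if last_message.tool_calls:
        return "tool"
    return END


def fake_llm(state):
    # Hypothetical model: requests one tool call, then answers once a result exists
    if any(m.content.startswith("tool:") for m in state["messages"]):
        return Message("final answer")
    return Message("", tool_calls=[{"name": "lookup"}])


def fake_tool(state):
    # Hypothetical tool node: returns a canned result
    return Message("tool: 42")


def run(max_steps=5):
    state = {"messages": []}
    trace = []
    for _ in range(max_steps):  # hard cap so a confused model cannot loop forever
        state["messages"].append(fake_llm(state))
        trace.append("llm")
        if conditional_edge(state) == END:
            break
        state["messages"].append(fake_tool(state))
        trace.append("tool")
    return trace


print(run())  # ['llm', 'tool', 'llm']
```

The point of the sketch is that the model sees the tool result before it answers, and the step cap is there so the loop always terminates.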

Full repo (67 lines total): https://github.com/dunjeonmaster07/react-agent

What other simple patterns made a big difference in your agent's reliability?

u/Low_Edge7695 — 5 hours ago
▲ 2 r/LangChain+3 crossposts

.md files are not Memory

A folder of .md files is not memory.

It’s a storage dump.

Useful AI memory needs more than “search old notes and pray”:

- semantic recall, so related ideas surface even when wording differs

- entities, so different terms for the same thing don’t become random blobs

- relationships, so the system knows how things connect

- provenance, so it can trace where facts came from

- correction + forgetting, because stale memory is worse than no memory

- background consolidation, because raw chat logs are mostly sludge
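A few of those properties (entities, relationships, provenance, correction) can be sketched in a handful of lines. This is a toy illustration under my own assumptions, not how Thoth or any other project implements it: `Fact`, `Memory`, and the sample data are all made up, each relation is assumed to hold a single value, and semantic recall is left out because it needs an embedding index.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str
    relation: str
    obj: str
    source: str  # provenance: where this fact came from


class Memory:
    def __init__(self):
        self.aliases = {}  # lower-cased alias -> canonical entity name
        self.facts = []

    def add_alias(self, alias, canonical):
        self.aliases[alias.lower()] = canonical

    def resolve(self, name):
        # Entity resolution: map any known alias onto one canonical name
        return self.aliases.get(name.lower(), name)

    def remember(self, subject, relation, obj, source):
        subject, obj = self.resolve(subject), self.resolve(obj)
        # Correction + forgetting: drop the stale value for this subject/relation
        self.facts = [f for f in self.facts
                      if (f.subject, f.relation) != (subject, relation)]
        self.facts.append(Fact(subject, relation, obj, source))

    def about(self, name):
        # Relationship lookup: every fact that touches the entity
        entity = self.resolve(name)
        return [f for f in self.facts if entity in (f.subject, f.obj)]


memory = Memory()
memory.add_alias("NYC", "New York City")
memory.remember("Sam", "lives_in", "NYC", source="chat-2023.md")
memory.remember("Sam", "lives_in", "Berlin", source="chat-2024.md")  # correction
memory.remember("Office", "located_in", "nyc", source="notes.md")

print(memory.about("Sam"))            # only the Berlin fact survives
print(memory.about("New York City"))  # found even though the note said "nyc"
```

A flat folder of notes would keep both "lives_in" statements and treat "NYC" and "nyc" as unrelated strings, which is the gap the list above is describing.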

Thoth uses a local personal knowledge graph + FAISS semantic search + graph expansion + document ingestion + wiki export.

So yes, you can still get readable notes.

But underneath, the assistant isn’t just rifling through markdown like a raccoon in a filing cabinet.

It’s building structured personal context it can retrieve, update, connect, and reason over.

That’s the difference between “I saved your notes” and “I actually know what matters.”

Relevant references:

  1. FAISS docs: efficient similarity search and clustering of dense vectors.

    https://faiss.ai/

  2. Microsoft GraphRAG: combines text extraction, network analysis, LLM prompting, and summarisation for richer understanding of text datasets.

    https://www.microsoft.com/en-us/research/project/graphrag/

  3. GraphRAG survey on arXiv: graphs encode heterogeneous and relational information, making them useful for retrieval-augmented generation.

    https://arxiv.org/abs/2501.00309

  4. Thoth README memory features: personal knowledge graph, typed relations, FAISS semantic recall, graph expansion, document extraction, wiki export, Dream Cycle refinement.

    https://github.com/siddsachar/Thoth

u/Acceptable-Object390 — 19 hours ago
▲ 9 r/LangChain+3 crossposts

Free RAG Interview Q&A repo with all 10 types of RAG. 50 questions with detailed answers, difficulty tags, and a decision tree. Contributors welcome!

Hey everyone,

I've been going deep on RAG architectures lately and couldn't find a single resource that covered all the modern variants in one place, so I built one and open-sourced it.

What's in the repo:

  • 10 sections covering every major RAG type
  • 50 interview questions tagged [Basic] / [Intermediate] / [Advanced]
  • Detailed answers with architecture diagrams, code snippets, and trade-off tables
  • A cheatsheet with a decision tree ("which RAG should I use?")
  • GitHub Pages site auto-deployed on every push

RAG types covered: Naive, Advanced, Modular, Agentic, Graph, Corrective (CRAG), Self-RAG, Speculative, Multi-modal, and Long-context RAG.

https://github.com/ather-techie/rag-interview-questions

Looking for contributors! If you've been in an ML/LLM interview recently and got a question not covered here, please open a PR or drop it in the comments. I'll add it with credit.

If this is useful, a star on GitHub goes a long way. It helps others discover it. Thanks!

u/Western-Slip199 — 15 hours ago

We tested single-agent vs multi-agent on a real enterprise task. Single agent was 10-20x cheaper and the only one that got the right answer.

I'm building an open-source multi-agent framework and spent the last few weeks testing it against a real enterprise solution design task. This was not a toy benchmark but an actual enterprise ticket that required cross-referencing Jira comments, Java source code, Process Flow Config XMLs, and Confluence design docs to produce a correct technical document.

The setup:

  • 4 specialist worker agents (Jira researcher, code analyst, config analyst, docs researcher) coordinated by an architect agent, with a synthesizer combining everything
  • Each worker had focused MCP tools for their domain
  • We tried 4 different multi-agent configurations over multiple days

What happened with multi-agent (4 attempts):

| Attempt | Core error |
|---|---|
| 1 | Invented an attribute that doesn't exist |
| 2 | Misclassified the ticket as a different initiative entirely |
| 3 | Got the ticket's actual intent wrong |
| 4 | Imported scope from a different ticket (which had a similar name) |

Each attempt used 30,000-70,000 tokens across different tools and agents. Each made a different fundamental error.

What happened with single agent:

  • One agent with ALL tools (Jira + code + CDT + Confluence + output) in one context window
  • Kimi K2.6 (cheap model, $0.73/1M input)
  • Only 3,454 tokens total
  • First doc to correctly identify the actual problem, name the right code sites, quote the right Jira comments, and recommend fixes.

It wasn't perfect but only needed minor fixes to make the solution workable.
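For anyone who wants to check the "10-20x" claim against the numbers above, here is the arithmetic as a small sketch. The token counts and the price are the figures quoted in this post. Pricing every token at the input rate is a simplifying assumption, and since the models used in the multi-agent runs aren't stated here, only the token ratio is compared.

```python
# Figures quoted in the post
MULTI_AGENT_TOKENS = (30_000, 70_000)   # range per multi-agent attempt
SINGLE_AGENT_TOKENS = 3_454
PRICE_PER_MILLION_INPUT = 0.73          # USD, single-agent model


def token_ratio(multi, single):
    # How many times more tokens the multi-agent run consumed
    return multi / single


def cost_usd(tokens, price_per_million):
    # Assumption: every token is billed at the input rate
    return tokens * price_per_million / 1_000_000


low, high = (token_ratio(t, SINGLE_AGENT_TOKENS) for t in MULTI_AGENT_TOKENS)
single_cost = cost_usd(SINGLE_AGENT_TOKENS, PRICE_PER_MILLION_INPUT)

print(f"{low:.1f}x to {high:.1f}x more tokens")  # 8.7x to 20.3x more tokens
print(f"${single_cost:.4f} per single-agent run")  # $0.0025 per single-agent run
```

So on tokens alone the gap is roughly 9x to 20x, which is in line with the headline figure.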

Based on all the agent logs and traces which were captured at each agent level, here's my understanding of why multi-agent failed:

The task required connecting dots across multiple sources. A Jira comment mentions a class name -> read that class -> find it references a Config XML -> fetch that config -> discover a condition that gates the behavior the ticket wants to change. This chain of reasoning needs to happen in ONE context window.

But here is what was happening with multi-agent:

  • Worker A finds the Jira comment but doesn't know about the code
  • Worker B reads the code but doesn't know which Jira comment matters
  • Worker C fetches Process Flow Config XMLs but doesn't know which code path to trace
  • The architect gets summaries from each and tries to connect them, but summaries lose the specific details that matter

Information gets destroyed at every handoff. The architect reasons over shadows of the actual data, and a lot of information was never even fetched, because the full picture was never in a single context to work from.

This experience taught me when multi-agent just doesn't work:

  • Anything requiring cross-source reasoning (solution design, root cause analysis, debugging)
  • When the total data fits in one context window (most enterprise tasks)
  • When coordination cost (token overhead, summarization loss) exceeds the parallelism benefit

When multi-agent DOES make sense:

  • True parallelism on independent tasks (monitoring multiple services, processing document batches)
  • Scale beyond one context window (millions of log lines need filtering before reasoning)
  • Each agent has a genuinely independent domain (home automation: lighting agent, HVAC agent, security agent)

My final takeaway from the full experiment:

The value isn't in agent count — it's in good tools and skills that give the model the right context. MCP servers, structured search, code reading tools — these are what made the single agent succeed. Adding more agents just added more ways to lose information.

Multi-agent is a tool, not a goal. Use it when parallelism genuinely helps. Default to single agent with good tools for anything requiring deep reasoning.

What has your experience with multi-agent setups been? Where have they really worked, and where have they failed?

u/ksrijith — 21 hours ago
▲ 3 r/LangChain+1 crossposts

Building an AI agent with OpenAI tool use — struggling with consistency. How do you enforce tool call order reliably?

Hey,

Software engineer here, relatively new to agentic workflows. Building a production AI concierge — user says "I'm going to Budapest tomorrow, plan my day" → agent searches our offer database, builds a plan, user books everything in one flow.

**Stack:** OpenAI GPT-5.5 + tool use, NestJS, SSE streaming, React Native. Tools: `search_offers`, `get_offer_details`, `calculate_price`, `prepare_booking_bundle`.

**The problem:** Consistency. Two main issues:

- Model hallucinates offers from training data instead of calling `search_offers`. It knows a lot about European tourism and just... uses that knowledge instead of querying our DB.
- Tool chains break mid-flow. After `search_offers` returns results, model sometimes responds in plain text instead of continuing to `get_offer_details` → `calculate_price`.

Tried explicit prompt rules, `__next` instructions embedded in tool results, reducing tool count. Helps but doesn't fully solve it.
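For context on what "enforcing order" could look like outside the prompt, here is a minimal sketch of a deterministic guard that runs on every model turn. It assumes the four tools must run strictly in the order listed above, which may be too rigid for a real flow, and `next_required_tool` and `validate_step` are hypothetical helper names. On a `retry` result the caller would re-issue the request, for example by forcing the named tool through the API's `tool_choice` option if the client in use supports that.

```python
# Required order of tools in the booking flow, as listed in the post
TOOL_ORDER = [
    "search_offers",
    "get_offer_details",
    "calculate_price",
    "prepare_booking_bundle",
]


def next_required_tool(completed):
    """Return the tool that must run next, or None when the chain is done."""
    for name in TOOL_ORDER:
        if name not in completed:
            return name
    return None


def validate_step(completed, requested_tool):
    """Check one model turn.

    `requested_tool` is the tool name the model asked for, or None when it
    replied in plain text. Returns "accept" or "retry:<tool to force>".
    """
    required = next_required_tool(completed)
    if required is None:
        return "accept"  # chain finished, a plain-text reply is fine
    if requested_tool == required:
        return "accept"
    return f"retry:{required}"  # plain text too early, or the wrong tool


print(validate_step([], None))                                   # retry:search_offers
print(validate_step(["search_offers"], "get_offer_details"))     # accept
print(validate_step(["search_offers"], None))                    # retry:get_offer_details
```

The guard rejects a plain-text answer before `search_offers` has run, which is the hallucinated-offers case, and it catches a chain that stops halfway.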

**Questions:**
- What frameworks/tools are you using for production agentic flows?
- How do you enforce tool call sequences reliably?
- Any techniques for preventing hallucination in tool-use agents specifically?

Appreciate any advice from people who've shipped this stuff in production.

u/nightb0rn33 — 17 hours ago

I started learning LangChain/LangGraph, and it seems like LLMs/agents already do a lot of this out of the box. Is it still worth learning?

Is LangGraph/LangChain still worth learning given the current progress of LLMs?
Do you still build any projects with it?

u/ConfidenceNew4559 — 20 hours ago
▲ 2 r/LangChain+1 crossposts

Built production LangChain + Chainlit apps: what are you shipping and where does it break?

Been using Chainlit with LangChain for a while now on production legal AI apps — streaming agent responses, multi-step tool calls, the whole thing. Curious what others in this community have built with this combo and where the pain points are.

For me the rough edges have been:

  • Auth in embedded/Copilot mode when the parent app already handles auth — the password_auth_callback flow gets messy fast
  • Chat history persistence since LiteralAI shut down — self-hosting their open-sourced data layer works but it's extra ops nobody budgeted for
  • WebSocket disconnects under moderate load — Chainlit drops connections and there's no built-in session recovery, you have to roll your own
  • Debugging LangChain agent steps inside Chainlit's step visualizer when chains get deep — it can get noisy
  • Mounting Chainlit inside an existing FastAPI app — the ASGI mount patterns are barely documented

What have you shipped? RAG pipelines, agents, internal tooling? And what forced you to reach for a workaround or abandon Chainlit entirely for something else?

PS: Claude helped me write this, as it knows my pain points from building with Chainlit.

u/umairmehmood — 15 hours ago
▲ 3 r/LangChain+2 crossposts

ORKA: governance layer for AI agents

If you're running AI agents in production, you've probably run into at least one of these:

- An agent did something unexpected and you had no way to trace why
- You needed to prove to leadership or compliance what your agents are actually deciding
- A sensitive action happened that should have required human approval first

ORKA solves this. Full audit trail, policy engine, and human-in-the-loop approvals — works with OpenAI, Claude, LangChain, Firecrawl. Instruments on top of your existing stack, no rebuild.

Used by teams in production today. Free plan available.

orka.ia.br

u/MarzipanKlutzy9909 — 14 hours ago

Built a Clinical Research Orchestrator with LangGraph – Critic loop, HITL, and stateful multi-agent flow (open source)

Hey r/LangChain,

Just open-sourced a multi-agent research system built with LangGraph.

**What it does:**

You give it a complex clinical/research question. A network of AI agents (Orchestrator → Researcher → Critic → Writer) researches the topic, critiques data quality, loops back if insufficient, and only generates the final report after human approval (HITL).

**Key architectural decisions:**

- LangGraph over CrewAI — explicit control over edges, state transitions, and interrupt points

- `operator.add` on `research_data` — append-only accumulation across critic revision cycles

- `interrupt_before=["writer"]` — human approves before report generation (true HITL)

- DeepSeek via OpenAI-compatible API — cost-efficient drop-in for GPT-4
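To illustrate the `operator.add` decision, here is a minimal sketch of how an append-only reducer behaves, written against the standard library only. `ResearchState` and `apply_update` are hypothetical stand-ins that mimic the merge step a graph runtime performs. They are not the repo's code or LangGraph's internals.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class ResearchState(TypedDict):
    question: str
    # The reducer rides along as metadata on the type annotation
    research_data: Annotated[list, operator.add]


def apply_update(state, update):
    """Merge a node's partial update into the state.

    Keys annotated with a reducer are combined (old + new); all other
    keys are simply overwritten.
    """
    hints = get_type_hints(ResearchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer = getattr(hints[key], "__metadata__", (None,))[0]
        merged[key] = reducer(state[key], value) if reducer else value
    return merged


state = {"question": "initial question", "research_data": ["finding A"]}
state = apply_update(state, {"research_data": ["finding B"]})  # critic asked for more
state = apply_update(state, {"question": "refined question"})

print(state["research_data"])  # ['finding A', 'finding B']
print(state["question"])       # refined question
```

With the reducer, each revision cycle adds to `research_data` instead of replacing it, so earlier findings survive a loop back from the Critic.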

**Stack:** LangGraph · LangChain · DeepSeek · Tavily · Pydantic · Python

The repo includes a real example output (clinical_report.md) generated with:

*"Latest evidence on semaglutide for obesity treatment in CKD patients"*

GitHub: https://github.com/Armandogith/langgraph-research-orchestrator

Happy to discuss the architecture, particularly around the critic loop design and state checkpointing. What patterns are you all using for quality control in multi-agent pipelines?

u/Correct_Manager_7034 — 20 hours ago

How to handle costs?

Hey everyone, my projects were tipping along with a small user base, but there has been a bit of an uptick in signups, which is great! But how might I manage costs?

I am slightly worried that costs might explode and erase my runway, any help appreciated!

u/Prestigious_Work_632 — 21 hours ago

What happens when LangGraph.js runs directly inside the browser?

I used to mostly work with the Python side of LangChain/LangGraph.

Then I started experimenting with LangGraph.js directly in the browser while exploring WebMCP, and I wanted to see what it would look like to wire WebMCP tools into a LangGraph.js agent flow.

That slowly turned into Brow: a WIP open-source Chrome side-panel agent that runs in the real browser session.

The goal was to see how far I could push an agent that runs client-side, close to the page, instead of relying only on a backend or an external automation layer.

Brow can already:

  • work with both closed frontier models and local/open-source models, using Claude/OpenAI providers or OpenAI-compatible endpoints with custom base URLs
  • chat with an agent directly in the Chrome side panel
  • run the agent flow client-side in the browser using LangGraph.js
  • use the current page and browser context
  • discover WebMCP tools exposed by websites
  • wire WebMCP tools into the LangGraph.js agent flow
  • connect to remote MCP servers
  • render MCP Apps directly inside the chat
  • use browser automation tools like click, type, scroll, tabs, screenshots, etc.
  • record workflows and show them to Brow as reusable context
  • use reusable skills to help the agent adapt to specific tasks and websites

For this kind of project, using LangGraph.js directly in the browser is interesting because the agent can live much closer to the actual page: page context, browser tools, WebMCP tools, MCP servers, and UI rendering can all be connected from the extension runtime.

This is still experimental, imperfect, and very much a work in progress.

It started as a side project, built in the quiet hours after work and family time, one tired-but-curious commit at a time.

Small note about the video: it goes a bit fast in some parts, so don’t hesitate to pause. Video editing is definitely not my area of expertise, I mostly wanted to show the current state of the project as clearly as I could.

I’d love to get feedback from people using LangChain or LangGraph, especially on browser agents, client-side agent orchestration, WebMCP/MCP integration, and what kind of use cases this could unlock.

And if anyone is interested in this direction, contributions are very welcome. I’d love to find motivated people who see potential in this and want to help shape it into something bigger than a solo side project.

GitHub:
https://github.com/Shijou87/Brow

u/shijoi87 — 20 hours ago

AI Engineer Here: Are Regulated Teams Actually Reading Their Cloud LLM Terms?

Been thinking about something that keeps coming up in conversations with compliance and security teams at regulated firms, and I'm curious whether others are seeing the same thing.

I had an interesting conversation with a compliance lead at a financial services firm last week, and he was pretty confident their cloud AI vendor was handling their documents safely. They had a DPA signed, opt-out enabled, and the vendor was SOC 2 certified.

I asked if they knew what was being logged during inference and who at the vendor could access those logs. They didn't know.

It got me thinking about how narrow the training opt-out commitment actually is and how little people know about it. It says your data won't train future models, but it says nothing about inference logging, shared GPU tenancy, log retention schedules, or what happens if the vendor gets a government subpoena, because those are governed by separate policies.

Curious how others in regulated environments are actually handling this. Are your teams making a deliberate architectural decision here? Are you aware of the risks?

u/MiserableBug140 — 20 hours ago

I'm building a dead-simple monitoring tool for AI agents — would you use it?

Hey r/LangChain,

I'm working on a lightweight tool to help developers monitor their AI agents in production.

The problem I'm trying to solve: when your agent fails or behaves weirdly, you currently have no easy way to see exactly what happened — which steps it took, where it went wrong, what it cost you in tokens.

My solution is basically a "black box recorder" for AI agents. You add one decorator to your code:

    @trace
    def run_my_agent(user_input):
        # your existing code, untouched
        ...

And you get a dashboard showing:

  • Every step your agent took
  • Where it failed and why
  • Cost per run
  • Alerts when something breaks
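Since the tool isn't built yet, here is only a rough sketch of what a `trace` decorator along these lines could do. Everything in it is an assumption for illustration: `RUNS` is an in-memory list standing in for whatever backend would feed the dashboard, and per-step and cost tracking are left out.

```python
import functools
import time

RUNS = []  # in-memory stand-in for a real storage/dashboard backend


def trace(func):
    """Record name, status, error, and duration for every call of `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        record = {"name": func.__name__, "status": "ok", "error": None}
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            record["status"] = "failed"
            record["error"] = repr(exc)
            raise  # never swallow the caller's exception
        finally:
            record["seconds"] = time.perf_counter() - start
            RUNS.append(record)
    return wrapper


@trace
def run_my_agent(user_input):
    # Hypothetical agent body
    return f"echo: {user_input}"


run_my_agent("hello")
print(RUNS[-1]["name"], RUNS[-1]["status"])  # run_my_agent ok
```

The `finally` block means a failed run still gets recorded, and re-raising keeps the decorated function's behavior unchanged for the caller.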

Works with any model (OpenAI, Claude, Gemini, Llama) and any framework (LangChain, LangGraph, raw API calls).

Before I build this — would you actually use it? What's your biggest pain point when debugging agents in production?

Genuine feedback only, happy to be told this already exists or nobody wants it!

u/radiyap — 1 day ago
▲ 7 r/LangChain+5 crossposts

What are your biggest pains running AI SDK apps in production?

I'm trying to understand what teams building with AI SDKs struggle with the most once their app is in production.

So far I've heard a few things come up. Some people don't know which model to pick for each task and don't have a week to benchmark everything. Others mentioned costs creeping up but struggling to switch to cheaper models without breaking quality on edge cases.

I'd love to hear what's on your list. If you have 30 seconds, please drop your top 1 or 2 pains in the comments with a bit of context.

u/stosssik — 1 day ago
▲ 3 r/LangChain+1 crossposts

I built a self-hosted authorization runtime for AI agents and MCP tools

I’ve been experimenting with running local agents that have:

- shell access

- filesystem access

- MCP tools

- database connectivity

One thing that started bothering me was that most current agent stacks rely heavily on prompts for operational safety.

So I built CapFence, an OSS deterministic policy runtime that sits between agents and downstream systems.

It evaluates tool calls before execution using local capability policies.

Examples:

- block destructive shell commands

- restrict filesystem access outside a workspace

- require approval for sensitive operations

- replay historical execution traces against updated policies
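To show the general idea of evaluating a tool call before it executes, here is a minimal sketch in plain Python. It is not CapFence's API or policy format: the tool names, the deny list, and the `evaluate` function are all hypothetical, and checking only the first token of a shell command is easy to bypass (for example through `sh -c`), so treat it as an illustration of the control point and nothing more.

```python
import shlex
from pathlib import Path

BLOCKED_COMMANDS = {"rm", "mkfs", "dd", "shutdown"}  # hypothetical deny list
APPROVAL_REQUIRED = {"db_write"}                     # hypothetical sensitive tools


def inside_workspace(path, workspace):
    # Resolve ".." and absolute paths before comparing against the workspace root
    root = Path(workspace).resolve()
    target = (root / path).resolve()
    return target == root or root in target.parents


def evaluate(call, workspace):
    """Return "allow", "deny", or "needs_approval" for one tool call."""
    if call["tool"] == "shell":
        argv = shlex.split(call["command"])
        if argv and argv[0] in BLOCKED_COMMANDS:
            return "deny"
    elif call["tool"] in {"read_file", "write_file"}:
        if not inside_workspace(call["path"], workspace):
            return "deny"
    if call["tool"] in APPROVAL_REQUIRED:
        return "needs_approval"
    return "allow"


workspace = "/tmp/agent-workspace"
print(evaluate({"tool": "shell", "command": "rm -rf /"}, workspace))          # deny
print(evaluate({"tool": "read_file", "path": "../secrets.txt"}, workspace))   # deny
print(evaluate({"tool": "read_file", "path": "notes/a.md"}, workspace))       # allow
print(evaluate({"tool": "db_write", "query": "UPDATE ..."}, workspace))       # needs_approval
```

Because the decision is a pure function of the call and the policy, the same function could be rerun over recorded calls to see how a policy change would have affected past runs.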

The MCP gateway support ended up being especially useful because it can proxy/intercept stdio tool calls transparently.

Still early, but I’d appreciate feedback from people self-hosting long-running agents or MCP setups.

Repo:

https://github.com/capfencelabs/capfence

Easiest way to give your local agents reliable web scraping capabilities (Markdown MCP)

Building local agents is fun until you need them to actually interact with the live web. Most simple `requests.get()` calls fail on modern SPAs, and running a full Playwright stack alongside your LLM eats up RAM and dev time.

I wanted a plug-and-play solution, so I built a Markdown Scraper MCP Server.

Why I made this:

  1. Clean Markdown: LLMs need clean text, not messy HTML. This converts any URL directly to AI-ready Markdown.
  2. Bypasses JS/Captchas: Handles the annoying rendering stuff behind the scenes.
  3. Pay-as-you-go: I hated the $50/mo minimums on standard scraping APIs. This runs on a micro-transaction model ($0.005/request). You throw in $5 and it lasts for months of prototyping.

It uses the Model Context Protocol, so you can integrate it instantly with any compatible framework or client (Cursor, Claude Desktop, etc.).

Repo/Setup instructions here: https://github.com/guimaster97/api_scraper_markdown

Let me know what kind of agent workflows you guys are building that need web access!

▲ 21 r/LangChain+1 crossposts

Curated list of AI-powered web scraping tools

Was researching AI scraping tools for a project and noticed the existing awesome lists either cover traditional scraping (Scrapy, BeautifulSoup) or web agents broadly. Couldn't find one focused specifically on LLM-powered scraping, so I put one together.

Covers frameworks (Crawl4AI, Scrapling, ScrapeGraphAI, llm-scraper), hosted APIs (Firecrawl, Jina Reader, Diffbot), browser infrastructure for AI agents, MCP servers, and search APIs built for LLMs.

Open to more: what am I missing?

github.com
▲ 26 r/LangChain+13 crossposts

Ask questions across your Markdown notes using a fully local Graph RAG engine. Built for Obsidian vaults, works with any folder of Markdown files. Extracts entity-relation triples from wikilinks & YAML frontmatter, retrieves answers via hybrid search (vector + BM25 + temporal). Multilingual. No cloud. Runs on Ollama.

https://github.com/benmaster82/Kwipu

u/WritHerAI — 2 days ago

I built an audit and governance layer for AI agents: policy checks, human approval flows, and an immutable audit log.

Disclosure: I'm the developer of this project.

AI agents are taking actions without any oversight – calling APIs, writing data, triggering workflows. Most teams have zero visibility into what's actually happening.

I built ORKA to fix that.

ORKA sits between your agents and the outside world:

→ Every action goes through a policy check before executing

→ High-risk actions pause and wait for human approval

→ Everything is logged in a cryptographically chained audit trail

→ Real-time dashboard with risk scores per agent
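For readers unfamiliar with the term, a "cryptographically chained" log usually means each entry stores a hash that covers the previous entry's hash, so editing any past entry breaks every hash after it. Below is a generic sketch of that idea using SHA-256. It is not ORKA's implementation, and `append_entry`, `verify`, and the entry layout are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(action, prev_hash):
    # Hash the action together with the previous entry's hash
    payload = json.dumps({"action": action, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, action):
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"action": action, "prev": prev_hash,
                "hash": _digest(action, prev_hash)})


def verify(log):
    """Recompute the chain from the start; any edited entry makes this False."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["action"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"agent": "billing-bot", "tool": "refund", "amount": 20})
append_entry(log, {"agent": "billing-bot", "tool": "email", "to": "customer"})
print(verify(log))  # True

log[0]["action"]["amount"] = 2000  # tamper with history
print(verify(log))  # False
```

A chain like this only detects tampering. Someone who can rewrite the whole log can also recompute every hash, so the latest hash would have to be anchored somewhere the writer cannot change.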

It supports MCP, A2A, REST, and custom agent protocols.

Currently in private beta – free to use, no credit card required.

Search "ORKA governance AI agents" on GitHub or "orka.ia.br" to find it.

Would love feedback from anyone building with AI agents. What governance/visibility features are you missing today?

u/MarzipanKlutzy9909 — 1 day ago