
Arkon: turning Claude from a personal chatbot into a managed organizational resource
I've been building Arkon, a Knowledge OS designed to bridge the gap between enterprise knowledge and AI Agents. It's been a journey of learning, iterating, and sometimes completely rethinking my approach. Today, I want to share some insights into how Arkon works, especially our MRP (Map-Reduce-Plan-Refine-Verify) Pipeline and the underlying philosophy of an AI Agent's architecture.
The Problem: AI Agents Need More Than Just a "Brain"
Many companies adopting LLMs face a common challenge: employees use tools like ChatGPT or Claude individually, often pasting confidential documents into public chatbots. This leads to zero organizational control and fragmented knowledge.
If we think of an AI as a "Brain" (the LLM's weights, trained on vast datasets), it's incredibly powerful for general knowledge. But for specific, up-to-date, or internal company information, it lacks a reliable "Knowledge Notebook". Traditional RAG often falls short, providing fragmented chunks without proper context or verifiability.
Arkon's Approach: The AI Agent's "Knowledge Notebook"
Arkon positions itself as the central Knowledge OS for enterprises. It connects to AI Clients (like Claude Desktop) via MCP (Model Context Protocol). This means employees continue using their preferred AI tools, but now, they automatically get the right, secure context based on their role and project.
We manage knowledge in two realms:
- Global Knowledge: Organization-wide documents and wikis, with access scoped by department. A finance person sees finance docs, an engineer sees engineering docs.
- Workspaces: Smaller, membership-gated scopes for projects or cross-functional teams. Your global role doesn't grant access here; only explicit membership does.
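To make the two-realm model concrete, here is a minimal sketch of the access rule described above. All names (`User`, `Document`, `can_read`, the `realm` field) are illustrative, not Arkon's actual API; the point is that global docs check department while workspace docs check only explicit membership.

```python
# Hypothetical sketch of Arkon's two-realm access model.
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    department: str
    workspace_memberships: set = field(default_factory=set)

@dataclass
class Document:
    title: str
    realm: str                 # "global" or "workspace"
    department: str = ""       # used when realm == "global"
    workspace_id: str = ""     # used when realm == "workspace"

def can_read(user: User, doc: Document) -> bool:
    if doc.realm == "global":
        # Global knowledge is scoped by department.
        return user.department == doc.department
    # Workspace docs ignore the global role entirely:
    # explicit membership is the only thing that grants access.
    return doc.workspace_id in user.workspace_memberships

alice = User("alice", department="finance")
q3 = Document("Q3 forecast", realm="global", department="finance")
launch = Document("Launch plan", realm="workspace", workspace_id="proj-x")

print(can_read(alice, q3))      # True: department matches
print(can_read(alice, launch))  # False: no explicit membership
```

The key design choice is that the workspace branch never consults `user.department`, so a global role can't leak into a membership-gated scope.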
The MRP Pipeline: How We Build a Verifiable Wiki-style RAG
This is where Arkon truly differentiates itself. Instead of simple chunking and retrieval, we've developed a robust MRP (Map-Reduce-Plan-Refine-Verify) Pipeline to transform raw documents into a structured, verifiable, wiki-style knowledge base. This pipeline was born out of the painful lessons learned from an earlier, overly "agentic" approach that failed in real-world scenarios.
Here's a quick breakdown:
- Map: Documents are broken down into smaller chunks. LLMs extract entities, concepts, and claims, crucially including absolute_offset (byte position in the original document) for later citation. Results are saved immediately for resilience.
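The offset bookkeeping in the Map step can be sketched like this (assumed details: fixed-size byte chunks; the real chunker would be structure-aware). What matters is that each chunk carries `absolute_offset`, the byte position in the original document, so extracted claims can later be cited back to their exact source location.

```python
# Simplified Map step: chunk a document while recording byte offsets.
def map_chunks(text: str, chunk_size: int = 200):
    data = text.encode("utf-8")
    chunks = []
    for start in range(0, len(data), chunk_size):
        piece = data[start:start + chunk_size]
        chunks.append({
            "absolute_offset": start,  # byte position in the source doc
            # errors="ignore" papers over a chunk boundary splitting a
            # multi-byte character; a real implementation would chunk
            # on character or structural boundaries instead.
            "text": piece.decode("utf-8", errors="ignore"),
        })
    return chunks

doc = "Arkon is a Knowledge OS. " * 20
for c in map_chunks(doc)[:2]:
    print(c["absolute_offset"], c["text"][:30])
```

In the real pipeline, entity/claim extraction would run per chunk and inherit the chunk's offset, and each chunk's results would be persisted immediately.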
- Reduce: Extracted data is de-duplicated (using exact match, embedding similarity, and LLM reconciliation) and consolidated. This prevents the AI from creating multiple entries for the same entity.
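The Reduce step's dedup cascade might look like the sketch below (simplified: the LLM reconciliation tier is stubbed out as a comment, and the embeddings are hand-written stand-ins). Entities first match on normalized name, then fall back to embedding cosine similarity.

```python
# Simplified Reduce step: two-tier entity deduplication.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def reduce_entities(entities, sim_threshold=0.9):
    """entities: list of {"name": str, "embedding": list[float]}."""
    merged = []
    for ent in entities:
        match = None
        for kept in merged:
            # Tier 1: exact match on normalized name.
            if kept["name"].lower() == ent["name"].lower():
                match = kept
                break
            # Tier 2: embedding similarity.
            if cosine(kept["embedding"], ent["embedding"]) >= sim_threshold:
                match = kept
                break
        if match is None:
            merged.append(ent)
        # else: a third, LLM-reconciliation tier would merge attributes
        # of ent into match here.
    return merged

ents = [
    {"name": "Acme Corp", "embedding": [1.0, 0.0]},
    {"name": "acme corp", "embedding": [0.9, 0.1]},
    {"name": "Globex",    "embedding": [0.0, 1.0]},
]
print([e["name"] for e in reduce_entities(ents)])  # ['Acme Corp', 'Globex']
```

Ordering the tiers cheapest-first matters: exact matching is nearly free, embeddings are cheap, and the LLM call runs only on the ambiguous remainder.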
- Plan (Human-in-the-loop): The AI proposes a plan for new or updated wiki pages. This is a critical human intervention point where editors can review, modify, or approve the plan before any changes are committed. An effective AI pipeline isn't one that removes humans, but one that asks humans the right questions at the right time.
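One way to picture the Plan gate (a hypothetical shape, not Arkon's real schema): the pipeline emits proposed page actions, a human reviews them (possibly editing the proposal), and only approved plans ever reach the execution step.

```python
# Hypothetical Plan gate: proposals block until a human decides.
from dataclasses import dataclass

@dataclass
class PagePlan:
    action: str          # "create" or "update"
    title: str
    source_ids: list
    status: str = "pending"   # pending -> approved / rejected

def review(plan: PagePlan, decision: str, edited_title=None):
    # The editor may modify the proposal before approving it.
    if edited_title:
        plan.title = edited_title
    plan.status = decision
    return plan

def execute(plans):
    # Deterministic orchestration: only approved plans are committed.
    return [p for p in plans if p.status == "approved"]

plans = [
    PagePlan("create", "Onboardng Guide", ["doc-1"]),
    PagePlan("update", "Security Policy", ["doc-2"]),
]
review(plans[0], "approved", edited_title="Onboarding Guide")  # editor fixes the typo
review(plans[1], "rejected")
print([p.title for p in execute(plans)])  # ['Onboarding Guide']
```

Note that `execute` is plain deterministic code with no LLM in it: the model proposes, the human disposes, and the commit path is boring on purpose.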
- Refine: Dedicated AI writers generate high-quality wiki pages based on pre-assembled evidence. Each claim includes a [^N] footnote linking directly back to its source excerpt.
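The footnote mechanics of the Refine step can be sketched as follows (field names like `evidence_id` and `source` are assumptions): each claim carries the id of the evidence excerpt it came from, and the writer emits the claim with a `[^N]` marker plus a footnote section pointing back to the source and its byte offset.

```python
# Sketch of Refine's citation assembly: claims -> page with [^N] footnotes.
def render_page(title, claims, evidence):
    """claims: [{"text": ..., "evidence_id": ...}];
    evidence: {id: {"source": ..., "absolute_offset": ...}}."""
    body, footnotes = [], []
    for n, claim in enumerate(claims, start=1):
        body.append(f"{claim['text']} [^{n}]")
        ev = evidence[claim["evidence_id"]]
        footnotes.append(
            f"[^{n}]: {ev['source']}, byte offset {ev['absolute_offset']}"
        )
    return f"# {title}\n\n" + "\n".join(body) + "\n\n" + "\n".join(footnotes)

evidence = {"e1": {"source": "handbook.pdf", "absolute_offset": 1042}}
page = render_page(
    "PTO Policy",
    [{"text": "Employees accrue 1.5 days of PTO per month.", "evidence_id": "e1"}],
    evidence,
)
print(page)
```

Because the offset recorded during Map travels all the way through to the footnote, a reader (or a verifier) can jump from any claim straight to the exact span of the original document.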
- Verify: Post-refinement, checks are run for citation accuracy, coverage, and potential conflicts with existing knowledge. These are non-blocking, providing logs for human review.
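A minimal example of one such non-blocking check, citation integrity: confirm every `[^N]` marker has a matching footnote definition and vice versa, and return findings as a log for human review rather than raising. The regexes and message strings are my own sketch.

```python
# Sketch of a non-blocking Verify check for citation integrity.
import re

def verify_citations(page_text):
    # Markers like "[^2]" but not footnote definitions like "[^2]:".
    markers = set(re.findall(r"\[\^(\d+)\](?!:)", page_text))
    defs = set(re.findall(r"\[\^(\d+)\]:", page_text))
    issues = []
    for n in sorted(markers - defs, key=int):
        issues.append(f"marker [^{n}] has no footnote definition")
    for n in sorted(defs - markers, key=int):
        issues.append(f"footnote [^{n}] is never cited")
    return issues  # logged for humans; never blocks the pipeline

good = "Claim A. [^1]\n\n[^1]: doc.pdf, offset 42"
bad = "Claim A. [^1] Claim B. [^2]\n\n[^1]: doc.pdf, offset 42"
print(verify_citations(good))  # []
print(verify_citations(bad))   # ['marker [^2] has no footnote definition']
```

A fuller verifier would also re-read each cited offset and check that the excerpt still supports the claim, plus diff new pages against existing ones for conflicts.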
This multi-stage, resumable pipeline ensures that the knowledge base is not only comprehensive but also accurate, verifiable, and resilient to failures: a crucial difference between a demo and a production-ready system.
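The resumability claim boils down to phase-level checkpointing. A sketch, assuming a simple file-per-phase layout (Arkon's actual state store will differ): each phase persists its output before the next phase starts, so a crash resumes from the last completed phase instead of restarting from scratch.

```python
# Sketch of phase-level checkpointing for a resumable pipeline.
import json
import os
import tempfile

def run_pipeline(doc_id, phases, state_dir):
    result = None
    for name, fn in phases:
        path = os.path.join(state_dir, f"{doc_id}.{name}.json")
        if os.path.exists(path):
            # Phase already completed on a previous run: reload, don't redo.
            with open(path) as f:
                result = json.load(f)
            continue
        result = fn(result)
        with open(path, "w") as f:
            json.dump(result, f)  # persist before moving to the next phase
    return result

phases = [
    ("map", lambda _: {"chunks": 12}),
    ("reduce", lambda r: {"entities": r["chunks"] // 2}),
]
with tempfile.TemporaryDirectory() as d:
    print(run_pipeline("doc-1", phases, d))  # runs both phases
    print(run_pipeline("doc-1", phases, d))  # same result, both phases skipped
```

This is also why "in-memory agent state" is singled out as the enemy below: once every phase boundary is durable, a multi-hour ingestion job becomes restartable instead of fragile.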
Key Takeaways for Building AI Agents:
- Agent loops are not a silver bullet: They are a tool, not the entire architecture. Deterministic code is essential for orchestration.
- State must be persisted: "In-memory agent state" is the enemy of reliability. Every phase's output needs to be saved.
- Human-in-the-loop is a feature: It builds trust and ensures quality, especially for enterprise use cases.
- Citations are non-negotiable: For B2B knowledge, a claim without a source is useless.
Arkon is my attempt to define how an AI Agent truly operates: a Brain (Model Weights), equipped with a Body (Harness & Tools), and supported by a structured, verifiable Knowledge Notebook (Arkon's Wiki-style RAG).
I invite you to explore the codebase, provide feedback, and join the discussion. Your insights are invaluable as we continue to evolve Arkon.
GitHub Repo: https://github.com/nduckmink/arkon
What are your thoughts on this architecture? Have you faced similar challenges building AI Agents or knowledge systems? Let's discuss!