r/KnowledgeGraph

Ebbinghaus is insufficient, according to April 2026 research

This April 2026 research paper specifically calls out Ebbinghaus as insufficient, and I completely agree.

https://arxiv.org/pdf/2604.11364

So I drafted a proposal specification to address decay rates and promotion layers in an N-ary fashion, declaratively, down to the property level.
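
To make this concrete, a property-level policy could pair a decay curve with a promotion threshold. A minimal sketch in Python; every name and field here is hypothetical, for illustration only, not taken from the actual proposal:

```python
from dataclasses import dataclass

@dataclass
class DecayPolicy:
    # Hypothetical illustration of a declarative, property-level policy:
    # each property gets its own half-life and promotion threshold.
    half_life_days: float  # time for retention to fall to 0.5
    promote_at: float      # score at which a value moves up a layer

    def retention(self, age_days: float) -> float:
        # Exponential forgetting curve: R = 2^(-t / half_life)
        return 2 ** (-age_days / self.half_life_days)

    def should_promote(self, score: float) -> bool:
        return score >= self.promote_at

# Different properties of the same node type decay at different rates.
policies = {
    "person.name":      DecayPolicy(half_life_days=365.0, promote_at=0.2),
    "person.last_seen": DecayPolicy(half_life_days=7.0,   promote_at=0.8),
}
```

With something like this, experimenting with a new decay model means swapping the `retention` implementation while the declarations stay put.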

I am looking for community feedback, because this could allow rapid experimentation with various decay policies and memory-management models.

https://github.com/orneryd/NornicDB/issues/100

reddit.com
u/Dense_Gate_5193 — 2 days ago

How I turned three philosophy books into a 1,200-document knowledge graph

Marcus Aurelius says virtue is acting according to nature and reason, serving the common good as naturally as the eye sees. Machiavelli says a prince who acts entirely virtuously will be ruined among so much evil. Nietzsche warns against becoming enslaved to one's own virtues, noting that every virtue inclines toward stupidity.

Same word. Three completely different meanings across seventeen centuries. I wanted to see how many concepts work like this — where the surface agreement hides a deep disagreement — so I built a knowledge graph connecting Meditations (170 AD), The Prince (1513), and Beyond Good and Evil (1886).

The result: Seventeen Centuries — 838 text fragments, 340+ concept files, and category documents that let you trace how ideas evolved across time. The first article built from the graph is Virtue across seventeen centuries, which follows the concept from Stoic duty through political pragmatism to Nietzsche's genealogical critique.

Why a graph, not a database

I needed a structure where the same concept could belong to multiple contexts simultaneously. Virtue belongs under the Stoic worldview, under Machiavelli's political theory, and under Nietzsche's critique of morality. Folders force single placement. A database would work, but then I'd lose the thing I actually use: being able to open a file, read it, edit it, and link from it.

IWE uses inclusion links — a markdown link on its own line defines a parent-child relationship. A document can have multiple parents. The entire graph is plain markdown files in a flat directory. No database, no special format. I edit them in my text editor, query them from the CLI, and an AI agent can read the same files.
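
The convention is simple enough that the whole graph can be recovered in a few lines. A sketch (mine, not IWE's implementation), assuming the link-on-its-own-line rule above:

```python
import re
from pathlib import Path

# An inclusion link is a markdown link alone on its own line.
INCLUSION = re.compile(r"^\[([^\]]+)\]\(([^)]+\.md)\)\s*$")

def build_graph(directory: str) -> dict[str, list[str]]:
    """Map each markdown file to the children it includes."""
    children: dict[str, list[str]] = {}
    for path in Path(directory).glob("*.md"):
        kids = [m.group(2)
                for line in path.read_text(encoding="utf-8").splitlines()
                if (m := INCLUSION.match(line.strip()))]
        children[path.name] = kids
    return children

def parents_of(graph: dict[str, list[str]], doc: str) -> list[str]:
    # Multiple parents fall out naturally: any file that includes `doc`.
    return [p for p, kids in graph.items() if doc in kids]
```

Inline links in running prose don't match the pattern, so only deliberate inclusion lines define structure.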

The five-stage pipeline

Stage 1 — Fragment extraction. Parsers for Standard Ebooks XHTML split each book into atomic markdown files — one per aphorism, passage, or chapter. Nietzsche yielded 296 fragments, Marcus Aurelius 515, Machiavelli 27.

# 146

He who fights with monsters should be careful lest he thereby
become a monster. And if thou gaze long into an abyss, the abyss
will also gaze into thee.
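
The splitter itself can be quite small. A sketch, assuming the book has already been converted to markdown with one `#` heading per fragment (the real parsers work directly on the Standard Ebooks XHTML):

```python
import re

def split_fragments(markdown: str) -> dict[str, str]:
    """Split a book into atomic fragments, one per top-level heading."""
    fragments: dict[str, str] = {}
    current, buf = None, []
    for line in markdown.splitlines():
        m = re.match(r"^# (.+)$", line)
        if m:
            if current is not None:
                fragments[current] = "\n".join(buf).strip()
            current, buf = m.group(1).strip(), []
        elif current is not None:
            buf.append(line)
    if current is not None:
        fragments[current] = "\n".join(buf).strip()
    return fragments
```

Each returned entry then becomes one markdown file, keyed by its aphorism or chapter number.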

Stage 2 — Entity extraction. An LLM read each fragment and identified 3–7 significant entities: philosophical concepts, historical figures, themes. Each entity got its own file. Fragment text was updated with inline links so the graph forms through the content itself:

...life itself is [Will to Power](will-to-power.md);
[self-preservation](self-preservation.md) is only one...

Stage 3 — Flattening and merging. Each book started in its own directory with its own virtue.md, soul.md, plato.md. This stage moved everything into a single flat directory and merged overlapping concepts. Ten concepts appeared in multiple books — virtue, soul, Plato, Socrates, truth, nature, gods, Epicurus, cruelty, free will. These became the most valuable documents in the graph because they're where the real contrasts live.
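
Mechanically, the merge can be sketched like this (my reconstruction, not the pipeline's actual code): group same-named concept files across book directories, concatenate them with provenance headers, and report which concepts appeared in more than one book:

```python
from collections import defaultdict
from pathlib import Path

def merge_concepts(book_dirs: list[str], out_dir: str) -> list[str]:
    """Flatten per-book directories and merge same-named concept files."""
    by_name: dict[str, list[Path]] = defaultdict(list)
    for d in book_dirs:
        for f in sorted(Path(d).glob("*.md")):
            by_name[f.name].append(f)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    merged = []
    for name, sources in by_name.items():
        # Keep provenance: which book each section came from.
        parts = [f"## From {src.parent.name}\n\n"
                 f"{src.read_text(encoding='utf-8').strip()}"
                 for src in sources]
        (out / name).write_text("\n\n".join(parts) + "\n", encoding="utf-8")
        if len(sources) > 1:
            merged.append(name)  # concept appears in multiple books
    return sorted(merged)
```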

Stage 4 — Categories. With 340+ concept files floating in a flat directory, I needed entry points. Categories like philosophers, virtues, power-dynamics, and moral-systems emerged from the content. Each is a document with inclusion links to its members — and because IWE supports multiple parents, Socrates belongs to both philosophers and ancient-cultures without duplication.

Stage 5 — Summaries. An LLM analyzed the referenced fragments for each merged concept and wrote comparative summaries. This turned simple backlink indexes into the comparative analysis that makes the graph worth reading — and worth writing articles from.

Why this structure pays off

The graph is queryable from the CLI:

iwe retrieve -k virtue --depth 2    # virtue + linked fragments
iwe find --refs-to will-to-power    # everything referencing will-to-power
iwe tree -k bge                     # Beyond Good and Evil as a tree

retrieve --depth 2 pulls a concept, its backlinks to fragments, and the fragment content in one call. That's how the virtue article was written — retrieve the concept, read the fragments side by side, write the analysis. An AI agent uses the same commands and the same files.

The most surprising result was how much structure emerged from just inclusion links. No tags, no folders, no metadata beyond the links themselves. The graph has clear clusters around each book, bridges through shared concepts, and category entry points — all from markdown files linking to each other.

Browse the graph: https://iwe.pub/seventeen-centuries/ GitHub: https://github.com/iwe-org/seventeen-centuries IWE: https://github.com/iwe-org/iwe

u/gimalay — 4 days ago
▲ 24 r/KnowledgeGraph+3 crossposts

I built a self-organizing Long-Term Knowledge Graph (LTKG) that compresses dense clusters into single interface nodes — here’s what it actually looks like

LTKG Viewer - Trinity Engine Raven

I've been working on a cognitive architecture called Trinity Engine — a dynamic Long-Term Knowledge Graph that doesn't just store information, it actively rewires and compresses itself over time.

Instead of growing endlessly in breadth, it uses hierarchical semantic compression: dense clusters of related concepts (like the left side of this image) get collapsed into stable interface nodes, which then tether into cleaner execution chains.

Here's a clear example from the LTKG visualizer:

[Image: LTKG visualizer screenshot]

What you're seeing:

  • Left side = a dense, interconnected pentagram-style cluster (high local connectivity)
  • The glowing interface nodes act as single-point summaries / bottlenecks
  • Right side = a clean linear chain where the compressed knowledge flows into procedural execution

This pattern repeats recursively across abstraction levels. The system maintains a roughly 10:1 compression ratio per level while preserving semantic coherence through these interface nodes.
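
For readers who want the core operation in code: contracting a dense cluster into a single interface node, which inherits only the cluster's external edges, can be sketched like this (a toy over an adjacency map, not Trinity Engine code):

```python
def contract_cluster(graph: dict[str, set[str]],
                     cluster: set[str],
                     iface: str) -> dict[str, set[str]]:
    """Collapse `cluster` into one interface node; outside edges survive."""
    g: dict[str, set[str]] = {}
    for node, nbrs in graph.items():
        if node in cluster:
            continue
        # Redirect any edge into the cluster to the interface node.
        g[node] = {iface if n in cluster else n for n in nbrs}
    # The interface node keeps only the cluster's external connections.
    external: set[str] = set()
    for node in cluster:
        external |= {n for n in graph.get(node, set()) if n not in cluster}
    g[iface] = external
    return g
```

Applied recursively per abstraction level, this is one way to get the "dense cluster feeding a clean chain" shape described above.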

Key behaviors I've observed:

  • The graph gets denser with use, not necessarily bigger
  • "Interface node integrity" has become one of the most important failure modes (if one corrupts, the whole tethered chain can drift)
  • The architecture scales through depth (abstraction layers) rather than raw node count — what I call the "Mandelbrot Ceiling"

I'm currently evolving it further by driving the three core layers (SEND / SYNTH / PRIME) with dedicated agentic bots and adding a closed-loop reinforcement system using real-world prediction tasks + resource constraints.

Would love to hear from the knowledge graph community:

  • Have you seen similar hierarchical compression patterns in your own graphs?
  • Any good techniques for protecting interface node stability at scale?
  • Thoughts on measuring "semantic compression quality" vs traditional graph metrics (density, centrality, etc.)?

Happy to share more details or other visualizations if there's interest.

u/Grouchy_Spray_3564 — 5 days ago

Self-Maintaining Knowledge Graphs. Stupid or the Future of RDM?

Hi,

I am a rookie in the ontology and KG space. After a long time in the AI startup world, I recently started a PhD in AI-assisted RDM (research data management).

I have worked quite a bit on AI-maintained expert systems in the private sector, developed for agentic workflow software, and spent a long and painful time on large-scale AI-driven datarization and surrogates in the WTG industry.

Full disclaimer: I am aware that I am quite wet behind the ears in the KG/ontology field, so some of my ideas might sound fantastic to me but ridiculous to someone who has already tripped over many of the stones in this space.

I am looking for a reality check from some *experienced* people here.

Here goes: I am investigating agentically maintained and updated temporal ecosystem KGs.

What that means (to me) is that whenever we want to describe an ecosystem (e.g. the compound material manufacturing science output of a particular institute with hundreds of researchers), we choose artifacts from that ecosystem that help us derive a model that's informed enough to answer the questions we might have.

So, for example, suppose the ecosystem we aim to model in our KG is meant to answer questions such as: "Who, at what department, has made a software package that is meant for task X? When did they do it? Are they still at the institute? And is the package maintained during this quarter? How was it funded?" (Before you worry about the task X part, we are currently working on taxonomic task ontologies to derive machine-readable scopes and JTBD from process descriptions in papers and docs.)

This could be just one of many questions. (The types of questions and information the KG should cover are driven by strategic institute goals, such as reducing redundancies or discovering abandoned projects and synergies, and are based on needs and knowledge bottlenecks in a specific domain.)

So what we need to describe are ontologies around people, articles, data, software, organizations, grants, etc., and their connecting properties.

My “currently naive” goal is to see how far we can drive AI(LLM)-orchestrated “living” KGs tied to the information systems we have at the institute using the following steps.

  1. Dummy-describe the artifacts of the ecosystem, and the relationships among them, that would be needed to answer sets of questions aligned with the needs of the people who will use it.
  2. Map the outcome to existing ontologies as well as possible, bridging fuzzy connections between ontologies (that's something I already see as an almost philosophical, goliath task).
  3. Once we have a “good enough” ontology, we engineer logical constraints (e.g. SHACL).
  4. Then I will define the information endpoints that will act as information wells to instantiate classes from the ontology (e.g., paper, software, and data repositories inside the institute, with all possible properties).
  5. Inside the KG pipeline, I will then have transformer-orchestrated agents that harvest from these endpoints at defined intervals (or based on webhooks), instantiate classes inside the KG, and decide whether each harvested item is new, an iteration/version jump of an existing instance, redundant, etc.
  6. The goal is to basically have a self-versioning KG that functions on a small, well-defined scope and acts as a continuous time capsule/active status harvester for our domain.
  7. People ontologies are informed by HR software and registries, papers by our in-house pub API, software and data by our on-premise repositories, and so on, but the ontology stays fixed and enforced. Updates to the ontology are a conscious and informed decision.
    (All this is extremely dumbed down, of course; I am aware of the work concerning the ontological description and the nuances of the pipeline. Most of my time is currently devoted to prototyping and researching inside these problem spaces.)
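
Step 5's core decision (new vs. version jump vs. redundant) can be sketched in a few lines; the fields here are hypothetical stand-ins for whatever stable identifiers your endpoints actually expose:

```python
import hashlib

def classify(harvested: dict, known: dict[str, dict]) -> str:
    """Decide what to do with a harvested artifact, keyed on a stable
    identifier (e.g. a DOI or repository URL) plus a content digest."""
    key = harvested["id"]
    digest = hashlib.sha256(harvested["content"].encode("utf-8")).hexdigest()
    if key not in known:
        return "new"
    if known[key]["digest"] != digest:
        return "version-jump"   # same artifact, changed content
    return "redundant"          # nothing changed since last harvest
```

In practice the hard part is choosing the identifier and normalizing content before hashing, not the branching itself.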

The goal of all of this is to alleviate the current pains of increasingly redundant development and research efforts, and to allow faster connection of people with synergetic output, automatic reporting, and natural-language querying of the KG.

I don't want you to solve this for me. I'll do that myself as far as possible. :D I am just here to get some…

"Man, you haven't even scratched the surface of all the problems involved in this”

… comments.

I definitely have the skills to tackle all this. However, a few ontology veterans at conferences, and some younger non-AI researchers in the RDM field, have told me that this is naive thinking; they have occasionally even laughed at the concept when I explained it. The thing is, though, I have seen similar approaches work in small, well-defined scopes, and a working prototype based on only a few classes has given me at least a modest proof of concept.

The biggest problems I see coming towards me currently are:

- Data is very noisy (or, at the opposite extreme, information is simply missing), and the way people currently dump their research output, without docs or metadata, is a nightmare.
- Bad information sources result in garbage graphs.
- There can be multiple sources of truth with different truths, all of which might be incorrect or outdated.
- Some ontologies can be difficult to bridge.
- Definition and distinction tasks can enter the realm of philosophical debate.

I have heard everything from...

"This already exists and is a well-proven concept" and "What is the use of this?" to "This is world-ontology nonsense."

I know this is a massive post, and I don't think I have covered even 1% of my mental workbench, but I would be grateful for some diverse perspectives, ideas about problems I don't see, or pointers to fellow researchers or resources that can inform my research. I am currently in the "don't you see why this is the way" phase, while I often hear, "Don't you see why it's not?"

u/Beneficial_Ebb_1210 — 7 days ago
▲ 2 r/KnowledgeGraph+1 crossposts

Evolving Recipes

I always thought that recipes should be manageable, versionable, translatable and forkable. There was no such tool out there. So first I built an open-source culinary knowledge graph that allows just that (plus plenty of other cool things). And second, I built a reference implementation for recipe management. The platform is intended to be forever free and to empower home cooks, culinary creators and chefs alike to really thrive, share and innovate like never before. If you love cooking, give it a spin; I'm sure you'll love it! Looking forward to forking your recipes!

amanah.food

Oh, and by the way: 100% Lovable built (except for the UMF schema design).

project-amanah.lovable.app

u/orgoca — 4 days ago

Understanding Knowledge Graphs for AI Agents

A knowledge graph is a structured representation of information that connects entities, relationships, and attributes into a unified semantic model. Unlike traditional search systems that rely on isolated documents or keyword matching, a knowledge graph models how data points relate to one another across systems. Understanding them in detail will be very useful for building agents that can reason and act.
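
The difference from keyword matching is easiest to see in miniature. A toy triple store (plain Python, not tied to any particular product) already supports a multi-hop question that a keyword index cannot answer:

```python
# A knowledge graph as subject-predicate-object triples.
triples = [
    ("Ada Lovelace", "wrote_about", "Analytical Engine"),
    ("Analytical Engine", "designed_by", "Charles Babbage"),
    ("Ada Lovelace", "collaborated_with", "Charles Babbage"),
]

def objects(subject: str, predicate: str) -> list[str]:
    return [o for s, p, o in triples if s == subject and p == predicate]

# Multi-hop reasoning: who designed the machine Ada Lovelace wrote about?
designers = [d
             for machine in objects("Ada Lovelace", "wrote_about")
             for d in objects(machine, "designed_by")]
```

An agent that can traverse relations like this can answer questions whose answer is never stated in any single document.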

https://www.searchblox.com/what-is-a-knowledge-graph

u/searchblox_searchai — 6 days ago

How about running an LLM inside the graph?

I just found this sub and I'm really excited to share this, because it changes the entire dynamics of memory for LLMs. UC Louvain benchmarked it apples-to-apples against Neo4j for cyber-physical automata learning, and it ran 2.2× faster for their experimentation cycle. Sub-millisecond writes, HNSW search, and a whole agentic plugin system that performs in-memory GraphRAG with the LLM running inside embedded llama.cpp.

https://github.com/orneryd/NornicDB/blob/main/docs/architecture/README.md

590+ stars, MIT licensed. It's already deployed in production at a Fortune 5 company where I work. I got really lucky to be able to develop this as OSS and share it.

I'm not asking for anything from anyone, other than: if you're interested, try it out, and if you like it, I'm grateful!

https://github.com/orneryd/NornicDB/releases/tag/v1.0.42

u/Dense_Gate_5193 — 5 days ago