u/Wise_Reflection_8340

lazydiff — a terminal-native diff reviewer with semantic diffs, persistent notes, and 60fps rendering
▲ 38 r/rust

When I need to review a lot of AI-generated code, or code in general, I either open a browser tab and lose my context, or pipe it through git diff and scroll through a wall of red and green that forgets everything the moment I close it. There's no way to leave notes, no way to jump between files, and no way to come back later and pick up where I left off.

So I built lazydiff. The core idea was simple: a diff reviewer that lives in the terminal, remembers state, and actually understands code structure.

The first decision was rendering. I went with ratatui and virtualized scrolling — only the visible rows get drawn each frame. This matters because agent-generated diffs can be massive. The benchmark fixture I test against is an 11k-line Node.js PR diff, and it renders at 60fps with sub-2ms frame times. I didn't want to build something that felt sluggish on real-world diffs.
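To make the viewport idea concrete, here is a minimal std-only sketch of virtualized scrolling. It is not lazydiff's actual code and it does not use ratatui; the names visible_rows and render_frame are hypothetical. It assumes the diff is already a flat list of rows and that the cost of a frame should depend only on the viewport height, not on the total row count.

```rust
use std::ops::Range;

/// Half-open range of row indices that fall inside the viewport.
/// `scroll_top` is the index of the first visible row (hypothetical name).
fn visible_rows(total_rows: usize, scroll_top: usize, viewport_height: usize) -> Range<usize> {
    // Clamp both ends so scrolling past the end never panics.
    let start = scroll_top.min(total_rows);
    let end = start.saturating_add(viewport_height).min(total_rows);
    start..end
}

/// Collect only the rows that need drawing this frame.
/// Work done here is proportional to the viewport, not to `rows.len()`.
fn render_frame<'a>(rows: &'a [String], scroll_top: usize, viewport_height: usize) -> Vec<&'a str> {
    rows[visible_rows(rows.len(), scroll_top, viewport_height)]
        .iter()
        .map(String::as_str)
        .collect()
}
```

With an 11k-row diff and a 40-row terminal, each frame touches 40 rows regardless of where you are in the file.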

For syntax highlighting I use tree-sitter (a thing I have been loving for a long time now). The tricky part with diffs is that deleted code needs to be highlighted in its original language context, not just painted red, so lazydiff reconstructs both sides of the file independently and maps highlights back through the diff. Inline diffs tokenize each changed line pair and run LCS (longest common subsequence) to show exactly which words changed, so you immediately see the meaningful difference without scanning the whole line.
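The inline diff step can be sketched as follows. This is an illustration under assumptions, not lazydiff's implementation: tokens are split on whitespace only (a real tokenizer would be language-aware), and the Piece enum and function names are made up for the example. The technique itself is the standard LCS table followed by a walk that emits kept, removed, and added tokens.

```rust
/// One piece of an inline diff between an old and a new line.
#[derive(Debug, PartialEq, Clone)]
enum Piece {
    Same(String),
    Removed(String),
    Added(String),
}

/// Whitespace tokenizer; a stand-in for a language-aware one.
fn tokenize(line: &str) -> Vec<&str> {
    line.split_whitespace().collect()
}

/// Diff two lines token by token using a longest-common-subsequence table.
fn inline_diff(old: &str, new: &str) -> Vec<Piece> {
    let a = tokenize(old);
    let b = tokenize(new);

    // lcs[i][j] = length of the LCS of a[i..] and b[j..].
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk the table from the start, emitting one piece per token.
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            out.push(Piece::Same(a[i].to_string()));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            out.push(Piece::Removed(a[i].to_string()));
            i += 1;
        } else {
            out.push(Piece::Added(b[j].to_string()));
            j += 1;
        }
    }
    // Whatever is left on either side is a pure removal or addition.
    out.extend(a[i..].iter().map(|t| Piece::Removed(t.to_string())));
    out.extend(b[j..].iter().map(|t| Piece::Added(t.to_string())));
    out
}
```

For the pair "let total = price * qty;" and "let total = price * count;", only the last token on each side is flagged, which is the effect described above.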

The part I'm most excited about is semantic diffs. lazydiff uses https://github.com/Ataraxy-Labs/sem, which I open-sourced separately and which got a lot of love from the Rust community. Instead of just showing line-by-line diffs, it parses changes into semantically meaningful entity graphs: functions added, methods modified, classes moved. You see the structure of your changes and how they connect to each other. This is the same engine behind https://github.com/Ataraxy-Labs/weave, the semantic merge driver I built.

The agent workflow is what motivated the whole project. You can leave threaded comments (questions, instructions, notes) anchored to exact lines and get through a review quickly, which is what the community wanted most. Agents can read those comments via lazydiff agent list and reply via the CLI. The whole review session persists to SQLite locally, so you can close the terminal, come back the next day, and everything is exactly where you left it.

License: MIT

Open Source Repo: https://github.com/Ataraxy-Labs/lazydiff

u/Wise_Reflection_8340 — 9 hours ago

lazydiff — a terminal-native diff reviewer with semantic diffs, persistent notes, and 60fps rendering

Most code review tools are either a browser tab that pulls you out of your terminal or a pager that dumps colored text and forgets everything when you close it. I wanted something that stays in the terminal, remembers where I was, and actually understands what changed.

lazydiff is a keyboard-driven diff reviewer built in Rust with ratatui.

Some highlights:
- Renders 10k+ line diffs at 60fps with sub-2ms frame times — virtualized scrolling, only viewport rows hit the buffer
- Tree-sitter syntax highlighting that reconstructs both sides of the diff independently so deleted code highlights correctly in its original language
- Inline word-level diffs using LCS on tokenized line pairs — highlights the exact tokens that changed, not the whole line
- Split and unified view, fuzzy file navigation powered by nucleo, vim keybindings
- Semantic diffs powered by https://github.com/Ataraxy-Labs/sem — parses changes into entity graphs of functions, classes, and methods instead of just showing lines
- Threaded comments anchored to exact lines — leave notes and instructions for your coding agents, they read and reply via CLI
- Everything persists to SQLite locally, close the terminal, come back tomorrow, pick up where you left off

Built for the workflow where you're already in the terminal working with coding agents and don't want to context-switch to a browser to review what they wrote.

Dual-licensed MIT/Apache-2.0.

Open Source Repo: https://github.com/Ataraxy-Labs/lazydiff

u/Wise_Reflection_8340 — 10 hours ago
▲ 8 r/commandline+1 crossposts

lazydiff — a terminal-native diff reviewer with semantic diffs, persistent notes, and 60fps rendering


u/Wise_Reflection_8340 — 9 hours ago
▲ 195 r/rust

Weave: structural merging, and what I learned shifting from git's line-based merge to tree-sitter entity matching

I've been working on a git merge driver that operates on semantic entities (functions, classes, methods) instead of lines. Wanted to share some things I learned along the way since structural merging is an underexplored area.

Why line-based merging has some fundamental issues
When two branches add separate functions to the same region of a file, git sees overlapping line ranges and declares a conflict, even though the changes are completely independent. I call these false conflicts. They've always been an issue, but they become a real bottleneck when multiple agents or developers are editing the same files concurrently. To be clear, I'm in no position to argue that git isn't good enough; it's one of the greatest pieces of software.

The core idea: merge at the entity level
Parse all three versions (base, ours, theirs) with tree-sitter into entities. Match entities across versions by identity (name + type + scope). If different entities were touched, auto-merge. If the same entity was modified on both sides, attempt intra-entity resolution, and only then flag a real conflict.
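The decision rule in that paragraph can be sketched as a three-way merge over entity maps. This is a simplified illustration, not weave's code: it assumes parsing has already produced a map from identity to body text for each version, it ignores interstitial content and file reconstruction, and all type and function names are hypothetical. A Conflict here only means "both sides changed this entity differently", which is where intra-entity resolution would be attempted.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Identity of an entity: (type, scope, name), e.g. ("fn", "impl Cart", "total").
type EntityId = (String, String, String);
/// One parsed version of a file: entity identity -> source text of the entity.
type Entities = BTreeMap<EntityId, String>;

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Entity is present in the merged result with this body.
    Keep(String),
    /// Entity is absent from the merged result.
    Delete,
    /// Both sides changed it differently; needs intra-entity resolution.
    Conflict,
}

fn merge_entities(base: &Entities, ours: &Entities, theirs: &Entities) -> BTreeMap<EntityId, Outcome> {
    // Every identity seen in any of the three versions.
    let ids: BTreeSet<&EntityId> = base.keys().chain(ours.keys()).chain(theirs.keys()).collect();
    let mut merged = BTreeMap::new();
    for id in ids {
        let (b, o, t) = (base.get(id), ours.get(id), theirs.get(id));
        let picked = if o == t {
            o // both sides agree, including both deleting it
        } else if o == b {
            t // only theirs touched it
        } else if t == b {
            o // only ours touched it
        } else {
            merged.insert(id.clone(), Outcome::Conflict);
            continue;
        };
        let outcome = match picked {
            Some(body) => Outcome::Keep(body.clone()),
            None => Outcome::Delete,
        };
        merged.insert(id.clone(), outcome);
    }
    merged
}
```

Two branches that each add a different function next to each other produce two Keep outcomes and no conflict, which is exactly the false-conflict case that line-based merging trips on.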

The interesting part is separating interstitial content (imports, whitespace, comments between functions) from the entities themselves. Getting reconstruction right so the merged file doesn't look mangled took more iteration than the actual merge logic.

Things that surprised me
- Identity matching is harder than it sounds. Name + type + scope works for ~95% of cases. But anonymous closures, multiple trait impl blocks for the same type, and macro-generated items all make identity ambiguous. I ended up using content hashing as a tiebreaker when structural identity is insufficient.
- Tree-sitter is good enough. I considered language-specific parsers (syn for Rust, swc for TypeScript) but tree-sitter's error recovery and uniform AST across 28 languages made it the practical choice. It doesn't need valid code to produce a usable parse tree, which matters because merge inputs are often mid-refactor.
- Fallback is non-negotiable. For unsupported file types, files >1MB, or anything binary, I fall back to git's default merge. Users need to trust that installing a custom merge driver won't make things worse. This was a hard design constraint from day one, and I am still trying to improve it.
- File reconstruction is the real problem. Merging entities is conceptually clean. Putting the file back together, preserving import ordering, blank line conventions, trailing newlines, comment placement, is where all the edge cases live. I spent more time on better reconstruction than on the merge algorithm itself.
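The identity-matching point can be illustrated with a small sketch: match on (type, scope, name) first, and only consult a content hash when several candidates share that identity, as happens with anonymous closures. This is an assumption-laden illustration rather than weave's matcher. The Entity struct, content_hash, and find_match are hypothetical names, and hashing whitespace-separated tokens is a simplification chosen so that whitespace-only edits do not change the hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, PartialEq)]
struct Entity {
    kind: String,
    scope: String,
    name: String, // empty for anonymous items such as closures
    body: String,
}

/// Hash the body token by token so whitespace-only changes hash the same.
fn content_hash(body: &str) -> u64 {
    let mut h = DefaultHasher::new();
    for tok in body.split_whitespace() {
        tok.hash(&mut h);
    }
    h.finish()
}

/// Index of the candidate that corresponds to `target`, if it is unambiguous.
fn find_match(target: &Entity, candidates: &[Entity]) -> Option<usize> {
    // Step 1: structural identity (type + scope + name).
    let same_identity: Vec<usize> = candidates
        .iter()
        .enumerate()
        .filter(|(_, c)| c.kind == target.kind && c.scope == target.scope && c.name == target.name)
        .map(|(i, _)| i)
        .collect();

    match same_identity.as_slice() {
        [] => None,
        [only] => Some(*only),
        many => {
            // Step 2: identity is ambiguous, so use the content hash as a tiebreaker.
            let want = content_hash(&target.body);
            let mut hits = many
                .iter()
                .copied()
                .filter(|&i| content_hash(&candidates[i].body) == want);
            match (hits.next(), hits.next()) {
                (Some(i), None) => Some(i), // exactly one hash match
                _ => None,                  // still ambiguous, or no match
            }
        }
    }
}
```

A uniquely named function matches even when its body changed, while two closures in the same scope are told apart by content. When the tiebreaker fails too, returning None lets the caller fall back to a safer strategy.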

How Mergiraf approaches the same problem differently
Mergiraf is the closest prior art here and it's worth understanding because the two tools make fundamentally different architectural bets. Mergiraf works at the AST node level. It parses all three versions with tree-sitter, then runs the GumTree algorithm (https://mergiraf.org/architecture.html), a two-phase matcher that first finds isomorphic subtrees top-down, then infers more matches bottom-up by looking at ancestors of already matched nodes. From there it encodes the trees as PCS (Parent-Child-Successor) triples and merges the triple sets, resolving inconsistencies node by node.

Weave, on the other hand, works at the entity level. It doesn't try to match every AST node; it extracts coarse-grained units (functions, classes, methods) and matches them by identity. The merge operates on these larger chunks rather than individual tree nodes.

In practice what this means:
- Mergiraf is more fine-grained. It can theoretically resolve conflicts within a single expression because it tracks individual AST nodes. The tradeoff is that GumTree matching is computationally expensive, which is why Mergiraf runs a line-based merge first and only invokes the structured algorithm when conflicts exist.
- Weave is coarser but faster. Matching by entity identity (name + type + scope) is cheaper than computing tree edit distances. The tradeoff is that if two branches modify the interior of the same function differently, weave can't resolve it structurally; it falls back to line-level merging for that entity.
- Reconstruction differs significantly. Mergiraf reconstructs from merged AST node triples, which preserves fine-grained structure but has to solve whitespace recovery (whitespace isn't in the AST). Weave reconstructs from entity blocks and interstitial regions, which naturally preserves formatting but is less precise at the sub-entity level.

The whole thing is built in Rust on top of https://github.com/Ataraxy-Labs/sem for tree-sitter entity extraction. sem got a lot of love from the Rust community, which is why I wanted to share this work here as well.

Again, dual-licensed Apache-2.0 / MIT, like sem.
Repo: https://github.com/Ataraxy-Labs/weave

u/Wise_Reflection_8340 — 3 days ago

semantic diff tool that shows what changed at a higher level instead of raw lines

I got tired of opening pull requests and scrolling through n-line diffs trying to figure out what actually changed. Regular git diff shows you line-by-line changes, but what I usually want to know first is: which functions were modified? Was anything added or deleted? What's the shape of this change?

So I built a CLI tool called sem. It parses your code using tree-sitter (supports 26 languages), extracts entities like functions, classes, and bindings, and diffs at that level. Instead of "lines x-y changed" you get "function processPayment was modified."

Turns out this also helps a lot when feeding diffs to AI coding agents. We measured 2.3x better accuracy on code comprehension tasks compared to raw diffs, because the model can focus on structured changes instead of parsing noise.

It's a single Rust binary, works on top of git, open source (MIT/Apache-2.0):
https://github.com/Ataraxy-Labs/sem

Curious what other people's workflows look like for reviewing large diffs: do you just read them top to bottom, or do you have some other way of getting the big picture first? And for anyone using AI tools for code review, what kind of context are you feeding them?

u/Wise_Reflection_8340 — 11 days ago
▲ 37 r/software+1 crossposts

sem — semantic diff engine that understands code structure

Instead of line-level diffs, sem extracts functions, classes, bindings, and other entities from your code using tree-sitter, then diffs at the entity level. So instead of "lines x-y changed," you get "function processPayment was modified" or "binding buildInputs was added."

For humans, it gives you an immediate high-level overview of what actually changed — you can glance at the output and know which functions were added, which classes were modified, and what got deleted, without scrolling through hundreds of lines of raw diff. Great for code review when you want to understand the shape of a change before diving into the details.

For LLMs, the gains are even more measurable. We ran attention analysis on models (GLM-4 and Qwen) and benchmarked agent accuracy with Claude Sonnet:

- 2.3x agent accuracy on code change comprehension tasks
- Attention entropy drops significantly — models concentrate on the actual changes instead of scattering across noise in raw diffs
- token reduction — entity-level context packs more signal into fewer tokens

Raw diffs are optimized for human line-by-line reading. LLMs don't read that way — they attend over the full context window, so structured entity-level input lets them focus attention where it matters.

Other details:
- 26 languages (just added Nix this week)
- Works on top of git
- Also ships as an MCP server so coding agents can consume structured diffs directly
- Plain Rust binary, no runtime dependencies
- brew install sem-cli

https://github.com/Ataraxy-Labs/sem

u/Wise_Reflection_8340 — 11 days ago
▲ 2 r/PostgreSQL+1 crossposts

I've been working on a tool that uses tree-sitter grammars to extract structural entities (functions, classes, methods) from source code, then builds a cross-file dependency graph by resolving references between them.

The core problem: traditional diff tools compare lines, but the meaningful unit of change in code is an entity. When you rename a function, move a method, or reformat a file, line-level diff produces noise. Entity-level diff tells you "this function was modified, this one was added, this one moved."

The interesting technical bits:
- Each language gets a config that maps AST node types to entity types (e.g. function_definition in Python, function_item in Rust, method_declaration in Java). Currently supports 25+ languages through tree-sitter.
- Scope resolution walks the AST to resolve which entity references which other entity, handling class scopes, impl blocks, function parameters, and assignment-based type tracking. This produces a directed dependency graph across files.
- Diffing works by matching entities between two versions by name + type, then comparing their structural hashes (hash of the normalized AST subtree, ignoring whitespace and comments). Moved or renamed entities get detected through content similarity.
- The dependency graph enables transitive impact analysis: "if this function changes, what's the full set of downstream entities that depend on it?"
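The diffing bullet above can be sketched as a comparison of two snapshots keyed by (type, name). This is a simplified illustration, not sem's implementation. It assumes the structural hashes were already computed elsewhere from the normalized AST, and it stands in for "content similarity" with exact hash equality, which only catches renames where the normalized body is unchanged. All names in the block are hypothetical.

```rust
use std::collections::BTreeMap;

/// (entity type, entity name), e.g. ("function", "process_payment").
type EntityKey = (String, String);
/// Entity key -> structural hash of its normalized AST subtree (computed elsewhere).
type Snapshot = BTreeMap<EntityKey, u64>;

#[derive(Debug, PartialEq)]
enum Change {
    Added,
    Deleted,
    Modified,
    /// Same structural hash under a new name; `from` is the old name.
    Renamed { from: String },
}

fn diff_entities(before: &Snapshot, after: &Snapshot) -> BTreeMap<EntityKey, Change> {
    let mut changes = BTreeMap::new();
    let mut deleted: Vec<&EntityKey> = Vec::new();

    // Pass 1: entities matched by key are either unchanged or modified.
    for (key, old_hash) in before {
        match after.get(key) {
            Some(new_hash) if new_hash != old_hash => {
                changes.insert(key.clone(), Change::Modified);
            }
            Some(_) => {} // identical structure: formatting-only edits land here
            None => deleted.push(key),
        }
    }

    // Pass 2: new keys are renames if a deleted entity of the same type has the same hash.
    for (key, new_hash) in after {
        if before.contains_key(key) {
            continue;
        }
        let rename_source = deleted
            .iter()
            .position(|old| old.0 == key.0 && before[*old] == *new_hash);
        let change = match rename_source {
            Some(pos) => Change::Renamed { from: deleted.remove(pos).1.clone() },
            None => Change::Added,
        };
        changes.insert(key.clone(), change);
    }

    // Anything still unmatched was really deleted.
    for key in deleted {
        changes.insert(key.clone(), Change::Deleted);
    }
    changes
}
```

Because unchanged hashes produce no entry at all, a reformat-only commit yields an empty result, which is the noise reduction the post describes.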

One challenge: tree-sitter grammars are syntactic, not semantic. You don't get type information, so resolving x.foo() to the right method requires heuristics (parameter type annotations, assignment tracking, class scope inference). It gets you maybe 90% accuracy without a full type checker, which turns out to be enough for diffing and impact analysis.

If someone wants to try it, the tool is called sem, written in Rust: https://github.com/ataraxy-labs/sem

Curious if anyone here has worked on similar entity extraction from ASTs, or has thoughts on better approaches to cross-language reference resolution without full semantic analysis.

u/Wise_Reflection_8340 — 19 days ago
▲ 65 r/emacs+2 crossposts


u/Wise_Reflection_8340 — 19 days ago

git's merge algorithm works on lines. It doesn't know what a function is. So when two branches modify different parts of the same function, or one branch moves a function while another edits it, you get a conflict that isn't really a conflict.

I built weave to fix this. It uses tree-sitter to parse code into entities (functions, classes, methods), then merges at that level instead of lines.

On a benchmark of 31 real-world merge scenarios across Rust, Python, TypeScript, Go, Java, and C: weave resolved 100% cleanly. git resolved 48%. weave handled 16 merges that git couldn't.

I am still working on this research, and parts of it are definitely still under development, but I would love any feedback from the community.

How it works behind the scenes:

- Parses both sides + base into entities using tree-sitter
- Matches entities by name across versions
- If two branches changed different entities in the same file, clean merge
- If both changed the same entity, falls back to line-level within that entity only
- Safety net: compares the merged output against both sides to verify no lines were lost

Install:

brew install ataraxy-labs/weave

Works with rebase too since git calls the merge driver for each replayed commit.

Written in Rust.

Open Source on GitHub: https://github.com/ataraxy-labs/weave

The website explains the approach in more detail: https://ataraxy-labs.github.io/weave/

u/Wise_Reflection_8340 — 22 days ago

I kept running into the same problem during code review: a function gets moved, renamed, or reformatted, and git diff shows 200 lines of red and green that mean nothing.

So I built sem. It uses tree-sitter to parse code into actual functions, classes, and methods, then diffs at that level instead of lines.

What it does differently:

- Move a function to another file? sem shows "moved," not "deleted + added"
- Rename a variable? sem shows just the rename, not every line that mentions it
- Reformat code? sem skips it entirely

It also builds a dependency graph across files. So you can ask "if I change this function, what breaks?" and get a real answer. No LLM, no guessing, just graph traversal.
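The "what breaks?" question is a reachability query on the reversed dependency graph. Here is a minimal sketch, assuming the graph has already been built and stored as a map from each entity to the entities that directly reference it. The Dependents type and impacted_by function are hypothetical names, not sem's API.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// `dependents[x]` lists the entities that directly reference `x`.
type Dependents = BTreeMap<String, Vec<String>>;

/// Every entity that transitively depends on `changed` (breadth-first search).
fn impacted_by(dependents: &Dependents, changed: &str) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::new();
    queue.push_back(changed.to_string());

    while let Some(current) = queue.pop_front() {
        // Entities with no dependents simply contribute nothing.
        for dep in dependents.get(&current).into_iter().flatten() {
            // `insert` returns false for already-visited nodes, so cycles terminate.
            if seen.insert(dep.clone()) {
                queue.push_back(dep.clone());
            }
        }
    }
    // A cycle can lead back to the starting entity; it is not its own impact.
    seen.remove(changed);
    seen
}
```

The traversal is deterministic and visits each entity at most once, so the cost grows with the size of the affected subgraph rather than the whole codebase.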

30+ languages supported, written in Rust, runs in milliseconds.

GitHub: https://github.com/ataraxy-labs/sem

Would love feedback from anyone who's dealt with the same frustration reviewing diffs.

u/Wise_Reflection_8340 — 23 days ago