Skip to content

code-review-graph: Using a Knowledge Graph to Cut AI Code Review Token Usage by 8x

Apr 2, 2026 1 min
TL;DR code-review-graph uses Tree-sitter to parse your codebase and build a persistent knowledge graph, tracks the blast radius of changes, and feeds only truly relevant context to the AI — claiming an average 8.2x reduction in token usage.

🌏 中文版

What’s the biggest waste in AI code review? Feeding the entire repo to an LLM every single time. You change one function, and the AI has to read hundreds of files just to give you feedback. That’s exactly what code-review-graph sets out to fix — by building a structural map of your codebase so that, when a change happens, only the genuinely affected parts get extracted.

Core Concept: Code as a Graph

The traditional AI code review flow is: “what files changed → dump them all into the LLM.” code-review-graph adds a layer in between: it first uses Tree-sitter to parse the entire codebase into an AST, extracting functions, classes, and imports as nodes, and call relationships, inheritance, and test coverage as edges. This forms a NetworkX graph that gets persisted to SQLite.

The graph is persistent. The first build requires a full scan, but after that, Watchdog monitors for file changes and only re-parses modified files — incremental updates are claimed to complete in under 2 seconds.

The key difference from plain text search (grep) or simple AST analysis is that the graph can trace cross-file dependency chains. A calls B, B inherits C, C’s tests live in D — change C and you immediately know A, B, and D are all potentially affected.

Architecture

Code change


Tree-sitter AST parsing (19 languages supported)


Node extraction (functions, classes, imports)
Edge extraction (calls, inheritance, test coverage)


NetworkX graph ──→ SQLite persistence


Watchdog monitoring ──→ Incremental updates (< 2s)


Change impact analysis (Blast Radius)


MCP tools ──→ Claude Code / Cursor / Windsurf / Zed

Key modules:

ModulePurpose
parser.pyTree-sitter multi-language parsing; extracts AST nodes and edges
graph.pyNetworkX graph data structure
incremental.pyIncremental update engine
changes.pyBlast radius calculation and risk scoring
communities.pyGraph community clustering (identifies related code groups)
flows.pyExecution flow tracing, ranked by risk level
search.pySemantic search (optional embeddings)
visualization.pyD3.js interactive visualization

Tech Stack

  • Python 3.10+ (82.5%) + TypeScript (17.3%, VSCode extension)
  • Tree-sitter + tree-sitter-language-pack: the de facto standard for multi-language AST parsing
  • NetworkX: Python graph computation, with optional igraph for community detection
  • SQLite: persistent graph storage with zero deployment overhead
  • FastMCP: MCP protocol implementation, enabling AI tools to call graph functions directly
  • Watchdog: filesystem monitoring
  • Optional: sentence-transformers / Google Generative AI / Ollama for embedding-based search

Choosing SQLite + NetworkX over something like Neo4j is clearly a deliberate decision to lower the deployment bar. For most codebases, this trade-off is sensible — the scale of a code graph rarely justifies a dedicated graph database.

Performance Numbers

Official benchmarks across 6 real-world repos:

  • Average 8.2x token reduction: the amount of context the AI receives drops dramatically
  • 100% recall: every genuinely affected file gets detected
  • F1 = 0.54: precision is modest; the system tends to over-predict

The combination of 100% recall and low F1 means: it won’t miss files that matter, but it will pull in some irrelevant ones. For code review, that trade-off is acceptable — missing an important file is far more costly than reading a few extra ones.

MCP Integration

This is the killer feature. Through the MCP protocol, Claude Code, Cursor, Windsurf, Zed, Continue, and OpenCode can all call graph tools directly. The AI no longer has to decide which files to read on its own — the graph tells it outright: “this change affects these functions and files.”

The project also ships a VSCode extension that lets you view the graph visualization and impact analysis directly inside the IDE.

When It Makes Sense (and When It Doesn’t)

Good fit:

  • Medium-to-large codebases with complex cross-file dependencies
  • Teams doing frequent AI-assisted code reviews
  • Scenarios where controlling LLM token costs matters

Poor fit:

  • Small projects or single-file changes — the graph maintenance overhead can exceed the cost of just reading the files directly
  • Situations requiring high-precision impact analysis (F1 of only 0.54)
  • Teams that don’t use AI code review tools — without an AI consumer, the graph has nowhere to add value

Overall

code-review-graph addresses a genuine pain point: wasted context in AI code review. By combining Tree-sitter with a knowledge graph, it shrinks “what the AI sees” from “the entire repo” down to “the parts that actually matter.” The 8.2x token savings is meaningful at scale.

That said, the precision (F1 0.54) shows there’s still room to improve. Over-prediction means the AI still receives some irrelevant context — it’s just far better than brute-forcing the whole thing. The overhead on small changes is also worth watching — this extra layer isn’t the right fit for every situation.

The core trade-off is clear: you pay an upfront cost to maintain a persistent graph, and you collect savings on every subsequent review. The larger the repo and the more frequent the reviews, the better that deal becomes.

References