A deterministic alternative to embedding-based repo understanding
inspiringsource
Hey everyone, I'm Avi, a CS student at FHNW in Switzerland.
I’ve been a bit frustrated with how AI coding tools handle larger codebases. Most of them rely on embeddings + prompting, which is cool for fuzzy stuff, but sometimes feels inconsistent, hard to reason about, and probably token-heavy.
So I wanted to try something more “boring” and predictable.
I built a small prototype called ai-context-map. It uses static analysis to build a structural graph of a repo:
- files
- imports / dependencies
- some basic symbols (mostly Python for now)
The idea is to precompute a map of the repo so an AI (or even a human) doesn’t have to rediscover structure every time.
No ML, no embeddings, no API calls. Just parsing + graph stuff.
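To make "parsing + graph stuff" concrete, here is a minimal sketch of what the Python side could look like, assuming only the standard-library ast module. The names analyze_source and build_graph are made up for illustration and are not taken from the ai-context-map repo.

```python
import ast


def analyze_source(source):
    """Return the imports and top-level symbols found in one Python source string."""
    tree = ast.parse(source)
    imports = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os, sys" yields one alias per imported module
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Keep leading dots so relative imports stay recognizable
            imports.add("." * node.level + (node.module or ""))
    symbols = [
        {"symbol": node.name, "line": node.lineno}
        for node in tree.body  # top-level definitions only
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    return {"imports": sorted(imports), "symbols": symbols}


def build_graph(files):
    """Map each path to its analysis; `files` is a dict of path -> source text."""
    # Sorting the paths keeps the output identical from run to run
    return {path: analyze_source(src) for path, src in sorted(files.items())}
```

Because the result depends only on the source text, running it twice on the same repo gives the same map, which is the property embeddings plus prompting don't guarantee.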
It outputs something like a .ai/context.yaml file. Very simplified example:
entry_points:
  - path: src/main.py
core_modules:
  - src/services/auth.py
task_routes:
  api_change:
    - src/api/routes.py
    - src/services/auth.py
anchors:
  - symbol: login_user
    file: src/services/auth.py
    line: 42
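A rough sketch of how a tool could consume a map like this. To stay dependency-free it uses a hand-written dict that mirrors the example instead of loading the YAML file; files_for_task and locate_symbol are hypothetical helpers, not part of the prototype.

```python
# Hypothetical in-memory copy of the example map. A real consumer would
# load .ai/context.yaml with a YAML parser instead of hard-coding it.
CONTEXT = {
    "entry_points": [{"path": "src/main.py"}],
    "core_modules": ["src/services/auth.py"],
    "task_routes": {
        "api_change": ["src/api/routes.py", "src/services/auth.py"],
    },
    "anchors": [
        {"symbol": "login_user", "file": "src/services/auth.py", "line": 42},
    ],
}


def files_for_task(context, task):
    """Return the files routed to a task, or an empty list for unknown tasks."""
    return list(context.get("task_routes", {}).get(task, []))


def locate_symbol(context, symbol):
    """Return (file, line) for a known anchor symbol, or None if it is missing."""
    for anchor in context.get("anchors", []):
        if anchor["symbol"] == symbol:
            return anchor["file"], anchor["line"]
    return None
```

Both lookups are plain dictionary reads, so the same question always gets the same answer.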
What I'm trying to figure out is basically if this direction even makes sense.
- Where does a purely static / graph-based approach fall apart compared to embeddings?
- Are there tools doing something similar already that I should look into?
- If you work with larger repos: would something deterministic like this actually help, or is vector search + big context already “good enough”?
One thing I'm curious about:
Could something like this reduce how many files an AI needs to look at, and therefore reduce token usage?
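One way this might work, as a sketch only: start from the file being changed and follow the import graph for a limited number of hops, so only that neighborhood is handed to the model. The function files_within and the graph shape (path -> list of imported paths) are assumptions for illustration. Whether this saves tokens in practice would need measuring on real repos.

```python
from collections import deque


def files_within(graph, start, max_hops):
    """Breadth-first walk: files reachable from `start` in at most `max_hops` imports."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand past the hop limit
        for dep in graph.get(current, []):
            if dep not in hops:
                hops[dep] = hops[current] + 1
                queue.append(dep)
    return sorted(hops)


# Toy graph, invented for this example (path -> files it imports)
TOY_GRAPH = {
    "src/main.py": ["src/api/routes.py"],
    "src/api/routes.py": ["src/services/auth.py"],
    "src/services/auth.py": ["src/db/users.py"],
    "src/db/users.py": [],
    "src/utils/format.py": [],
}

selected = files_within(TOY_GRAPH, "src/api/routes.py", max_hops=1)
```

In this toy case the one-hop neighborhood is 2 of the 5 files. The hop limit is the knob: raise it and the selection grows toward the whole repo.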
Repo:
https://github.com/inspiringsource/ai-context-map
Would really appreciate feedback (“this is useless” is fine too).