No More Ghost Paths: Taming LLM Hallucinations with Real Codebase Context
We tackled a common AI challenge: Large Language Models hallucinating file paths and architecture. Discover how we grounded our LLM workflow engine with real codebase context using custom template variables and achieved a dramatic leap in accuracy.
Large Language Models (LLMs) are incredible tools, capable of generating code, analyzing systems, and even suggesting architectural changes. But there's a recurring nightmare for anyone using them in a development context: hallucination. Specifically, when an LLM confidently invents file paths, directory structures, or even entire modules that simply don't exist in your codebase. It's like asking for directions and being given a beautifully drawn map to a phantom city.
This was a critical challenge for our workflow engine, which leverages LLMs to assist developers. We needed our AI to be a reliable co-pilot, not a creative fiction writer when it came to our project's anatomy. The goal was clear: ground our LLM prompts in real codebase context, eliminating those frustrating, hallucinated file paths and architectural misinterpretations.
The Problem: When AI Goes Off-Script
Imagine an LLM tasked with suggesting an improvement to a feature, or perhaps even generating a new code extension. Without explicit knowledge of the project's structure, it might suggest adding a file at /src/utils/new-feature-helper.js when your project exclusively uses TypeScript and organizes helpers differently, or reference a non-existent /config/settings.yaml file. This isn't just an annoyance; it leads to unusable outputs, wasted developer time, and a breakdown of trust in the AI's capabilities. Our previous sessions saw path accuracy hovering around a dismal 40-50%. Unacceptable.
The Solution: Grounding with {{claudemd}} and {{fileTree}}
Our approach was to inject explicit, real-time codebase knowledge directly into the LLM's prompt context. We introduced two powerful new template variables into our workflow engine:
- {{claudemd}}: This variable dynamically loads the content of a CLAUDE.md (or README.md as a fallback) file from the linked repository. It lets developers provide a high-level architectural overview, setup instructions, or specific "rules of the road" for the AI to follow. Think of it as the project's "manifesto" for the LLM.
- {{fileTree}}: This variable fetches and formats a markdown code block containing the actual file and directory structure of the linked repository, capped at a reasonable number of entries to prevent context-window overload. This is the AI's real-time map of the codebase.
By providing these, we aimed to give the LLM an undeniable source of truth for its structural understanding.
Under the Hood: How We Built It
Integrating these new context anchors required several key modifications to our nyxcore workflow engine:
1. Fetching the Repository Tree
The first step was to reliably get the file structure. We enhanced src/server/services/github-connector.ts with a new fetchRepoTree() function. This function intelligently uses the GitHub Git Trees API, which is efficient for retrieving directory structures. It also includes logic to filter out common noise directories (like .git, node_modules, dist), and crucially, falls back from the main branch to master if main isn't found, ensuring compatibility with older or differently configured repos.
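To make that concrete, here is a minimal sketch of what such a function could look like. The function name matches the post, and the Git Trees API endpoint with its main-to-master fallback is real GitHub behavior, but the types, noise list, and error handling are assumptions rather than the actual nyxcore code:

```typescript
// Hypothetical sketch of fetchRepoTree(); the noise-directory list and
// types are assumptions, not the actual nyxcore implementation.
const NOISE_DIRS = [".git", "node_modules", "dist"];

/** True when a path is, or lives under, one of the noise directories. */
export function isNoisePath(p: string): boolean {
  return p.split("/").some((segment) => NOISE_DIRS.includes(segment));
}

interface TreeEntry {
  path: string;
  type: "blob" | "tree";
}

export async function fetchRepoTree(
  owner: string,
  repo: string,
  token: string,
): Promise<TreeEntry[]> {
  // Try `main` first, then fall back to `master` for older repos.
  for (const branch of ["main", "master"]) {
    const res = await fetch(
      `https://api.github.com/repos/${owner}/${repo}/git/trees/${branch}?recursive=1`,
      {
        headers: {
          Authorization: `Bearer ${token}`,
          Accept: "application/vnd.github+json",
        },
      },
    );
    if (res.status === 404) continue; // branch not found: try the next one
    if (!res.ok) throw new Error(`GitHub API error: ${res.status}`);
    const data = (await res.json()) as { tree: TreeEntry[] };
    return data.tree.filter((entry) => !isNoisePath(entry.path));
  }
  throw new Error(`No main or master branch found for ${owner}/${repo}`);
}
```

One nicety of the Git Trees API with `recursive=1` is that a single request returns the whole tree, so filtering happens locally rather than across many paginated calls.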
2. Loading Project-Specific Documentation (CLAUDE.md)
Next, we added loadClaudeMdContent() to src/server/services/workflow-engine.ts. This function attempts to read CLAUDE.md first, providing a dedicated space for AI-specific instructions and context. If CLAUDE.md isn't present, it gracefully falls back to README.md, ensuring some level of project overview is always provided.
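The fallback logic is straightforward; here is a sketch that reads from a local checkout for simplicity, whereas the real version pulls file contents through the GitHub connector:

```typescript
import { readFile } from "node:fs/promises";
import { join } from "node:path";

// Sketch of loadClaudeMdContent(); the signature is an assumption.
export async function loadClaudeMdContent(repoRoot: string): Promise<string | null> {
  // CLAUDE.md takes priority; README.md is the graceful fallback.
  for (const candidate of ["CLAUDE.md", "README.md"]) {
    try {
      return await readFile(join(repoRoot, candidate), "utf8");
    } catch {
      // File missing or unreadable: try the next candidate.
    }
  }
  return null; // neither file exists; the variable resolves to empty
}
```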
3. Building the File Tree Content
The loadFileTreeContent() function, also in src/server/services/workflow-engine.ts, is responsible for taking the raw repository tree data and formatting it into a clean, readable markdown code block. We capped this at 500 entries to balance comprehensive context with token limits.
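The formatting step is simple enough to sketch as a pure function. The name `formatFileTree` is hypothetical; the 500-entry cap matches the post:

```typescript
// Sketch of the formatting inside loadFileTreeContent().
const MAX_ENTRIES = 500;
const FENCE = "`".repeat(3); // markdown code fence

export function formatFileTree(paths: string[]): string {
  const shown = paths.slice(0, MAX_ENTRIES).sort();
  if (paths.length > MAX_ENTRIES) {
    // Tell the model the tree is truncated rather than silently cutting it.
    shown.push(`... (${paths.length - MAX_ENTRIES} more entries omitted)`);
  }
  return [FENCE, ...shown, FENCE].join("\n");
}
```

Noting the truncation explicitly matters: a model that believes it has the full tree may wrongly conclude a real file does not exist.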
4. Seamless Integration into the Workflow Engine
To make these new variables available, we:
- Extended ChainContext: Added claudeMdContent and fileTreeContent fields to our core context object, making the data accessible throughout the workflow.
- Parallel Loading: During workflow startup, both loadClaudeMdContent() and loadFileTreeContent() now run concurrently using Promise.all. This ensures maximum efficiency and prevents one slow operation from blocking the other.
- Prompt Resolution: Our resolvePrompt() function was updated to recognize and substitute {{claudemd}} and {{fileTree}} alongside existing variables like {{docs}} and {{consolidations}}.
5. Updating Prompt Templates
Finally, and critically, we updated five core prompt templates in src/lib/constants.ts (extensionAnalyze, extensionPrompt, secRecon, secPrompts, deepPrompt). Beyond just adding the new variables, we explicitly instructed the LLM in the system prompts: "MUST reference real file paths from the provided file tree — never invent or guess paths." This direct instruction reinforces the importance of using the provided context.
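For illustration, a template in this style might look like the following. Only the quoted grounding instruction comes from our actual templates; the rest of the wording is a hypothetical stand-in, not the real extensionAnalyze:

```typescript
// Hypothetical shape of one of the five templates in src/lib/constants.ts.
export const extensionAnalyze = `You are analyzing a codebase to propose an extension.

Project overview and rules:
{{claudemd}}

Repository structure:
{{fileTree}}

MUST reference real file paths from the provided file tree — never invent or guess paths.`;
```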
The Results: A Leap in Accuracy!
The moment of truth came with an end-to-end test of our Extension Builder workflow. This 3-step process, which previously struggled with path accuracy, completed successfully in 315 seconds. The verification was astounding:
- Step 1: Achieved 97% real paths (28 out of 29 references were accurate)!
- Step 3: Reached 91% real paths (31 out of 34 references were accurate)!
This is a massive improvement from the 40-50% accuracy we observed in previous sessions without grounding. The LLM is now consistently referencing actual file paths, making its generated outputs far more practical and usable.
Lessons Learned (and Some Persistent Pains)
Development always comes with its share of hurdles. Here are a few insights from this session:
- npx tsx Execution Context: We hit a snag trying to run quick scripts with npx tsx -e '...' and top-level await. It turns out tsx expects a file for top-level await in certain configurations, so writing to a temporary file was necessary. Also, remember that scripts need to run from the project root to properly resolve @prisma/client and other internal modules.
- Pre-existing Tech Debt: We noted a pre-existing TypeScript error in discussions/[id]/page.tsx that wasn't ours but is worth keeping an eye on. It's a reminder that large projects often have background noise.
- Ongoing Platform Challenges: Some of the pain points from earlier sessions (around auth, SSE, and Prisma nuances) are still relevant. Building complex systems means consistently navigating intricate interactions.
What's Next?
Our work isn't done. We'll be looking to:
- Expand Grounding: Consider adding {{claudemd}} and {{fileTree}} to more templates if we identify further hallucination issues downstream.
- Document New Variables: Update our CLAUDE.md documentation to officially list these new supported template variables.
- Test Alternatives Flow: Ensure our workflow for generating multiple alternatives and selecting one works seamlessly with this new grounding.
This session marks a significant milestone in making our LLM-powered workflow engine truly reliable for developers. By providing the AI with a clear, undeniable map of the codebase, we've taken a huge step towards eliminating "ghost paths" and fostering a more productive, trustworthy AI-assisted development experience.