Grounding Our LLMs: Taming Codebase Hallucinations with Real Context
We've all seen it: an LLM confidently generating code with non-existent file paths. This session, we tackled that head-on by injecting real codebase context directly into our prompts, making our AI assistants far more reliable.
If you've spent any time building tools powered by Large Language Models (LLMs) for code generation or analysis, you've likely encountered the "hallucination problem." It's that moment when your AI assistant, with all the confidence in the world, suggests a file path that simply doesn't exist, invents a directory structure, or describes an architecture that bears no resemblance to your actual codebase. It's frustrating, time-wasting, and a major barrier to reliable AI-assisted development.
This past session, we set out to tackle this head-on for our workflow engine. Our goal was clear: inject real, verifiable codebase context directly into our LLM prompts, eliminating the guesswork and grounding our AI in reality.
The Problem Space: LLMs and the Codebase Conundrum
Imagine asking an LLM to "implement feature X in src/utils/new_helper.ts." Without direct knowledge of your repo's structure, it might invent src/api/helpers/new_helper.ts or even lib/features/x/utils.ts. That creativity, however impressive, quickly becomes a liability when you need actionable, runnable code.
Our workflow engine orchestrates complex code generation and analysis tasks, linking directly to user repositories. For the LLM to be truly useful, it needs to understand:
- Project-level context: What is this repo about? What are its core principles, technologies, and quirks? (Often found in `README.md` or a dedicated project overview.)
- File system structure: What directories exist? Where are common files located? What's the general layout?
Without this, every prompt becomes a shot in the dark, requiring heavy post-processing and manual correction.
Our Solution: Grounding AI with Real Context
We introduced two new template variables into our workflow engine: {{claudemd}} and {{fileTree}}. These aren't just arbitrary strings; they're dynamic placeholders that our system resolves into concrete, repo-specific context before sending a prompt to the LLM.
{{claudemd}}: The Project's Voice
First, {{claudemd}} is designed to provide high-level project context. When a workflow starts, our system now attempts to load the content of a CLAUDE.md file from the linked repository. If CLAUDE.md isn't found, it gracefully falls back to README.md. This allows developers to explicitly guide the LLM with project-specific instructions, architectural decisions, or even preferred coding styles, without cluttering the main README.
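The loading logic itself is simple. Here's a minimal sketch of that fallback, assuming the same octokit-style client used elsewhere in our connector; the URL parsing and the use of the contents API are illustrative rather than lifted verbatim from the implementation:

```typescript
// Minimal sketch of the CLAUDE.md -> README.md fallback (illustrative, not the exact implementation).
async function loadClaudeMdContent(repoUrl: string): Promise<string> {
  // e.g. https://github.com/acme/widgets -> owner = "acme", repo = "widgets"
  const [owner, repo] = new URL(repoUrl).pathname.slice(1).split('/');

  // Try CLAUDE.md first, then gracefully fall back to README.md.
  for (const path of ['CLAUDE.md', 'README.md']) {
    try {
      const { data } = await githubApi.repos.getContent({ owner, repo, path });
      if (!Array.isArray(data) && 'content' in data) {
        // The contents API returns file bodies base64-encoded.
        return Buffer.from(data.content, 'base64').toString('utf-8');
      }
    } catch (error: any) {
      if (error.status !== 404) throw error; // swallow only "file not found"
    }
  }
  return ''; // neither file exists; the template resolves to an empty string
}
```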
{{fileTree}}: The Blueprint of the Codebase
Second, and perhaps most impactful for hallucination reduction, is {{fileTree}}. This variable gets populated with a markdown-formatted representation of the repository's directory structure. We cap this at 500 entries to manage token budget, ensuring that even large repos can provide meaningful context without overwhelming the LLM.
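The session notes don't spell out the exact markdown rendering, but a minimal formatter over the capped path list might look like the following; the function name and the indented-list layout are assumptions:

```typescript
// Hypothetical helper: render the flat path list from fetchRepoTree() as an
// indented markdown list so the LLM can see directory nesting at a glance.
function formatFileTree(paths: string[]): string {
  return paths
    .map(path => {
      const segments = path.split('/');
      const depth = segments.length - 1;          // nesting level from the path
      const name = segments[segments.length - 1]; // final segment only
      return `${'  '.repeat(depth)}- ${name}`;
    })
    .join('\n');
}
```

An indented list keeps the nesting visible while staying far cheaper in tokens than a full ASCII tree.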
Implementation Deep Dive
Let's break down how we brought these context anchors to life.
- Fetching the Repo Tree: At the heart of `{{fileTree}}` is the ability to get a complete directory listing. We leveraged GitHub's powerful Git Trees API. This API allows us to fetch the entire recursive tree of a branch in a single call, which is incredibly efficient.

```typescript
// src/server/services/github-connector.ts
async function fetchRepoTree(owner: string, repo: string, branch: string): Promise<string[]> {
  try {
    const treeResponse = await githubApi.git.getTree({
      owner,
      repo,
      tree_sha: branch, // Can be branch name or commit SHA
      recursive: '1',
    });

    // Filter out common noise: node_modules, .git, .next, dist, lock files
    const filteredPaths = treeResponse.data.tree
      .filter(item => item.type === 'blob' || item.type === 'tree') // Only files and directories
      .map(item => item.path)
      .filter(path =>
        !path.includes('node_modules/') &&
        !path.includes('.git/') &&
        !path.includes('.next/') &&
        !path.includes('dist/') &&
        !path.endsWith('.lock'))
      .slice(0, 500); // Cap at 500 entries for token budget

    return filteredPaths;
  } catch (error: any) {
    // Fallback from 'main' to 'master' if 'main' doesn't exist
    if (error.status === 404 && branch === 'main') {
      console.warn(`Branch 'main' not found for ${owner}/${repo}, trying 'master'.`);
      return await fetchRepoTree(owner, repo, 'master');
    }
    throw error;
  }
}
```

This `fetchRepoTree` function is robust, including a fallback from `main` to `master` in case the default branch doesn't exist, a common scenario in older or migrated repositories.
- Loading Context in Parallel: To ensure our workflow engine remains responsive, both `loadClaudeMdContent()` and `loadFileTreeContent()` now run in parallel using `Promise.all` during workflow startup. This happens alongside other existing context-loading operations, ensuring minimal latency impact.

```typescript
// src/server/services/workflow-engine.ts (conceptual snippet)
async function startWorkflow(context: ChainContext) {
  // ... existing context loading ...

  const [claudeMd, fileTree] = await Promise.all([
    loadClaudeMdContent(context.repoUrl),
    loadFileTreeContent(context.repoUrl),
  ]);

  context.claudeMdContent = claudeMd;
  context.fileTreeContent = fileTree;

  // ... rest of workflow initialization ...
}
```
- Extending `ChainContext` and `resolvePrompt()`: Our internal `ChainContext` was extended with `claudeMdContent` and `fileTreeContent` fields. The core `resolvePrompt()` function, responsible for injecting variables into prompt templates, was updated to recognize and substitute `{{claudemd}}` and `{{fileTree}}` with their fetched content; a minimal sketch of that substitution follows.
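Conceptually, resolving these variables is just a string substitution over the template; the new `ChainContext` fields below match what we added, while the surrounding shape of the function is an assumption:

```typescript
// Conceptual sketch of the new fields and the substitution step (not the full types).
interface ChainContext {
  repoUrl: string;
  claudeMdContent?: string;
  fileTreeContent?: string;
  // ... existing fields ...
}

function resolvePrompt(template: string, context: ChainContext): string {
  // ... existing variable substitutions happen here as well ...
  return template
    .replaceAll('{{claudemd}}', context.claudeMdContent ?? '')
    .replaceAll('{{fileTree}}', context.fileTreeContent ?? '');
}
```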
- Crucial: Updating System Prompts: Having the data is one thing; making the LLM use it is another. We updated five critical prompt templates in `src/lib/constants.ts` to include these new variables: `extensionAnalyze`, `extensionPrompt`, `secRecon`, `secPrompts`, and `deepPrompt`.
More importantly, we updated the system prompts for these templates to explicitly instruct the LLM: "MUST reference real file paths from the provided file tree — never invent or guess paths." This explicit directive is key to guiding the model's behavior and enforcing the use of the provided context.
```typescript
// Example of a modified prompt template (conceptual)
const extensionPrompt = (context: ChainContext) => `
You are an expert software engineer tasked with implementing a new feature.

Project Context:
{{claudemd}}

Repository File Tree:
{{fileTree}}

---

Task: Implement the following feature based on the project context and file tree.
MUST reference real file paths from the provided file tree — never invent or guess paths.

... rest of the prompt ...
`;
```
Lessons Learned (and a smooth session!)
This session was remarkably smooth, a testament to the clear problem definition and the power of well-documented APIs like GitHub's Git Trees. We encountered no major issues, which is always a welcome change!
While this specific feature addition was straightforward, it builds on the foundations laid in earlier, more challenging sessions. We've certainly had our share of "pain" dealing with authentication complexities, real-time SSE connections, and tricky Prisma migrations (see session notes 0001-0004 for those sagas!). This session, however, highlighted the value of:
- Proactive Context Provision: Instead of reacting to LLM hallucinations, we're proactively providing the necessary grounding data. This shifts the paradigm from "fix the AI's mistakes" to "enable the AI to be right from the start."
- Robust API Choices: The GitHub Git Trees API proved to be an excellent, efficient choice for fetching comprehensive repository structure.
- Explicit Instructions: Simply providing context isn't enough; explicitly instructing the LLM on how to use that context (e.g., "MUST reference real file paths") is crucial for effective prompt engineering.
Immediate Next Steps
While the implementation is complete and typechecks are clean, the real test begins now:
- Commit and Integrate: Get these changes into our main branch.
- End-to-End Testing: Run our Extension Builder workflow with a linked repository and meticulously verify that generated prompts contain accurate file paths and `CLAUDE.md` content.
- Token Budget Verification: Crucially, we need to monitor token usage. The 500-file cap for `{{fileTree}}` is a good start, but we'll confirm it doesn't blow the LLM's context window for typical repos (a quick estimation sketch follows this list).
- Expand Template Usage: If testing confirms the effectiveness, we'll consider injecting `{{claudemd}}` and `{{fileTree}}` into more downstream templates (e.g., `extensionFeatures`, `extensionExtend`, `secRemediation`) where hallucination might still be an issue.
- Documentation: Update our internal `CLAUDE.md` documentation to list these new supported template variables, empowering developers to leverage them fully.
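For that token-budget check, a rough characters-per-token heuristic is usually enough to flag an oversized tree before a workflow runs. A minimal sketch, assuming roughly four characters per token (a common rule of thumb, not a tokenizer-accurate count) and an arbitrary threshold:

```typescript
// Rough token estimate for the rendered {{fileTree}} content.
// ~4 characters per token is a heuristic, not an exact count.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical guard we can run during end-to-end testing.
function checkFileTreeBudget(fileTreeContent: string, maxTokens = 4_000): void {
  const tokens = estimateTokens(fileTreeContent);
  if (tokens > maxTokens) {
    console.warn(`{{fileTree}} is ~${tokens} tokens; consider tightening the 500-entry cap.`);
  }
}
```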
This feature marks a significant step towards making our LLM-powered developer tools more reliable, efficient, and genuinely helpful. By grounding our AI in the undeniable reality of the codebase, we're taking a big leap towards taming those pesky hallucinations for good.
{"thingsDone":["Added `fetchRepoTree()` to `github-connector.ts` using GitHub Git Trees API, with filtering and branch fallback.","Implemented `loadClaudeMdContent()` and `loadFileTreeContent()` in `workflow-engine.ts`.","Extended `ChainContext` with `claudeMdContent` and `fileTreeContent` fields.","Added `{{claudemd}}` and `{{fileTree}}` resolution in `resolvePrompt()` function.","Ensured both loaders run in parallel via `Promise.all` during workflow startup.","Updated 5 core prompt templates (`extensionAnalyze`, `extensionPrompt`, `secRecon`, `secPrompts`, `deepPrompt`) in `constants.ts` to inject these variables.","Modified all 5 system prompts to explicitly instruct LLMs: 'MUST reference real file paths from the provided file tree — never invent or guess paths.'"],
"pains":["No major issues encountered in this specific development session. Previous sessions (0001-0004) involved significant challenges with authentication, Server-Sent Events (SSE), and Prisma ORM configurations."],
"successes":["Successfully implemented a robust and efficient solution for LLM context grounding.","Achieved clean typechecks post-implementation.","Designed for efficiency with parallel loading of context data.","Integrated a robust branch fallback mechanism for fetching repository trees.","Directly addressed a critical LLM hallucination problem with a practical, code-driven solution."],
"techStack":["TypeScript", "Node.js", "GitHub API (Git Trees)", "LLM Workflow Engine", "Prompt Engineering", "Next.js (implied by file paths)", "PostgreSQL (implied by dev setup)"]}