Context is King: Empowering LLMs with Project Wisdom for Smarter AutoFix and Refactoring
We've unlocked a new level of intelligence for our AI-powered AutoFix and Refactor pipelines by injecting rich, project-specific context directly into LLM prompts. This post details the journey, from schema changes to UI updates, and the critical lessons learned along the way.
Imagine an AI assistant that doesn't just offer generic coding advice, but truly understands the nuances of your project. An AI that knows your team's discussions, the specific design decisions, past issues, and even the "wisdom" embedded in your internal documentation. This was the vision driving our latest major feature: making our AutoFix and Refactor pipelines deeply project-aware.
Today, I'm thrilled to announce that this vision is now a reality. Our AI-powered detection and generation pipelines are no longer operating in a vacuum. They're now fed a consolidated stream of project knowledge, leading to vastly more relevant and insightful suggestions.
The Quest for Project Context
Our AutoFix and Refactor tools leverage Large Language Models (LLMs) to detect issues and suggest improvements. Though powerful, early iterations sometimes offered suggestions that, while technically correct, didn't align with a project's specific architecture, historical decisions, or ongoing discussions. The missing piece was context.
Our goal was clear: inject consolidated project knowledge – ranging from internal wisdom and documentation to team discussions, previous reports, and memory insights – directly into the LLM prompts. This would transform generic responses into truly intelligent, project-specific actions.
Building the Brain: How We Assembled Project Knowledge
The journey began by establishing a clear link between our automation runs and specific projects.
1. Database Foundations: Tying Runs to Projects
First, we extended our prisma/schema.prisma. We added a projectId foreign key to both AutoFixRun and RefactorRun models, establishing a clear one-to-many relationship with our Project model. This was the fundamental step to ensure every pipeline execution could be associated with its respective project.
model Project {
  id           String        @id @default(uuid())
  name         String
  autoFixRuns  AutoFixRun[]
  refactorRuns RefactorRun[]
  // ... other project fields
}

model AutoFixRun {
  id        String  @id @default(uuid())
  projectId String
  project   Project @relation(fields: [projectId], references: [id])
  // ... other autofix run fields
}

model RefactorRun {
  id        String  @id @default(uuid())
  projectId String
  project   Project @relation(fields: [projectId], references: [id])
  // ... other refactor run fields
}
2. The Context Orchestrator: pipeline-context.ts
The heart of this feature is src/server/services/pipeline-context.ts. This service is responsible for aggregating information from five distinct sources:
- Wisdom: Curated best practices or specific project guidelines.
- Memory: Insights derived from previous runs or user feedback.
- Discussions: Relevant snippets from team conversations (e.g., Slack, GitHub issues).
- Documentation: Key sections from project documentation.
- Previous Runs: Historical data from past AutoFix or Refactor executions on the same project.
We cap this consolidated knowledge at approximately 30,000 characters to ensure it fits within typical LLM context windows without excessive token usage.
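The aggregation logic can be sketched as follows. This is a minimal illustration, not the actual implementation of src/server/services/pipeline-context.ts: the `ContextSource` union mirrors the five categories above, but `assemblePipelineContext` and its section-based input shape are assumptions for the sake of the example.

```typescript
// Hypothetical sketch of the aggregation step in pipeline-context.ts.
// Each fetched source arrives as a labeled text section; we concatenate
// them under headers and enforce the ~30,000-character cap.
export type ContextSource =
  | "wisdom"
  | "memory"
  | "discussions"
  | "documentation"
  | "previousRuns";

const MAX_CONTEXT_CHARS = 30_000; // keep prompts within typical LLM context windows

export function assemblePipelineContext(
  sections: Array<{ source: ContextSource; text: string }>,
): string {
  let out = "";
  for (const { source, text } of sections) {
    const block = `## ${source}\n${text}\n\n`;
    if (out.length + block.length > MAX_CONTEXT_CHARS) {
      // Truncate the final section rather than dropping it entirely.
      out += block.slice(0, MAX_CONTEXT_CHARS - out.length);
      break;
    }
    out += block;
  }
  return out;
}
```

Truncating the last section (instead of omitting it) keeps partial context available when the budget runs out, at the cost of possibly cutting a sentence mid-way.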
3. Injecting Context Downstream
With the context assembly in place, the next step was to ensure it flowed through our entire system:
- API Extension: We updated the start mutations in our tRPC routers (src/server/trpc/routers/auto-fix.ts and refactor.ts) to accept projectId, memoryIds, and contextSources as inputs. This allows the frontend to specify exactly what context to use.
- Pipeline Orchestration: Both our auto-fix/pipeline.ts and refactor/pipeline.ts orchestrators now call loadPipelineContext() at the start of a run, fetching the relevant project knowledge.
- LLM Prompt Injection: Crucially, all four of our core LLM components – issue-detector.ts, fix-generator.ts, opportunity-detector.ts, and improvement-generator.ts – were updated to accept and inject this projectContext directly into their prompts. This is where the magic happens, allowing the LLMs to "think" with project-specific knowledge.
- Streaming Context: Our SSE routes (api/v1/events/auto-fix/[id]/route.ts and refactor/[id]/route.ts) were also updated to pass configuration fields, ensuring the client can track which context was used for a given run.
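The prompt-injection step can be sketched like this. It is an illustrative example only: `buildIssueDetectionPrompt` and the prompt wording are assumptions, not the actual code in issue-detector.ts; the point is that the context block is optional and is prepended ahead of the task instructions.

```typescript
// Illustrative sketch: how a detector might splice projectContext into its
// prompt. The function name and prompt text are hypothetical.
export function buildIssueDetectionPrompt(
  code: string,
  projectContext?: string,
): string {
  // When no context was loaded (e.g. a run without a project), the prompt
  // degrades gracefully to the generic form.
  const contextBlock = projectContext
    ? `Project knowledge (use this to tailor your findings):\n${projectContext}\n\n`
    : "";
  return `${contextBlock}Analyze the following code and list concrete issues:\n\n${code}`;
}
```

Keeping the context parameter optional is what preserves backward compatibility for runs started without a projectId.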
4. A Smarter Frontend Experience
To make this power accessible, we revamped our frontend:
- Project Selector: Both the AutoFix and Refactor list pages now feature a project selector dropdown, allowing users to filter runs by project.
- Context Source Toggles: Users can now explicitly choose which context sources (wisdom, memory, docs, etc.) to include for a given run using toggle chips.
- Collapsible Memory Picker: A dedicated component allows users to select specific memory insights to include, providing fine-grained control.
- Context Badges: On detail pages, clear badges display the project name and icons for active context sources, along with the memory count, making it transparent what information informed the AI's suggestions.
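The toggle-chip behavior boils down to a small state helper. This is a hypothetical sketch (the function name and array-based representation are assumptions), showing the add-or-remove semantics behind each chip click:

```typescript
// Hypothetical state helper behind the context-source toggle chips.
// Clicking a chip adds its source to the run configuration, or removes
// it if it was already selected.
export type ContextSource =
  | "wisdom"
  | "memory"
  | "discussions"
  | "documentation"
  | "previousRuns";

export function toggleSource(
  selected: ContextSource[],
  source: ContextSource,
): ContextSource[] {
  return selected.includes(source)
    ? selected.filter((s) => s !== source) // deselect
    : [...selected, source]; // select
}
```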
After all these changes, a final db:push and db:generate were run, and our typecheck passed cleanly – a testament to TypeScript's guardrails!
Lessons Learned & Critical Fixes
No complex feature development is without its challenges. Here are a couple of key "gotchas" we navigated:
1. Prisma and Custom SQL Types: The vector Column Dilemma
Challenge: After adding projectId to our Prisma schema, a routine npm run db:push command warned us about dropping the embedding vector(1536) column on our workflow_insights table. This column is critical for our semantic search capabilities, but vector(1536) is a custom type managed via raw SQL, not natively by Prisma.
Workaround: The only way to proceed with db:push was to use the --accept-data-loss flag. However, this did drop the column. The critical step was to immediately restore it with a raw SQL command:
ALTER TABLE workflow_insights ADD COLUMN IF NOT EXISTS embedding vector(1536);
Then we recreated its HNSW index by piping the index-creation SQL through npx prisma db execute --stdin.
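The recreation statement can look like the following. Note the assumptions: the index name is made up for illustration, and vector_cosine_ops assumes the original index was built for cosine distance – match whatever operator class your original index definition used.

```sql
-- Recreate the HNSW index on the restored embedding column.
-- The index name and vector_cosine_ops operator class are assumptions;
-- use the same definition the original index was created with.
CREATE INDEX IF NOT EXISTS workflow_insights_embedding_idx
  ON workflow_insights
  USING hnsw (embedding vector_cosine_ops);
```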
Lesson: When mixing Prisma with custom SQL types (like vector for embeddings), always be prepared to manually restore or re-create these specific columns and their associated indexes after a db:push that involves schema changes. This is a recurring issue we've documented for future reference.
2. Frontend Data Structure: trpc.projects.list Returns an Object
Challenge: In our frontend pages, we initially tried to map over the results of trpc.projects.list like projects.data?.map(). TypeScript quickly flagged this, indicating that trpc.projects.list returns an object with a structured payload, not a direct array.
Workaround: The fix was straightforward: access the items property of the returned data.
// Before (failed)
// projects.data?.map(...)
// After (fixed)
projects.data?.items.map(...)
Lesson: Always double-check the exact return type of your tRPC queries. While map is common, many list queries return paginated or structured objects, requiring access to a specific property (like items) to get the actual array.
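The mismatch boils down to a wrapper shape like the one below. This is illustrative: the nextCursor field is an assumption typical of paginated tRPC queries, and the exact payload of trpc.projects.list may differ – the point is that the array lives under items.

```typescript
// Illustrative shape of a paginated list result: the array is nested
// under `items`, so callers must map over `result.items`, not `result`.
interface ProjectListResult {
  items: Array<{ id: string; name: string }>;
  nextCursor?: string; // assumption: cursor for fetching the next page
}

export function projectNames(result: ProjectListResult): string[] {
  return result.items.map((p) => p.name);
}
```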
What's Next?
With the core feature now implemented and pushed to main, our immediate focus is on rigorous QA:
- Verifying backward compatibility for AutoFix scans without project context.
- Thoroughly testing AutoFix and Refactor scans with project context, ensuring repo filtering, context toggles, memory picker, and context badges all function as expected.
Beyond that, we'll be tackling RLS policies for our project_notes table and cleaning up some .gitignore entries for mini-rag log files.
This feature marks a significant leap forward in making our AI tools truly intelligent and deeply integrated into the developer workflow. We're excited to see the impact of these project-aware pipelines on developer productivity and code quality!