Building an AI-Powered AutoFix Pipeline: From Code Analysis to Automated Pull Requests
A deep dive into building a complete automated security and bug fixing pipeline with AI-powered issue detection, unified diff patching, and GitHub integration.
Ever wondered what it would be like if your codebase could fix itself? After a late-night development session that wrapped up around 22:30, I'm excited to share the journey of building an AutoFix Pipeline — a system that automatically discovers security vulnerabilities and bugs, generates AI-powered fixes, and can even create pull requests for you.
The Vision: Automated Code Healing
The goal was ambitious: create a pipeline that could scan repositories, identify issues using AI, generate precise fixes, and integrate seamlessly with existing development workflows. Think of it as having a tireless code reviewer that not only spots problems but actually fixes them.
Here's what the complete system does:
- 🔍 Intelligent Issue Detection: Uses LLM-based analysis to find OWASP security issues, bugs, performance problems, and code smells
- 🛠️ AI-Powered Fix Generation: Creates precise unified diffs for each discovered issue
- 🔄 Automated Patching: Applies fixes directly to the codebase
- 🚀 GitHub Integration: Optionally creates pull requests with the fixes
- 📡 Real-time Updates: Streams progress via Server-Sent Events
- 🔗 External Webhooks: Integrates with tools like Claude Code, GitHub Copilot, and Cursor
Architecture Deep Dive
The Data Layer
The foundation starts with two key database models:
// AutoFixRun - tracks each pipeline execution
model AutoFixRun {
  id           String         @id @default(cuid())
  status       String         // running, completed, failed, cancelled
  repositoryId String
  repository   Repository     @relation(fields: [repositoryId], references: [id])
  issues       AutoFixIssue[]
  createdAt    DateTime       @default(now())
}
// AutoFixIssue - individual problems found and fixed
model AutoFixIssue {
  id           String  @id @default(cuid())
  category     String  // security, bug, performance, code-smell
  severity     String  // critical, high, medium, low
  status       String  // pending, fixed, skipped
  originalCode String
  fixedCode    String?
  // ... relations and metadata
}
The Pipeline Engine
The heart of the system is an AsyncGenerator-based pipeline that orchestrates the entire process:
async function* autoFixPipeline(
  repositoryId: string,
  options: AutoFixOptions
): AsyncGenerator<AutoFixEvent> {
  // Phase 1: Repository scanning
  yield { type: 'scan_started' }
  const files = await scanRepository(repositoryId)

  // Phase 2: Issue detection using AI
  yield { type: 'detection_started' }
  const issues = await detectIssues(files)

  // Phase 3: Fix generation and patching
  for (const issue of issues) {
    const fix = await generateFix(issue)
    await applyPatch(fix.unifiedDiff)
    yield { type: 'fix_generated', issue }
  }

  // Phase 4: Optional PR creation
  if (options.createPR) {
    const prUrl = await createPullRequest(repositoryId)
    yield { type: 'pr_created', url: prUrl }
  }
}
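The events yielded by the pipeline can be modeled as a discriminated union, which is what makes the downstream SSE consumers type-safe. Here's a sketch; the field names beyond `type` are illustrative, not the exact production shape:

```typescript
// Hypothetical event union for the pipeline; payload fields beyond
// `type` are assumptions for illustration.
type AutoFixEvent =
  | { type: 'scan_started' }
  | { type: 'detection_started' }
  | { type: 'fix_generated'; issue: { id: string; category: string } }
  | { type: 'pr_created'; url: string }

// Narrowing on `type` lets consumers handle each phase exhaustively.
function describe(event: AutoFixEvent): string {
  switch (event.type) {
    case 'scan_started':
      return 'Scanning repository...'
    case 'detection_started':
      return 'Detecting issues...'
    case 'fix_generated':
      return `Fixed ${event.issue.category} issue ${event.issue.id}`
    case 'pr_created':
      return `Pull request: ${event.url}`
  }
}
```

Because the switch covers every variant, adding a new event type later becomes a compile error at each consumer rather than a silent gap.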
AI-Powered Issue Detection
One of the most interesting challenges was creating an AI system that could reliably identify different types of issues. The solution uses a prescriptive approach with specific categories:
const ISSUE_CATEGORIES = {
  security: "OWASP Top 10 vulnerabilities, injection flaws, XSS, etc.",
  bugs: "Logic errors, null pointer exceptions, race conditions",
  performance: "Inefficient algorithms, memory leaks, blocking operations",
  "error-handling": "Missing try-catch, unhandled promises, silent failures",
  "code-smells": "Duplicated code, long methods, inappropriate naming"
}
The AI analyzes code in batches and returns structured issue reports with precise file locations and descriptions.
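LLM output is untrusted input, so each reported issue should be validated before it enters the pipeline. A minimal sketch of that check, using a hand-rolled guard rather than a schema library (the field names here are assumptions about the report shape):

```typescript
// Assumed shape of one issue in the model's structured report.
interface DetectedIssue {
  file: string
  line: number
  category: string
  severity: 'critical' | 'high' | 'medium' | 'low'
  description: string
}

const SEVERITIES = new Set(['critical', 'high', 'medium', 'low'])

// Reject malformed model output; returns null rather than throwing
// so one bad entry doesn't abort the whole batch.
function parseIssue(raw: unknown): DetectedIssue | null {
  if (typeof raw !== 'object' || raw === null) return null
  const r = raw as Record<string, unknown>
  if (typeof r.file !== 'string' || typeof r.line !== 'number') return null
  if (typeof r.category !== 'string' || typeof r.description !== 'string') return null
  if (typeof r.severity !== 'string' || !SEVERITIES.has(r.severity)) return null
  return r as unknown as DetectedIssue
}
```

In practice a schema library like Zod would do the same job with less ceremony; the point is that nothing from the model is trusted as-is.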
Unified Diff Magic
Generating fixes is only half the battle — applying them reliably is the real challenge. The system uses unified diff format for precise patching:
function applyUnifiedDiff(originalContent: string, diff: string): string {
  const lines = originalContent.split('\n')
  const diffLines = diff.split('\n')
  const modifiedLines = [...lines]
  // Parse @@ hunk headers, then walk each hunk: context lines anchor
  // the position, '-' lines are removed, '+' lines are inserted
  return modifiedLines.join('\n')
}
This approach ensures that fixes can be applied even if the codebase changes slightly between detection and application.
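That tolerance to drift comes from the context lines in each hunk: if the file has shifted since detection, the patcher searches outward from the expected line for a window that still matches the context. A minimal sketch of that search (the function name and fuzz limit are mine, not from the codebase):

```typescript
// Find where a hunk's context lines actually sit in the file,
// searching outward from the expected start line.
function findHunkPosition(
  fileLines: string[],
  contextLines: string[],
  expectedStart: number,
  maxFuzz = 20
): number {
  const matchesAt = (start: number): boolean =>
    start >= 0 &&
    start + contextLines.length <= fileLines.length &&
    contextLines.every((line, i) => fileLines[start + i] === line)

  // Try the expected position first, then widen the search window.
  for (let offset = 0; offset <= maxFuzz; offset++) {
    if (matchesAt(expectedStart - offset)) return expectedStart - offset
    if (matchesAt(expectedStart + offset)) return expectedStart + offset
  }
  return -1 // hunk can't be placed; skip the fix and flag it for review
}
```

Returning `-1` instead of applying a best guess matters here: a misplaced security fix is worse than a skipped one.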
Real-time User Experience
The frontend provides a rich, real-time experience using Server-Sent Events:
// SSE endpoint streams pipeline progress
export async function GET(request: Request) {
  const encoder = new TextEncoder()
  const stream = new ReadableStream({
    async start(controller) {
      // Run the pipeline and stream each event as an SSE frame
      for await (const event of autoFixPipeline(repoId, options)) {
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`))
      }
      controller.close()
    }
  })
  return new Response(stream, {
    headers: { 'Content-Type': 'text/event-stream' }
  })
}
Users see live updates as issues are discovered, fixes are generated, and patches are applied.
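On the wire, each event is just a `data: <json>` line followed by a blank line; the browser's `EventSource` handles that framing automatically, but a tiny parser makes the format concrete (and is handy in tests):

```typescript
// Split an SSE response body into its JSON payloads. Frames are
// separated by a blank line; each frame here carries one `data:` field.
function parseSseFrames(body: string): unknown[] {
  return body
    .split('\n\n')
    .filter((frame) => frame.startsWith('data: '))
    .map((frame) => JSON.parse(frame.slice('data: '.length)))
}
```

A client would typically just do `new EventSource(url)` and `JSON.parse(message.data)` in `onmessage`; the parser above is the same logic made explicit.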
Lessons Learned: The Real Development Story
Building this system wasn't without its challenges. Here are the key lessons learned:
Database Schema Evolution Pain
The Challenge: Working with PostgreSQL vector embeddings in Prisma is tricky. Every schema push warned about dropping the embedding vector(1536) column because Prisma doesn't natively support the vector type.
The Solution: Accept the data loss warning and re-add the column manually:
ALTER TABLE workflow_insights ADD COLUMN IF NOT EXISTS embedding vector(1536);
This became the standard workflow — a reminder that bleeding-edge features often require creative workarounds.
Build Tool Wrestling
The Challenge: ESLint configuration issues caused build failures across the entire codebase, blocking development.
The Solution: Use npx next build --no-lint for development builds while keeping the linting issues as a separate cleanup task.
Sometimes you need to choose your battles and maintain development momentum.
Path Resolution Gotchas
The Challenge: TypeScript couldn't resolve a relative import path (../../../../middleware) in the SSE endpoint.
The Solution: Careful path counting and following existing patterns from similar endpoints (../../../middleware).
The lesson: when in doubt, copy what works elsewhere in the codebase.
Integration Points
The system integrates with external tools through webhooks:
// External tools can mark issues as resolved
POST /api/v1/webhooks/auto-fix/resolve
{
  "issueId": "clx123...",
  "resolution": "applied_externally",
  "tool": "cursor"
}
This allows developers using tools like Cursor or GitHub Copilot to provide feedback to the system.
What's Next?
The pipeline is feature-complete, but there's always room for improvement:
- Enhanced AI Models: Experiment with different LLMs for specialized issue types
- Learning System: Track which fixes get accepted to improve future suggestions
- Team Workflows: Add review processes for critical fixes
- Metrics Dashboard: Detailed analytics on fix success rates and impact
The Bigger Picture
This project represents a shift toward AI-assisted development workflows. We're moving from "AI helps me write code" to "AI helps me maintain and improve code." The AutoFix Pipeline is a step toward codebases that can heal themselves, freeing developers to focus on creativity and architecture rather than hunting down bugs.
The future of software development isn't just about writing code faster — it's about creating systems that make our existing code better, safer, and more maintainable. And sometimes, that future arrives at 22:30 on a Tuesday night, one commit at a time.
Want to see this in action? The complete implementation includes 7 tRPC procedures, 4 React components, real-time SSE streaming, and full GitHub integration. The technical details are in the trenches, but the vision is simple: let AI handle the tedious fixes so you can focus on building amazing things.