nyxcore-systems

Building a Self-Improving Code Pipeline: How We Added Memory to Our AI Development Tools

A deep dive into building a closed-loop learning system that makes AI code pipelines smarter over time by learning from their own execution history.

Tags: ai-development, machine-learning, pipelines, automation, typescript, react


What if your development tools could learn from experience, just like human developers do? That's exactly what we set out to build—a system where our AI-powered code pipelines don't just execute tasks, but actually get smarter with each run by remembering what worked (and what didn't).

The Vision: Pipelines That Learn

We started with two solid AI pipelines: AutoFix (for detecting and fixing code issues) and Refactor (for identifying and implementing code improvements). They worked well, but each run was isolated—there was no memory, no learning from past successes or failures.

Our goal was ambitious: create a closed-loop learning system where pipeline findings automatically feed back into memory to improve future runs. Think of it as giving our AI tools the ability to build up expertise over time.

What We Built: Four Key Features

1. Provider Flexibility

First, we needed to make our pipelines more flexible. We added a clean UI for choosing between different LLM providers and models:

typescript
// Added provider selection to both AutoFix and Refactor dialogs
const LLM_PROVIDERS = [
  { id: 'openai', name: 'OpenAI' },
  { id: 'anthropic', name: 'Anthropic' },
  { id: 'local', name: 'Local' }
];

Now users can experiment with different models and see provider badges throughout the UI, making it clear which AI powered each analysis.
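
Those badges are driven by a simple lookup against the same list. The `providerName` helper below is an assumption for illustration, not the production code:

```typescript
// The provider list from the dialogs, plus a lookup used to label badges.
const LLM_PROVIDERS = [
  { id: 'openai', name: 'OpenAI' },
  { id: 'anthropic', name: 'Anthropic' },
  { id: 'local', name: 'Local' }
];

// Resolve a stored provider id to its display name; fall back for unknown ids.
function providerName(id: string): string {
  return LLM_PROVIDERS.find((p) => p.id === id)?.name ?? 'Unknown';
}
```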

2. Better User Experience

We tackled a frustrating phase synchronization bug where the UI would always show "scanning" on page load, even for completed runs. The fix was elegantly simple:

typescript
// Sync UI phase with actual run status
useEffect(() => {
  setCurrentPhase(statusToPhase(run.status));
}, [run.status]);

Sometimes the best solutions are the obvious ones we initially overlook.
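
For context, `statusToPhase` is just a pure mapping from run status to UI phase. The exact status and phase names below are assumptions, but the shape is the point: deriving phase from status means the UI can never disagree with the database.

```typescript
// Sketch of the status-to-phase mapping assumed by the useEffect above.
type RunStatus = 'queued' | 'scanning' | 'patching' | 'completed' | 'failed';
type UIPhase = 'scanning' | 'patching' | 'done' | 'error';

function statusToPhase(status: RunStatus): UIPhase {
  switch (status) {
    case 'queued':
    case 'scanning':
      return 'scanning'; // not started yet still renders as scanning
    case 'patching':
      return 'patching';
    case 'completed':
      return 'done';
    case 'failed':
      return 'error';
  }
}
```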

3. End-to-End Automation

We extended our Refactor pipeline with automatic PR creation. When the system identifies improvements and generates patches, it can now automatically create pull requests in the target repository:

typescript
// Phase 4: Automatic PR creation for single-file changes
if (autoCreatePR && patches.length === 1) {
  const prResult = await createPullRequest(repoUrl, patches[0]);
  await updateRefactorItem(id, { 
    prUrl: prResult.url, 
    prNumber: prResult.number 
  });
}

We kept it conservative—only auto-creating PRs for single-file changes to avoid overwhelming developers with massive multi-file pull requests.
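
Before `createPullRequest` can talk to the hosting provider, it needs the owner/repo pair out of the repository URL. Here's one way to derive it; `parseRepo` is a hypothetical helper, not our actual implementation:

```typescript
// Extract owner and repo from HTTPS or SSH-style GitHub URLs.
function parseRepo(repoUrl: string): { owner: string; repo: string } {
  const m = repoUrl.match(/github\.com[/:]([^/]+)\/([^/.]+)/);
  if (!m) throw new Error(`Unrecognized repo URL: ${repoUrl}`);
  return { owner: m[1], repo: m[2] };
}
```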

4. The Learning Loop: Where the Magic Happens

Here's the crown jewel—our closed-loop learning system. Every time a pipeline completes, it now extracts insights and stores them with vector embeddings for future retrieval:

typescript
// Extract insights after pipeline completion
const insights = await extractPipelineInsights({
  runId: run.id,
  repoUrl,
  results: completedItems,
  workflowType: 'autofix' // or 'refactor'
});

// Store with vector embeddings for semantic search
await storeInsights(insights);
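
Under the hood, retrieval ranks stored embeddings by cosine similarity against the query embedding. A minimal in-memory sketch of that ranking (the embedding generation and the real vector store are out of scope here):

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k stored insights most similar to the query embedding.
function topK(
  query: number[],
  stored: { id: string; vec: number[] }[],
  k: number
): { id: string; score: number }[] {
  return stored
    .map((s) => ({ id: s.id, score: cosine(query, s.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```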

When a new pipeline starts, it searches for relevant historical learnings and injects them into the AI prompts:

typescript
// Inject historical context into prompts
const historicalLearnings = await searchPipelineLearnings({
  repoUrl,
  workflowType: 'autofix',
  limit: 5
});

const enhancedPrompt = `
${basePrompt}

## Historical Learnings
${formatLearnings(historicalLearnings)}
`;
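
`formatLearnings` just needs to render the retrieved records as prompt-friendly text. The record shape below is an assumption about what an insight looks like, not our exact schema:

```typescript
// Minimal shape of a retrieved learning record (assumed for this sketch).
interface Learning {
  summary: string;
  workflowType: string;
}

// Render learnings as a bulleted section; say so explicitly when there are none,
// so the prompt never contains a silently empty section.
function formatLearnings(learnings: Learning[]): string {
  if (learnings.length === 0) return '(no prior learnings for this repository)';
  return learnings.map((l) => `- [${l.workflowType}] ${l.summary}`).join('\n');
}
```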

Lessons Learned: The Messy Reality

Not everything went smoothly. Here are the key challenges we faced:

Database Schema Flexibility

Our biggest hurdle was storing pipeline insights in a table originally designed for workflow-specific insights. Pipeline insights don't belong to a specific workflow, which broke our foreign key constraints.

The solution: Make the workflowId field optional in our Prisma schema. Sometimes the best architecture is the most flexible one:

prisma
model WorkflowInsight {
  id         String    @id @default(cuid())
  workflowId String?   // Made optional for pipeline insights
  workflow   Workflow? @relation(fields: [workflowId], references: [id])
  // ... other fields
}

JSON Field Handling

Working with Prisma's Json fields required careful type casting throughout our codebase:

typescript
// Client-side type assertions for JSON config fields
const config = run.config as Record<string, string>;

It's verbose, but explicit typing prevents runtime surprises.
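
A bare `as` cast still trusts the database blindly, though. A more defensive pattern (a sketch, not what we currently ship) validates the shape at runtime before handing it to the rest of the code:

```typescript
// Validate that an unknown JSON value is a flat string-to-string record.
function asStringRecord(value: unknown): Record<string, string> {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) {
    throw new Error('config is not an object');
  }
  const out: Record<string, string> = {};
  for (const [k, v] of Object.entries(value)) {
    if (typeof v !== 'string') throw new Error(`config.${k} is not a string`);
    out[k] = v;
  }
  return out;
}
```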

State Management Complexity

The phase synchronization bug taught us that client-side state management gets tricky when you have multiple sources of truth. The UI state, database state, and real-time updates all need to stay in sync.

The Results: Smarter Pipelines

After implementing all four features, we now have:

  • Flexible AI provider selection with clear UI indicators
  • Reliable phase tracking that stays in sync with actual execution
  • Automated PR creation for streamlined workflows
  • Self-improving pipelines that learn from historical data

The learning loop is particularly exciting. On subsequent runs, you'll see messages like "Loaded historical learnings from 3 similar analyses" in the execution stream, indicating the AI is now building on past experience.

What's Next

We're already thinking about improvements:

  1. Deduplication logic to avoid storing redundant insights from repeated runs
  2. Learnings dashboard to visualize what the system has learned over time
  3. Cross-repository learning to share insights between similar codebases
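
For the deduplication item, one plausible approach (a sketch under our own assumptions, not the roadmap implementation) is to hash each insight's normalized text and skip hashes we've already stored:

```typescript
import { createHash } from 'node:crypto';

// Normalize whitespace and case, then hash, so trivially reworded
// duplicates collapse to the same key.
function insightKey(text: string): string {
  const normalized = text.trim().toLowerCase().replace(/\s+/g, ' ');
  return createHash('sha256').update(normalized).digest('hex');
}

// Keep only the first occurrence of each distinct insight.
function dedupeInsights(insights: string[]): string[] {
  const seen = new Set<string>();
  return insights.filter((text) => {
    const key = insightKey(text);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```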

The foundation is solid. We've built not just better tools, but tools that get better on their own.


Want to see this in action? The complete implementation spans 11 files with 558 lines of new code, all committed and ready to make your development pipelines smarter. The future of AI-assisted development isn't just about better models—it's about systems that learn and improve from real-world usage.