Building Smarter Developer Tools: From UI Polish to Self-Learning Systems

What started as a simple UI enhancement session turned into architecting a self-improving AI system. Here's how a routine developer tool update evolved into something much more ambitious.

The Simple Ask: Give Users Choice

The initial goal was straightforward—add provider and model selection to our AutoFix and Refactor dialogs. Users wanted control over which AI model powered their code improvements, and honestly, it seemed like a quick win.

What We Built

The implementation touched several key areas:

Dialog Enhancement: Added LLM_PROVIDERS button groups and model inputs to both auto-fix/page.tsx and refactor/page.tsx
Visual Feedback: Provider/model badges now appear in detail headers and run list cards
Backend Integration: Connected the UI choices to the existing tRPC mutations, which already accepted provider/model parameters

typescript

// The backend was ready—we just needed to wire up the frontend
const { mutate: startAutoFix } = trpc.autoFix.start.useMutation({
  onSuccess: (result) => {
    router.push(`/auto-fix/${result.id}`);
  }
});

// Now with provider/model selection
startAutoFix({ 
  provider: selectedProvider, 
  model: selectedModel,
  // ... other params
});

This felt like the kind of incremental improvement that makes users happy—more control, better visibility, clean implementation.
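
To make the dialog wiring concrete, here is a minimal sketch of how a provider button group and a free-text model input could be resolved into the parameters passed to the mutation. The provider ids, default models, and the resolveSelection helper are placeholders for illustration; the real contents of LLM_PROVIDERS aren't shown in this post.

```typescript
// Hypothetical provider registry; ids and default models are placeholders.
const LLM_PROVIDERS = [
  { id: "provider-a", label: "Provider A", defaultModel: "model-a-1" },
  { id: "provider-b", label: "Provider B", defaultModel: "model-b-1" },
] as const;

type ProviderId = (typeof LLM_PROVIDERS)[number]["id"];

interface ModelSelection {
  provider: ProviderId;
  model: string;
}

// Turn raw dialog state into mutation parameters.
// Returns null for an unknown provider; falls back to the provider's
// default model when the model input is left blank.
function resolveSelection(
  provider: string,
  modelInput: string
): ModelSelection | null {
  const entry = LLM_PROVIDERS.find((p) => p.id === provider);
  if (!entry) return null;
  const model = modelInput.trim() || entry.defaultModel;
  return { provider: entry.id, model };
}
```

Resolving the selection in one place means the dialog can disable its submit button whenever the helper returns null, rather than sending a half-filled request to the backend.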

The Plot Twist: A Subtle Bug

But then we hit a classic developer experience issue. Everything worked perfectly... until you navigated away from a running job and came back.

The Problem

Users would start a refactor, watch it progress to the "improving" phase, navigate away to check something else, then return to find the UI showing the "detecting" phase instead. The backend state was correct and the real-time updates worked fine, but the initial page load was lying.

The Root Cause

typescript

// The problematic line
const [currentPhase, setCurrentPhase] = useState<RefactorPhase>("scan");

We were hardcoding the initial phase instead of deriving it from the run's persisted status. While Server-Sent Events (SSE) were flowing, each event corrected the phase. On a fresh page load, though, the UI showed the hardcoded phase until the next event arrived.

The Fix

typescript

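// Sync the displayed phase with the persisted run status once it loads
useEffect(() => {
  if (!run?.status) return;
  // Skip statuses with no phase mapping so the phase never becomes undefined
  const correctPhase = statusToPhase[run.status];
  if (correctPhase) setCurrentPhase(correctPhase);
}, [run?.status]);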

Simple fix, but it highlighted something important: state synchronization between real-time updates and initial page loads is tricky. SSE events handle the dynamic updates, but you still need to bootstrap correctly from persisted state.
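
One way to bootstrap from persisted state, sketched under assumptions: keep the status-to-phase lookup in a pure helper with a fallback, so the same function can seed the initial state and serve the effect. The status names and every phase except "scan" are invented here; the real statusToPhase map isn't shown in this post.

```typescript
// Hypothetical status and phase names; only "scan" appears in our real code above.
type RunStatus = "PENDING" | "SCANNING" | "DETECTING" | "IMPROVING" | "COMPLETED";
type RefactorPhase = "scan" | "detect" | "improve" | "done";

const statusToPhase: Record<RunStatus, RefactorPhase> = {
  PENDING: "scan",
  SCANNING: "scan",
  DETECTING: "detect",
  IMPROVING: "improve",
  COMPLETED: "done",
};

// Own-property check so inherited keys like "toString" are rejected.
function isRunStatus(value: string): value is RunStatus {
  return Object.prototype.hasOwnProperty.call(statusToPhase, value);
}

// Map a persisted status to a UI phase, falling back when the status
// is missing (run not loaded yet) or not recognized.
function phaseForStatus(
  status: string | null | undefined,
  fallback: RefactorPhase = "scan"
): RefactorPhase {
  if (status == null || !isRunStatus(status)) return fallback;
  return statusToPhase[status];
}
```

Because the helper is pure, it could also be passed to useState as a lazy initializer, for example useState(() => phaseForStatus(run?.status)), so that a run already available on first render never flashes the wrong phase. An effect only runs after the first render, so on its own it can't prevent that flash.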

The Bigger Vision: Self-Learning Systems

With the UI improvements wrapped up, we pivoted to something more ambitious: What if our developer tools could learn from their own usage patterns?

The Learning Loop Concept

Instead of static AI prompts that never improve, we're building a closed-loop system:

Capture Insights: Extract patterns from successful (and failed) code transformations
Store Knowledge: Build a memory system that persists learnings across runs
Apply Intelligence: Inject relevant insights into future pipeline prompts
Iterate: Continuously refine based on outcomes
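
The four steps above can be sketched as a small in-memory store. This is only an illustration of the loop's shape, not our design: the Insight fields, the tag-overlap lookup, the successes-minus-failures ranking, and every name below are assumptions, and a real version would need persistent storage.

```typescript
interface Insight {
  id: number;
  tags: string[];
  text: string;
  successes: number;
  failures: number;
}

class InsightStore {
  private insights: Insight[] = [];
  private nextId = 1;

  // Capture: record a lesson extracted from a finished run.
  capture(text: string, tags: string[]): Insight {
    const insight: Insight = {
      id: this.nextId++,
      tags: [...tags],
      text,
      successes: 0,
      failures: 0,
    };
    this.insights.push(insight);
    return insight;
  }

  // Iterate: feed the outcome of a run that used the insight back in.
  recordOutcome(id: number, succeeded: boolean): void {
    const insight = this.insights.find((i) => i.id === id);
    if (!insight) return;
    if (succeeded) insight.successes += 1;
    else insight.failures += 1;
  }

  // Apply: pick insights sharing at least one tag, best track record first.
  relevant(tags: string[], limit = 3): Insight[] {
    const wanted = new Set(tags);
    return this.insights
      .filter((i) => i.tags.some((t) => wanted.has(t)))
      .sort((a, b) => b.successes - b.failures - (a.successes - a.failures))
      .slice(0, limit);
  }
}

// Inject the selected insights into a pipeline prompt.
function buildPrompt(basePrompt: string, insights: Insight[]): string {
  if (insights.length === 0) return basePrompt;
  const notes = insights.map((i) => `- ${i.text}`).join("\n");
  return `${basePrompt}\n\nLessons from earlier runs:\n${notes}`;
}
```

Scoring by outcomes is what closes the loop: an insight that keeps leading to failed transformations sinks in the ranking and eventually stops being injected.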

Research-Driven Development

Rather than diving straight into implementation, we created a structured research approach:

Task 1: Architecture research for insight extraction and memory systems
Task 2: Pipeline data flow analysis to identify feedback points
Tasks 3-6: Implementation phases that depend on research findings

This feels like the right approach for a complex system. The temptation is always to start coding immediately, but investing in upfront research prevents architectural dead ends.

Lessons Learned

1. State Management Across Time

Real-time applications need to handle three scenarios:

Initial page load (bootstrap from persisted state)
Live updates (handle streaming events)
Navigation returns (reconcile stale UI with current reality)

Don't assume your real-time update logic covers all cases.
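
One way to treat all three scenarios uniformly is to funnel snapshots and stream events through the same reducer and let ordering decide which one wins. The sketch below assumes the backend stamps every status change with an increasing sequence number; that seq field and the phase names are hypothetical, and an updatedAt timestamp could play the same role.

```typescript
interface PhaseUpdate {
  source: "snapshot" | "stream"; // page load or refetch vs. live SSE event
  phase: string;
  seq: number; // assumed: increases with every status change
}

interface PhaseState {
  phase: string;
  seq: number;
}

// Before anything has loaded, any real update should win.
const initialState: PhaseState = { phase: "scan", seq: -1 };

// Keep whichever update is newest, regardless of where it came from.
function applyUpdate(state: PhaseState, update: PhaseUpdate): PhaseState {
  if (update.seq <= state.seq) return state; // stale or duplicate: ignore
  return { phase: update.phase, seq: update.seq };
}

// Returning to the page: the snapshot arrives first, then a delayed
// older stream event, then a genuinely newer one.
const updates: PhaseUpdate[] = [
  { source: "snapshot", phase: "improving", seq: 3 },
  { source: "stream", phase: "detecting", seq: 2 },
  { source: "stream", phase: "done", seq: 4 },
];

const finalState = updates.reduce(applyUpdate, initialState);
```

With this shape, a late-arriving event can never drag the UI back to an earlier phase, and the initial load is just another update rather than a special case.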

2. Type Safety with Dynamic Data

Working with Prisma's Json? fields requires careful type casting:

typescript

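// Optional chaining handles a null config; the cast itself is unchecked at runtime
const provider = (run.config as Record<string, string> | null)?.provider;

The type system can't infer the structure of JSON fields, so an explicit cast becomes necessary. Keep in mind that the cast only satisfies the compiler: it doesn't verify that provider is actually a string, so validate the value at runtime wherever a wrong type would matter.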
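
Where a wrong type would actually hurt, a small runtime check is a sturdier alternative to casting. Here is a minimal sketch; the Json type is a local stand-in for Prisma's JSON value type so the example is self-contained, and readStringField is a hypothetical helper rather than part of any library.

```typescript
// Local stand-in for a JSON value, so the sketch needs no imports.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Read a string property from a JSON config, returning undefined when the
// config is missing, is not a plain object, or holds a non-string value.
function readStringField(
  config: Json | undefined,
  key: string
): string | undefined {
  if (config === null || config === undefined) return undefined;
  if (typeof config !== "object" || Array.isArray(config)) return undefined;
  const value = config[key];
  return typeof value === "string" ? value : undefined;
}
```

Unlike the cast, this returns undefined for a config such as { provider: 42 } instead of handing a number to code that expects a string.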

3. Research Before Architecture

For complex systems like learning loops, resist the urge to code first. Structured research tasks help you understand the problem space before committing to implementation approaches.

What's Next

The provider selection feature is live and working well. The phase synchronization bug is fixed and ready to deploy.

But the really exciting work is just beginning—building AI systems that get smarter over time by learning from their own outputs. It's the difference between static tools and evolving intelligence.

The next phase involves spawning research agents to dive deep into memory architectures and pipeline data flows. Once we understand the landscape, we'll design a feedback loop that makes every code transformation teach the system something new.

Sometimes the best development sessions are the ones that start with a simple UI tweak and end with a vision for fundamentally smarter tools.