Building a Live 'Stats for Nerds' Panel: Real-time Pipeline Metrics That Actually Matter
How we built a live dashboard showing token usage, costs, and performance metrics for our AI-powered code analysis pipelines—and why visibility into AI operations is crucial for modern development tools.
Ever wondered what's happening under the hood when your AI-powered development tools are churning away? As developers, we love our metrics—especially the nerdy ones that show exactly how our systems are performing. That's why we built a live "Stats for Nerds" panel that gives real-time visibility into our AutoFix and Refactor pipelines.
The Problem: AI Operations in the Dark
Modern development tools increasingly rely on Large Language Models (LLMs) to analyze code, detect issues, and generate fixes. But here's the thing—these operations are often black boxes. Developers using these tools have no idea:
- How many tokens are being consumed
- What the operations actually cost
- Which models are being used
- How long each phase takes
- Whether the system is being efficient
This lack of visibility makes it impossible to optimize performance, understand costs, or even debug when things go wrong.
The Solution: Real-time Metrics That Matter
We decided to build a collapsible "Stats for Nerds" panel that shows live metrics during pipeline execution. When collapsed, it gives you the essentials at a glance:
12.4k tok · $0.0312 · 5 calls
When expanded, you get the full picture:
- Token usage across all LLM calls
- Cost estimates for the entire operation
- Model information (which LLM provider and version)
- Energy consumption estimates
- Time saved calculations
- Per-phase breakdown with live progress indicators
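The collapsed one-liner is just a formatted string over the three global totals. A minimal sketch of how it might be produced (the helper names here are illustrative, not our exact production code):

```typescript
// Hypothetical helpers for the collapsed one-line summary.
interface NerdSummaryInput {
  totalTokens: number;
  totalCost: number;
  totalCalls: number;
}

// Abbreviate large counts for the compact view: 12400 -> "12.4k"
function formatNumber(n: number): string {
  if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`;
  if (n >= 1_000) return `${(n / 1_000).toFixed(1)}k`;
  return String(n);
}

function formatNerdSummary({ totalTokens, totalCost, totalCalls }: NerdSummaryInput): string {
  return `${formatNumber(totalTokens)} tok · $${totalCost.toFixed(4)} · ${totalCalls} calls`;
}

// formatNerdSummary({ totalTokens: 12400, totalCost: 0.0312, totalCalls: 5 })
// -> "12.4k tok · $0.0312 · 5 calls"
```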
The Technical Architecture
Data Structure Design
First, we created a shared interface to standardize our metrics:
```typescript
// src/types/nerd-stats.ts
export interface NerdStatsData {
  totalTokens: number;
  totalCost: number;
  totalCalls: number;
  model?: string;
  provider?: string;
  phases: {
    [phaseName: string]: {
      tokens: number;
      cost: number;
      calls: number;
      startTime: number;
      endTime?: number;
      isActive: boolean;
    };
  };
}
```
This structure captures both global totals and per-phase breakdowns, which is crucial for understanding where time and resources are being spent.
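To make the shape concrete, here is an invented mid-run snapshot with one finished phase and one active phase, plus a small duration helper (the interface is repeated only to keep the snippet self-contained):

```typescript
// Interface repeated from above so the snippet stands alone.
interface NerdStatsData {
  totalTokens: number;
  totalCost: number;
  totalCalls: number;
  model?: string;
  provider?: string;
  phases: {
    [phaseName: string]: {
      tokens: number; cost: number; calls: number;
      startTime: number; endTime?: number; isActive: boolean;
    };
  };
}

// Illustrative snapshot mid-run (all values invented for the example).
const snapshot: NerdStatsData = {
  totalTokens: 12400,
  totalCost: 0.0312,
  totalCalls: 5,
  phases: {
    // "analyze" finished; "generateFix" is still running (no endTime yet).
    analyze: { tokens: 8200, cost: 0.021, calls: 3, startTime: 1_000, endTime: 5_500, isActive: false },
    generateFix: { tokens: 4200, cost: 0.0102, calls: 2, startTime: 5_500, isActive: true },
  },
};

// Finished phases use endTime; active ones fall back to the current clock.
function phaseDurationMs(p: NerdStatsData["phases"][string], now = Date.now()): number {
  return (p.endTime ?? now) - p.startTime;
}
```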
Pipeline Integration
The real magic happens in the pipeline orchestrators. We added metric collection at every step without disrupting the existing flow:
```typescript
// Simplified example from the AutoFix pipeline
class AutoFixPipeline {
  private nerdStats: NerdStatsData = {
    totalTokens: 0,
    totalCost: 0,
    totalCalls: 0,
    phases: {}
  };

  private markPhaseStart(phaseName: string) {
    this.nerdStats.phases[phaseName] = {
      tokens: 0, cost: 0, calls: 0,
      startTime: Date.now(),
      isActive: true
    };
  }

  private markPhaseEnd(phaseName: string) {
    const phase = this.nerdStats.phases[phaseName];
    phase.endTime = Date.now();
    phase.isActive = false;
  }

  private accumulateNerd(phaseName: string, result: LLMCompletionResult) {
    const phase = this.nerdStats.phases[phaseName];
    phase.tokens += result.tokenUsage || 0;
    phase.cost += result.costEstimate || 0;
    phase.calls += 1;

    // Update totals
    this.nerdStats.totalTokens += result.tokenUsage || 0;
    this.nerdStats.totalCost += result.costEstimate || 0;
    this.nerdStats.totalCalls += 1;
  }
}
```
Real-time Updates via Server-Sent Events
The beauty of this system is that it works with our existing Server-Sent Events (SSE) infrastructure. Every event we emit now includes the current nerdStats, so the UI updates in real-time as the pipeline progresses.
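As a rough sketch, each pipeline event only needs to be serialized into the standard SSE wire format with the current `nerdStats` snapshot attached. The event and field names below are assumptions, not our exact wire format:

```typescript
// Sketch: serialize a pipeline event as an SSE frame carrying nerdStats.
// Field names are illustrative.
interface PipelineEvent {
  type: string;        // e.g. "phase-progress" (hypothetical event name)
  phase?: string;
  nerdStats: unknown;  // current NerdStatsData snapshot
}

function toSseFrame(event: PipelineEvent): string {
  // SSE frames are "event:" / "data:" lines terminated by a blank line.
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}
```

On the client, a standard `EventSource` listener can parse `event.data` with `JSON.parse` and feed the embedded `nerdStats` straight into component state.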
Smart Persistence
Here's a neat trick: instead of adding new database columns, we store the final nerdStats inside the existing JSON stats column. This means:
- No database migrations required
- Backward compatibility with existing runs
- Easy to extend with new metrics later
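The persistence step reduces to a JSON merge: read whatever is already in the stats column, attach the final `nerdStats` under its own key, and write the result back. A minimal sketch, assuming the column stores a JSON string (column and key names are assumptions):

```typescript
// Sketch: fold the final nerdStats into the existing JSON `stats` column
// instead of adding new database columns.
function mergeNerdStatsIntoStats(existingStatsJson: string | null, nerdStats: object): string {
  // Older rows may have a null/empty stats column; treat that as an empty object.
  const stats = existingStatsJson ? JSON.parse(existingStatsJson) : {};
  return JSON.stringify({ ...stats, nerdStats });
}
```

Because the spread preserves every existing key, older fields in the stats column survive untouched, which is what makes the approach backward compatible.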
The UI Component
The NerdStats component is designed for both quick glances and deep dives:
```typescript
import { useState } from "react";

export function NerdStats({ data, className }: NerdStatsProps) {
  const [isExpanded, setIsExpanded] = useState(false);

  if (!data) return null;

  return (
    <Card className={className}>
      <CardHeader
        className="cursor-pointer"
        onClick={() => setIsExpanded(!isExpanded)}
      >
        {/* Collapsed view: essential metrics */}
        <div className="text-sm text-muted-foreground">
          {formatNumber(data.totalTokens)} tok ·{" "}
          ${data.totalCost.toFixed(4)} ·{" "}
          {data.totalCalls} calls
        </div>
      </CardHeader>
      {isExpanded && (
        <CardContent>
          {/* Expanded view: detailed grid + per-phase table */}
          <MetricsGrid data={data} />
          <PhaseBreakdown phases={data.phases} />
        </CardContent>
      )}
    </Card>
  );
}
```
The per-phase table even shows pulsing indicators for active phases, giving users a clear sense of progress.
Lessons Learned
1. Leverage Existing Infrastructure
The biggest win was realizing we didn't need to build new real-time infrastructure. Our existing SSE system handled metric updates perfectly once we added nerdStats to our event payloads.
2. Design for Extensibility
By using a flexible JSON structure for metrics storage, we can easily add new metrics (memory usage, API latency, etc.) without database changes.
3. Progressive Disclosure Works
The collapsed/expanded pattern means power users get the detail they want without overwhelming casual users. The compact view shows just enough to be useful.
4. Backward Compatibility Matters
Older pipeline runs don't have nerdStats, so we gracefully handle missing data rather than showing broken UI.
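In practice that graceful handling is a small normalization step before rendering. A sketch of the idea, with assumed field names (the real reader lives elsewhere in our codebase):

```typescript
// Sketch: older runs persisted no nerdStats, so normalize before rendering.
// Returning null lets the NerdStats component render nothing instead of breaking.
interface StoredStats {
  nerdStats?: { totalTokens: number; totalCost: number; totalCalls: number };
}

function readNerdStats(statsJson: string | null) {
  if (!statsJson) return null;
  try {
    const parsed: StoredStats = JSON.parse(statsJson);
    return parsed.nerdStats ?? null;
  } catch {
    return null; // malformed legacy payloads also degrade gracefully
  }
}
```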
The Impact
Since launching this feature, we've seen several benefits:
- Debugging is faster: When a pipeline seems slow, we can immediately see which phase is the bottleneck
- Cost awareness: Developers can see the real cost of their operations and make informed decisions
- Performance optimization: We've identified several opportunities to reduce token usage based on the metrics
- User confidence: Seeing real-time progress reduces anxiety during long-running operations
What's Next
This pattern worked so well that we're planning to extend it to our Code Analysis pipeline. The same architecture should work with minimal changes—capture metrics from provider.complete() calls and accumulate them in the orchestrator.
We're also considering adding more advanced metrics like:
- API response times
- Memory usage per phase
- Cache hit rates
- Quality scores for generated fixes
Try It Yourself
If you're building AI-powered developer tools, consider adding similar visibility. The key principles are:
- Capture metrics at the source (where you call the LLM APIs)
- Accumulate progressively throughout your pipeline
- Use existing real-time infrastructure when possible
- Design for both quick glances and deep dives
- Store metrics for historical analysis
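The first principle, capturing at the source, can be as simple as wrapping your LLM client so every call accumulates usage before the result reaches the rest of the pipeline. A minimal sketch, with an assumed provider interface:

```typescript
// Sketch of "capture metrics at the source": decorate the LLM client so
// every completion call updates running totals. Interfaces are illustrative.
interface CompletionResult {
  text: string;
  tokenUsage?: number;
  costEstimate?: number;
}

interface Provider {
  complete(prompt: string): Promise<CompletionResult>;
}

function withMetrics(
  provider: Provider,
  totals: { tokens: number; cost: number; calls: number }
): Provider {
  return {
    async complete(prompt: string) {
      const result = await provider.complete(prompt);
      // Accumulate before returning; missing fields count as zero.
      totals.tokens += result.tokenUsage ?? 0;
      totals.cost += result.costEstimate ?? 0;
      totals.calls += 1;
      return result;
    },
  };
}
```

Because the wrapper exposes the same `Provider` interface, the rest of the pipeline needs no changes; metrics collection stays invisible to callers.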
Your users will thank you for the transparency, and you'll gain invaluable insights into how your AI systems actually perform in production.
Want to see this in action? Check out our AutoFix pipeline where this system is running in production.