nyxcore-systems

From Codebase to Clarity: Architecting Ipcha's New Repo-Level Audit Engine

Dive into the design process behind Ipcha's latest feature: a sophisticated repo-level audit and stress testing system, leveraging smart file prioritization and multi-tier analysis to keep our codebases healthy.

Ipcha · System Design · Code Audit · Developer Tools · AI-driven Development · Self-Testing

It was a late afternoon, a date with destiny (or at least, with a new feature design), and the air was thick with the hum of possibilities. At Ipcha, our self-testing system, we've been pushing the boundaries of automated quality assurance. We started with individual tests, then moved to comprehensive reporting. Now, it's time to level up: introducing repo-level audit and stress testing.

This past session was purely design-focused, a deep dive into how we'd build a system capable of not just running tests, but truly understanding and evaluating the health of an entire codebase. The goal was ambitious: design and implement a robust system that could audit our repositories, identify potential issues, and even stress-test them. I'm thrilled to report that the design is now approved and committed, laying a solid foundation for the implementation phase.

The Mission: A Smarter Codebase Guardian

Our existing Ipcha system is fantastic for ensuring individual components are working as expected. But what about the bigger picture? How do we proactively identify areas of technical debt, detect breaking changes before they become critical, or simply understand the overall "health" of a repository? This is where repo-level auditing comes in. It’s about moving from reactive bug fixing to proactive codebase stewardship.

The design phase was intense, requiring answers to six critical design questions that shaped the architecture. The result is a detailed plan, now living at docs/plans/2026-03-09-repo-audit-design.md (commit a350ad8).

Architecting the Audit Engine: Key Design Decisions

Building a system to audit an entire repository isn't trivial. Here's a look at some of the key decisions we made to ensure efficiency, accuracy, and scalability:

1. Hybrid File Sourcing: Local & Remote Agility

We recognized that our code lives in different places. For internal, core components (like nyxCore), we'll leverage the local file system for speed and direct access. For linked project repositories, however, we'll integrate directly with the GitHub API. This hybrid approach gives us the best of both worlds: deep local introspection and broad remote reach.
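To make the hybrid idea concrete, here's a minimal sketch of both backends behind one interface. All names here are illustrative rather than Ipcha's real API: the local backend simply reads from disk, and the GitHub backend hits the public REST contents route (with auth and rate-limit handling omitted).

```typescript
import { readFile } from "node:fs/promises";

// One interface, two backends: local disk for core repos, GitHub API for linked ones.
export interface FileSource {
  read(path: string): Promise<string>;
}

export const localSource: FileSource = {
  read: (path) => readFile(path, "utf8"),
};

// Builds the GitHub contents endpoint for a file at a given ref.
export function contentsUrl(
  owner: string,
  repo: string,
  path: string,
  ref: string,
): string {
  return `https://api.github.com/repos/${owner}/${repo}/contents/${path}?ref=${ref}`;
}

export function gitHubSource(owner: string, repo: string, ref: string): FileSource {
  return {
    read: async (path) => {
      // Auth headers and retry logic are omitted from this sketch.
      const res = await fetch(contentsUrl(owner, repo, path, ref), {
        headers: { Accept: "application/vnd.github.raw+json" },
      });
      if (!res.ok) throw new Error(`GitHub fetch failed: ${res.status}`);
      return res.text();
    },
  };
}
```

Because both backends satisfy the same interface, the audit engine above them never needs to know where a file physically lives.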

2. The Art of Smart File Prioritization

Auditing every single file in a large repository can be resource-intensive and often unnecessary. We needed a smarter way to focus our efforts. Our solution? A hybrid file selection mechanism:

  • User-defined Globs: Developers can specify patterns (e.g., src/**/*.ts) to target specific areas.
  • Intelligent Prioritization: This is where it gets interesting. We'll use a weighted system to prioritize files that are most likely to benefit from an audit:
    • Churn (0.35): Files that change frequently often introduce new bugs or complexity.
    • Size (0.25): Large files can be harder to maintain and understand.
    • Imports (0.20): Files with many dependencies might indicate tightly coupled code.
    • Staleness (0.20): Older files that haven't been touched in a while might contain outdated practices or subtle issues.

This prioritization ensures our audit resources are spent where they matter most, providing maximum impact.
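The weighting above can be sketched as a simple scoring function. Only the weights (churn 0.35, size 0.25, imports 0.20, staleness 0.20) come from the design; the metric names and the max-normalization are assumptions for illustration.

```typescript
interface FileMetrics {
  path: string;
  churn: number;          // commits touching the file in the lookback window
  sizeBytes: number;
  importCount: number;
  daysSinceTouch: number;
}

// Weights from the design doc: churn 0.35, size 0.25, imports 0.20, staleness 0.20.
const WEIGHTS = { churn: 0.35, size: 0.25, imports: 0.2, staleness: 0.2 };

// Normalize each metric to [0, 1] relative to the max in the candidate set.
function normalize(values: number[]): number[] {
  const max = Math.max(...values, 1);
  return values.map((v) => v / max);
}

function prioritize(files: FileMetrics[], topN: number): FileMetrics[] {
  const churn = normalize(files.map((f) => f.churn));
  const size = normalize(files.map((f) => f.sizeBytes));
  const imports = normalize(files.map((f) => f.importCount));
  const stale = normalize(files.map((f) => f.daysSinceTouch));

  return files
    .map((f, i) => ({
      file: f,
      score:
        WEIGHTS.churn * churn[i] +
        WEIGHTS.size * size[i] +
        WEIGHTS.imports * imports[i] +
        WEIGHTS.staleness * stale[i],
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
    .map((s) => s.file);
}
```

Normalizing within the candidate set keeps the score relative: a 5,000-line file only looks "large" compared to its peers in the same run.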

3. Layered Analysis for Efficiency

Once files are selected, how do we analyze them without grinding to a halt? We adopted a layered approach:

  • Summary Pass First: An initial, lightweight pass across all prioritized files to quickly flag potential areas of concern.
  • Deep-Dive on Flagged Files: Only files identified as potentially problematic during the summary pass will undergo a more thorough, resource-intensive deep-dive analysis. This prevents unnecessary deep analysis of perfectly healthy code.
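The two-pass flow boils down to a small pipeline. This is a sketch with hypothetical `summarize` and `deepDive` analyzers passed in as functions; it's written synchronously for clarity, whereas the real services would be async calls.

```typescript
interface SummaryResult {
  path: string;
  flagged: boolean;
}

function runLayeredAudit(
  paths: string[],
  summarize: (path: string) => SummaryResult,
  deepDive: (path: string) => string,
): Map<string, string> {
  // Pass 1: lightweight summary across every prioritized file.
  const flagged = paths.map(summarize).filter((s) => s.flagged);

  // Pass 2: expensive deep-dive only on files the summary pass flagged.
  const findings = new Map<string, string>();
  for (const s of flagged) {
    findings.set(s.path, deepDive(s.path));
  }
  return findings;
}
```

The key property is that the deep-dive cost scales with the number of flagged files, not the size of the repository.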

4. Two-Tier Workflows with AI Augmentation

To execute these analyses, we designed a two-tier workflow:

  • Tier 1 (Batched by Directory): For the summary pass, we'll process files in batches, grouped by directory. This helps in understanding contextual issues within a module or component.
  • Tier 2 (Per-File with Axiom RAG Chunking): For the deep-dive, each flagged file will be analyzed individually. Crucially, we'll leverage Axiom RAG (Retrieval-Augmented Generation) chunking. This allows our AI-powered analysis to pull in relevant context (e.g., related code, documentation, previous audit findings) to provide more accurate and insightful feedback on individual files.
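The Tier 1 grouping step can be sketched as follows: batch prioritized files by their parent directory so each summary call sees module-level context together. The batch-size cap is an assumption, not a value from the design.

```typescript
// Groups file paths by parent directory, splitting oversized groups into
// chunks of at most maxBatch files.
function batchByDirectory(paths: string[], maxBatch = 20): string[][] {
  const byDir = new Map<string, string[]>();
  for (const p of paths) {
    const dir = p.includes("/") ? p.slice(0, p.lastIndexOf("/")) : ".";
    const bucket = byDir.get(dir) ?? [];
    bucket.push(p);
    byDir.set(dir, bucket);
  }

  const batches: string[][] = [];
  for (const files of byDir.values()) {
    for (let i = 0; i < files.length; i += maxBatch) {
      batches.push(files.slice(i, i + maxBatch));
    }
  }
  return batches;
}
```

Keeping sibling files in the same batch is what lets the summary pass spot module-level issues (e.g., duplicated logic across one directory) that a file-at-a-time pass would miss.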

5. Seamless Integration with Existing Ipcha Targets

We didn't want to reinvent the wheel for target management. The new repo-level audit will integrate as an extension of our existing target system. We'll introduce a new repo target type, where the source field will store a JSON configuration detailing the repository URL, branches, and specific audit settings. These repo targets will seamlessly join the existing targetsPerRun rotation, ensuring they are regularly audited as part of Ipcha's normal operations.
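As a concrete illustration, the JSON stored in a repo target's source field might look like the shape below. The field names here are hypothetical; the design only specifies that the config carries the repository URL, branches, and audit settings.

```typescript
// Assumed shape of the JSON config a `repo` target stores in its source field.
interface RepoAuditConfig {
  repoUrl: string;
  branches: string[];
  audit: {
    globs: string[];   // user-defined include patterns, e.g. "src/**/*.ts"
    maxFiles: number;  // prioritization budget per run
  };
}

function parseRepoTargetSource(source: string): RepoAuditConfig {
  const cfg = JSON.parse(source) as RepoAuditConfig;
  if (!cfg.repoUrl || !Array.isArray(cfg.branches)) {
    throw new Error("invalid repo target config");
  }
  return cfg;
}
```

Validating at parse time means a malformed repo target fails fast when it comes up in the targetsPerRun rotation, rather than mid-audit.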

Smooth Sailing: A Design Session Without Major Hitches

One of the most encouraging aspects of this session was the complete absence of major issues. This wasn't a "pain-free" session by accident; it's a testament to the thorough groundwork laid in previous brainstorming and planning, and the collaborative focus of the team. We were able to address all six core design questions systematically, leading to a clear, coherent, and approved design document. Sometimes, a smooth design process is a success story in itself!

What's Next: Bringing the Vision to Life

With the design locked in, the immediate next steps are all about execution.

  1. Implementation Plan: Our first task is to formalize the implementation plan, leveraging our superpowers:writing-plans skill for a structured approach.
  2. Subagent-Driven Development: We're excited to tackle the implementation using our subagent-driven development approach, allowing us to rapidly prototype and build out the new services.
  3. New Files & Modifications: Expect to see new services like repo-audit-service.ts and file-prioritizer.ts, along with their respective tests. We'll also be modifying existing core components like audit-service.ts, the audit.ts router, workflow-engine.ts, and the Ipcha page itself to integrate the new functionality.
  4. Schema Evolution: Our database schema will evolve to support the new audit capabilities, adding fields like parentRunId, tier, and filePath to the AuditRun model for richer tracking.
  5. Production Readiness: Essential housekeeping items from previous sessions include setting the AUDIT_CRON_SECRET on our production server and configuring the cron job to trigger hourly audits. We've also provisioned three new tables in production: audit_targets, audit_schedules, and audit_runs.
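The schema evolution in step 4 can be sketched as a record shape. Only parentRunId, tier, and filePath come from the plan; the other fields and the id scheme are illustrative assumptions.

```typescript
type AuditTier = 1 | 2;

// Assumed shape of the extended AuditRun record.
interface AuditRun {
  id: string;
  targetId: string;
  parentRunId: string | null; // links a Tier 2 deep-dive back to its Tier 1 run
  tier: AuditTier;            // 1 = batched summary, 2 = per-file deep-dive
  filePath: string | null;    // populated only for per-file Tier 2 runs
}

// Derives a Tier 2 per-file run from its Tier 1 parent; the id scheme
// below is a placeholder.
function childRun(parent: AuditRun, filePath: string): AuditRun {
  return {
    id: `${parent.id}:${filePath}`,
    targetId: parent.targetId,
    parentRunId: parent.id,
    tier: 2,
    filePath,
  };
}
```

With parentRunId in place, reporting can roll every per-file deep-dive up under the summary run that spawned it.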

Beyond this immediate rollout, we still have fascinating brainstorm topics on the horizon, including "persona rental" and "CKB integration" – but those are stories for another day!

We're incredibly excited to move into the implementation phase. This repo-level audit engine will significantly enhance Ipcha's ability to maintain high code quality, identify issues proactively, and ultimately, help us build better software faster. Stay tuned for updates as we bring this vision to life!