Unlocking Code Intelligence: A Deep Dive into Our New CKB Integration

It’s just past midnight, and I've finally hit a major milestone: full integration of our new Code Knowledge Backend (CKB). This wasn't just another feature; it was a multi-phase, full-stack endeavor to bake deep code intelligence directly into our platform. We’re talking everything from Docker containers and Prisma models to a snazzy new UI and even a custom template variable for AI prompts.

Let's break down how we got here, the technical decisions we made, and some of the thorny problems we navigated.

Phase 1: Laying the Foundation – The Core CKB Integration

The CKB is designed to be an external, heavy-lifting analysis tool. Our primary goal in Phase 1 was to integrate it seamlessly into our existing backend architecture.

Dockerizing the Brains

First up, getting the CKB itself running. We opted for a dedicated Docker worker container (ghcr.io/simplyliz/ckb:latest). This keeps the CKB isolated and scalable.

yaml

# Excerpt from docker-compose.yml
ckb:
  image: ghcr.io/simplyliz/ckb:latest
  command: ["sleep", "infinity"] # Keep it alive, we'll `docker exec` into it
  volumes:
    - ckb_repos:/app/repos # Shared volume for codebases
  healthcheck:
    test: ["CMD", "ckb", "version"]
    interval: 30s
    timeout: 10s
    retries: 3

The sleep infinity command is a classic pattern for worker containers we intend to interact with via docker exec. This way, the container stays up, but doesn't consume CPU until we explicitly tell it to do something. A shared volume (ckb_repos) was crucial for the CKB to store and access cloned repositories. And, of course, a healthcheck ensures we know the CKB binary is actually available.

Data Modeling with Prisma

To track the analysis status and cache results for each project, we introduced a new ProjectCkbIndex model in Prisma. This model links directly to our Project and Tenant entities, ensuring proper data relationships and enabling our Row-Level Security (RLS) policies.

prisma

// prisma/schema.prisma
model ProjectCkbIndex {
  id            String    @id @default(uuid())
  projectId     String    @unique
  project       Project   @relation(fields: [projectId], references: [id], onDelete: Cascade)
  status        CkbStatus @default(PENDING)
  analysisCache Json?
  createdAt     DateTime  @default(now())
  updatedAt     DateTime  @updatedAt
  tenantId      String
  tenant        Tenant    @relation(fields: [tenantId], references: [id])

  @@index([tenantId])
}

enum CkbStatus {
  PENDING
  PROCESSING
  COMPLETED
  FAILED
}

The analysisCache field, a Json? type, is where we store the aggregated results of various CKB analyses. This allows us to serve cached data quickly without re-running computations on every request.
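As a minimal sketch of that read path (not our production code), the function below decides whether a row can be served from cache. The row shape mirrors the two model fields it needs; readCachedAnalysis and the CacheRead type are hypothetical names used only for illustration.

```typescript
// Mirrors the CkbStatus enum from the Prisma schema.
type CkbStatus = "PENDING" | "PROCESSING" | "COMPLETED" | "FAILED";

// Only the two ProjectCkbIndex fields this sketch needs.
interface CkbIndexRow {
  status: CkbStatus;
  analysisCache: unknown; // null when nothing has been cached yet
}

// Hypothetical result type: cached data, or the reason none is available.
type CacheRead =
  | { kind: "hit"; data: unknown }
  | { kind: "miss"; reason: "not-indexed" | "in-progress" | "failed" | "empty" };

function readCachedAnalysis(row: CkbIndexRow | null): CacheRead {
  if (row === null) return { kind: "miss", reason: "not-indexed" };
  if (row.status === "PENDING" || row.status === "PROCESSING") {
    return { kind: "miss", reason: "in-progress" };
  }
  if (row.status === "FAILED") return { kind: "miss", reason: "failed" };
  if (row.analysisCache == null) return { kind: "miss", reason: "empty" };
  return { kind: "hit", data: row.analysisCache };
}
```

Returning a reason for every miss lets the caller map each case to a distinct UI state instead of treating all of them as "no data".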

The CKB Client Service

This was the heart of the integration: src/server/services/ckb-client.ts. This service acts as our internal wrapper, orchestrating docker exec commands to interact with the CKB container. It handles repository cloning, pulling, deletion, and exposes 13 distinct analysis functions (e.g., architecture, hotspots, coupling, complexity). A central runFullAnalysis() method orchestrates a sequential execution of these.

typescript

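// Excerpt from src/server/services/ckb-client.ts
// A simplified example of how we shell out to Docker
import { execFile as execFileCallback } from 'node:child_process';
import { promisify } from 'node:util';

// `await` needs the promisified execFile, which resolves to { stdout, stderr }
const execFile = promisify(execFileCallback);

// validateProjectPath, logger, and CKB_COMMAND_TIMEOUT are defined elsewhere in the service
async function runCkbCommand(
  projectId: string,
  command: string,
  args: string[]
): Promise<string> {
  const containerName = process.env.CKB_CONTAINER_NAME || 'nyxcore-ckb-1';
  // Basic path validation to prevent traversal attacks
  validateProjectPath(projectId);

  // Arguments are passed as an array, so no shell ever interprets them
  const { stdout, stderr } = await execFile(
    'docker',
    ['exec', containerName, 'ckb', command, ...args],
    { timeout: CKB_COMMAND_TIMEOUT }
  );

  if (stderr) {
    logger.warn(`CKB command stderr for project ${projectId}: ${stderr}`);
  }
  return stdout;
}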

Security was paramount here, especially with docker exec and arbitrary project IDs. Robust path validation (validateProjectPath) was implemented to prevent any potential path traversal vulnerabilities.
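To make the idea concrete, here is a minimal sketch of what such a check could look like. It is not the actual validateProjectPath. It assumes project IDs are the UUIDs produced by Prisma's @default(uuid()) and that repositories live under /app/repos, as in the compose excerpt. The name resolveRepoPath is hypothetical.

```typescript
// Assumption: IDs come from Prisma's @default(uuid()), so a strict allowlist
// pattern rejects path separators, dots, and anything else unexpected.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Assumption: matches the ckb_repos mount point from docker-compose.yml.
const REPOS_ROOT = "/app/repos";

// Hypothetical helper: returns the repo path inside the container, or throws.
function resolveRepoPath(projectId: string): string {
  if (!UUID_PATTERN.test(projectId)) {
    throw new Error("Invalid project ID");
  }
  return `${REPOS_ROOT}/${projectId}`;
}
```

An allowlist is used here instead of stripping sequences like ../ because it fails closed: any ID that does not look exactly like a UUID is rejected before a path is ever built.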

A Robust tRPC API

To expose CKB functionality to our frontend, we built a comprehensive tRPC router (src/server/trpc/routers/ckb.ts). This includes procedures for checking status, triggering re-indexing, and fetching cached analysis results. Critically, some analyses like coupling or callGraph are run live by the CKB client for real-time insights based on user input.

Intelligent Content Loading with `{{ckb}}`

One of the cooler features is integrating CKB insights directly into our AI prompt engine. We introduced a {{ckb}} template variable that the workflow engine resolves. This allows users to dynamically inject relevant code insights (like security audit findings or hotspots) into their AI prompts, providing crucial context.
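The substitution step itself can be sketched in a few lines. This is an illustration only, not the workflow engine's real resolver: resolveTemplate is a hypothetical name, and the sample prompt and insight text are made up.

```typescript
// Hypothetical resolver: replaces every {{name}} placeholder with the matching
// value from `variables` and leaves unknown placeholders untouched.
function resolveTemplate(
  template: string,
  variables: Record<string, string>
): string {
  return template.replace(
    /\{\{\s*(\w+)\s*\}\}/g,
    (placeholder: string, name: string): string => {
      const value = Object.prototype.hasOwnProperty.call(variables, name)
        ? variables[name]
        : undefined;
      return value !== undefined ? value : placeholder;
    }
  );
}

// Sample usage with made-up insight text.
const resolvedPrompt = resolveTemplate(
  "Review this change.\n\nCode context:\n{{ckb}}",
  { ckb: "Hotspots: (sample data)" }
);
```

Leaving unknown placeholders in place, rather than replacing them with an empty string, makes a misspelled variable visible in the final prompt.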

This content is loaded via src/server/services/ckb-content-loader.ts, which includes Redis caching (1-hour TTL, 8K char max) for performance and invalidateCkbCache() for freshness. We also added specific formatting and truncation logic, with a keen eye on security to ensure sensitive data (like full file paths in audit findings) isn't accidentally leaked.
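Here is a rough sketch of the formatting and truncation idea, under stated assumptions: the AuditFinding shape, the function names, and the choice to keep only the file name are all hypothetical. Only the 8K character limit comes from the description above.

```typescript
const MAX_CKB_CONTENT_CHARS = 8000; // the 8K char limit described above

// Hypothetical finding shape, for illustration only.
interface AuditFinding {
  severity: "low" | "medium" | "high" | "critical";
  message: string;
  filePath: string;
}

// Keep only the file name so full paths never reach the prompt.
function redactPath(filePath: string): string {
  const segments = filePath.split(/[\\/]/);
  return segments[segments.length - 1] || filePath;
}

// Formats findings one per line and caps the result at maxChars.
// Assumes maxChars is larger than the truncation marker.
function formatFindings(
  findings: AuditFinding[],
  maxChars: number = MAX_CKB_CONTENT_CHARS
): string {
  const text = findings
    .map((f) => `[${f.severity}] ${f.message} (${redactPath(f.filePath)})`)
    .join("\n");
  if (text.length <= maxChars) return text;
  const marker = "\n[truncated]";
  return text.slice(0, maxChars - marker.length) + marker;
}
```

Redacting before truncating matters: if the order were reversed, a cut could land in the middle of a path and leave a partial directory name in the output.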

Auto-Indexing & RLS

Finally, we wired up automatic CKB indexing to project creation/updates when a GitHub repository is linked. This "fire-and-forget" operation ensures that projects are analyzed without manual intervention. On the security front, Row-Level Security (tenant_isolation_project_ckb_indexes policy) was applied to the new project_ckb_indexes table, ensuring strict multi-tenant data isolation.
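The fire-and-forget part is easy to get subtly wrong, because a rejected promise that nobody awaits becomes an unhandled rejection. A minimal sketch of the pattern, with the hypothetical helper name fireAndForget:

```typescript
// Hypothetical helper: starts `task` without making the caller wait for it.
// Failures are routed to `onError` so they never become unhandled rejections.
function fireAndForget(
  task: () => Promise<void>,
  onError: (error: unknown) => void
): void {
  // Promise.resolve().then(...) defers the task to a microtask and also
  // captures synchronous throws from `task`.
  void Promise.resolve().then(task).catch(onError);
}

// Example wiring; the indexing call here is a stand-in, not the real one.
const indexingErrors: unknown[] = [];
fireAndForget(
  async () => {
    // await indexProject(projectId) would go here
  },
  (error) => indexingErrors.push(error)
);
```

The trade-off is that the request which created the project never sees an indexing failure, so the error handler has to record it somewhere visible, for example by setting the index status to FAILED.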

Phase 2: Bringing it to Life – The Code Intelligence Page

With the backend humming, Phase 2 was all about making these insights accessible and actionable for our users. This culminated in the new Code Intelligence tab (src/components/projects/code-intelligence-tab.tsx).

This 500+ line component is a powerhouse, offering a holistic view of a project's codebase:

- Overview Cards: Quick summaries of Architecture (module/layer count), Hotspots (top 10 by risk, color-coded), Security Audit (severity breakdown + finding details), and Dead Code (list of unused symbols).
- Detail Sections: Deeper dives into specific aspects:
  - Coupling Analysis: Search for a file and see its co-change partners.
  - File Complexity: Cyclomatic and cognitive complexity metrics per function.
  - Ownership: Author percentages for files and modules.
- Robust UX: We built in graceful degradation for various states: "CKB not configured," "Link a GitHub repository," a processing spinner during analysis, and clear error displays. A "Re-analyze" button triggers the ckb.reindex mutation, giving users control.
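The graceful-degradation states above boil down to a small decision function. The sketch below is an assumption about how that could be modeled, not an excerpt from the component; TabState, TabInput, and deriveTabState are hypothetical names.

```typescript
// Mirrors the CkbStatus enum from the Prisma schema.
type CkbStatus = "PENDING" | "PROCESSING" | "COMPLETED" | "FAILED";

// One value per UI state listed above.
type TabState =
  | "not-configured"
  | "no-repository"
  | "processing"
  | "error"
  | "ready";

interface TabInput {
  ckbConfigured: boolean;
  hasGithubRepo: boolean;
  status: CkbStatus | null; // null when no index row exists yet
}

function deriveTabState(input: TabInput): TabState {
  if (!input.ckbConfigured) return "not-configured";
  if (!input.hasGithubRepo) return "no-repository";
  if (input.status === "FAILED") return "error";
  if (input.status === "COMPLETED") return "ready";
  return "processing"; // PENDING, PROCESSING, or not yet created
}
```

Keeping this logic in a pure function means each state can be unit-tested without rendering the component.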

Adding this tab to our dashboard/projects/[id]/page.tsx was the final touch, making Code Intelligence a first-class citizen in our project views.

Navigating the Treacherous Waters: Lessons Learned

No complex integration goes off without a hitch. Here are a few "gotchas" and the solutions we implemented:

1. Prisma Client Type Inference in Helper Functions

Problem: When creating helper functions for our tRPC router, I initially tried to use import("@prisma/client").PrismaClient as an inline type for the Prisma parameter. This quickly became verbose and fragile, especially with nested types.
Solution: After feedback, we opted for any with an eslint-disable comment for the Prisma parameter in helper functions. This might sound counter-intuitive, but tRPC's context inference ensures type safety at the call sites, making the helper's internal type less critical for overall system safety and significantly reducing boilerplate.
Takeaway: Sometimes, pragmatic workarounds are necessary when type inference is strong at the boundaries, especially if a more robust type utility isn't immediately feasible.

2. The Elusive `/s` Regex Flag in Production

Problem: Our harden-persona-prompts.ts script used the /s regex flag (dotAll) for pattern matching. This worked fine locally, but production builds failed with TS1501: This regular expression flag is only available when targeting 'es2018' or later. Our tsconfig was set to es2017.
Solution: Dropped the /s flag and replaced each . that relied on it with the character class [\s\S], so a pattern like /start.*end/s becomes /start[\s\S]*end/. Because [\s\S] matches any character, including newlines, it reproduces dotAll behavior on older targets.
Takeaway: Always be mindful of your target ES version in tsconfig.json and verify language feature compatibility, especially for newer syntax like regex flags.
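A small, self-contained demonstration of the difference. The persona tag and sample text are made up for illustration and are not taken from harden-persona-prompts.ts.

```typescript
const sample = "<persona>\nYou are a careful reviewer.\n</persona>";

// Without the /s flag, `.` stops at line breaks, so this cannot span the block.
const dotOnly = /<persona>(.*)<\/persona>/;

// [\s\S] matches any character, newlines included, and needs no ES2018 flag.
const portable = /<persona>([\s\S]*)<\/persona>/;

const dotMatches = dotOnly.test(sample);
const portableMatch = portable.exec(sample);
const personaBody = portableMatch ? portableMatch[1] : null;
```

Here dotMatches is false, while personaBody holds the full multi-line content between the tags.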

3. Vitest Mock Hoisting with `child_process.execFile`

Problem: When trying to mock child_process.execFile in our ckb-client.test.ts using vi.fn(), we hit "variable not defined" errors. Vitest hoists vi.mock() calls to the top of the file, so the mock factory ran before the vi.fn() variable it referenced had been initialized.
Solution: We created the mock function inside vi.hoisted(), which Vitest also lifts to the top of the file, so the mock is initialized before the vi.mock() factory runs.
Takeaway: Understanding your test runner's mocking and hoisting mechanisms is crucial. vi.hoisted() is your friend for intricate mocking scenarios in Vitest.

4. `for...of` and `--downlevelIteration`

Problem: Using for...of on Array.entries() in the content loader caused issues because our tsconfig didn't enable --downlevelIteration.
Solution: Switched to the more traditional forEach((item, index) => ...) loop.
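Takeaway: TypeScript needs --downlevelIteration to use for...of over iterators such as the one returned by Array.prototype.entries() when the compile target is older than ES2015; plain arrays iterate fine without it. If the flag is not enabled, stick to forEach or make sure your target is modern enough.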
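For reference, the two forms side by side. The function name numberFindings and its input are illustrative, not taken from the content loader.

```typescript
// Numbers each item using forEach, which hands over the index directly and
// involves no iterator protocol.
function numberFindings(findings: string[]): string[] {
  const lines: string[] = [];
  // Equivalent to: for (const [index, finding] of findings.entries()) { ... }
  findings.forEach((finding, index) => {
    lines.push(`${index + 1}. ${finding}`);
  });
  return lines;
}
```

The forEach version compiles the same way on every target, which is why it sidesteps the flag entirely.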

Where We Stand & What's Next

The good news is that Phase 1 and Phase 2 are fully deployed to production! The project_ckb_indexes table exists, RLS is active, and our 17 new CKB-related tests are all passing.

Immediate next steps:

- Activate CKB Container: The CKB container is defined but needs to be started on production (docker compose -f docker-compose.production.yml up -d ckb).
- Phase 3: Webhook Auto-Reindex: This is the next big push. We'll implement a webhook endpoint (POST /api/v1/webhooks/ckb) for GitHub push events to trigger automatic re-indexing, keeping analyses fresh.
- Webhook UI & Secrets: A UI for generating and setting up webhook secrets will be needed.
- PR Summaries: Leveraging the CKB for automatic PR summaries on pull_request events.
- CKB Image Verification: A quick check to ensure ghcr.io/simplyliz/ckb:latest is robust and working as expected.
- End-to-End Test: The ultimate validation: link a repo, trigger a reindex, and verify the UI updates correctly.
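Phase 3 is not built yet, so the following is only a sketch of the signature check a GitHub webhook endpoint would likely need. GitHub signs the raw request body with the webhook secret and sends the result in the X-Hub-Signature-256 header as sha256= followed by a hex digest. The function name isValidGithubSignature is hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: checks the X-Hub-Signature-256 header against the
// HMAC-SHA256 of the raw (unparsed) request body.
function isValidGithubSignature(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string
): boolean {
  if (!signatureHeader) return false;
  const expected =
    "sha256=" +
    createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const received = Buffer.from(signatureHeader, "utf8");
  const wanted = Buffer.from(expected, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === wanted.length && timingSafeEqual(received, wanted);
}
```

The check has to run on the raw body, before any JSON parsing, because re-serializing the payload can change its bytes and invalidate the signature.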

This integration marks a huge leap forward in our platform's ability to provide deep, actionable insights into codebases. It was a challenging but incredibly rewarding journey, and I'm excited to see the impact it has on our users.
