Unlocking Code Intelligence: Our Journey Integrating CKB from Scratch to Production

Today marks a significant milestone: CKB, our new Code Intelligence Engine, is fully operational in production! This wasn't just a simple library integration; it was an end-to-end journey, from spinning up Docker containers to building a dynamic UI, all while navigating a maze of technical challenges.

Our goal was ambitious: integrate CKB to provide deep code insights – architecture, hotspots, security audits, and dead code detection – directly within our platform. This meant not just running the analysis, but making it accessible, understandable, and actionable for our users. We tackled this in two main phases, both now proudly deployed.

Phase 1: The Core Integration – Laying the Foundation

The first phase was all about connecting the dots, establishing the backbone for CKB's operations. This involved a series of meticulous steps to ensure CKB could live and breathe within our existing infrastructure.

Dockerization: We containerized CKB, building ghcr.io/simplyliz/ckb:latest from its source, ensuring it was a self-contained, deployable unit on our production server.
Database Schema: A ProjectCkbIndex model was added to our Prisma schema, complete with Row-Level Security (RLS) and back-relations, providing a robust storage layer for CKB's output.
Service Layer: Our ckb-client.ts service became the central hub, a docker exec wrapper exposing 13 distinct analysis functions to interact with the CKB container.
API Exposure: Thirteen tRPC procedures in ckb.ts provided the API endpoints, orchestrating asynchronous, fire-and-forget indexing processes.
Workflow Integration: We wired the {{ckb}} template into our workflow engine via ckb-content-loader.ts, allowing CKB insights to be dynamically injected into reports and summaries.
Automated Indexing: A critical UX improvement: the logic in projects.ts now kicks off CKB analysis automatically whenever a repository link is added or updated.
Testing: With 17 passing tests (10 client, 7 content loader), we had a solid foundation of confidence in our backend.
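
The fire-and-forget indexing mentioned under API Exposure can be sketched roughly as follows. This is a minimal illustration, not the code in ckb.ts: startInBackground and its parameters are hypothetical names, and the job passed in stands in for whatever actually runs the CKB analysis.

```typescript
// Hypothetical type: any async unit of work, e.g. "index this repository".
type IndexJob = () => Promise<void>;

// Kick off a job and return right away instead of awaiting it.
function startInBackground(
  job: IndexJob,
  onError: (error: unknown) => void,
): { started: true } {
  // `void` marks the promise as intentionally not awaited; the catch handler
  // keeps a failed job from surfacing as an unhandled promise rejection.
  void job().catch(onError);
  return { started: true };
}
```

With this shape the caller gets its response immediately, so any failure has to be reported through onError (for example by recording it where the UI can read it) rather than through the return value.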

Phase 2: Bringing Code Intelligence to Life – The UI

With the backend humming, Phase 2 focused on delivering these powerful insights to our users through an intuitive interface.

Dedicated Code Intelligence Tab: code-intelligence-tab.tsx became the home for all CKB data, featuring engaging overview cards and detailed sections for deep dives.
Sidebar Integration: A new "Code Intel" tab was added to the Development group in the project sidebar, making it easily discoverable.
Dynamic Progress Bar: To keep users informed during analysis, we implemented a 5-step progress bar (Clone → Architecture → Hotspots → Audit → Dead Code), providing real-time feedback.
Real-time Polling: A polling mechanism driven by useState and useEffect refreshes data every 2 seconds during processing, so users always see the latest analysis status.
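
As a rough illustration of the 5-step progress bar described above, the step order can drive both the highlighted step and a percentage. The step labels come from this post; progressFor and its return shape are assumptions made for the sketch, not the component's actual code.

```typescript
// Step labels as listed in the post, in execution order.
const STEPS = ["Clone", "Architecture", "Hotspots", "Audit", "Dead Code"] as const;
type Step = (typeof STEPS)[number];

// Hypothetical helper: how many steps are finished while `current` is running.
function progressFor(current: Step): { completed: number; percent: number } {
  const completed = STEPS.indexOf(current);
  return {
    completed,
    percent: Math.round((completed / STEPS.length) * 100),
  };
}
```

For example, while the Hotspots step is running, two of the five steps are complete, so progressFor("Hotspots") reports 40 percent.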

Navigating the Treacherous Waters: Lessons Learned from the Trenches

No integration of this scale comes without its share of head-scratching moments. Our "Pain Log" quickly transformed into a "Lessons Learned" diary, highlighting critical insights gained during testing and deployment.

1. The Elusive CLI Command & Workdir

The Challenge: Initially, we tried docker exec nyxcore-ckb-1 ckb architecture --repo /data/repos/xxx --format json. This failed with "unknown command 'architecture'" and an unrecognized --repo flag.
The Lesson: External tools often have subtle CLI syntax differences. Always check the exact command names (architecture vs. arch) and how the tool expects to receive the repository path (a --repo flag vs. the working directory, which docker exec sets with -w).
The Fix: We switched to docker exec -w /data/repos/xxx nyxcore-ckb-1 ckb arch --format json, setting the working directory directly in the docker exec command.
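
A wrapper around that command might look like the sketch below. It is not the actual ckb-client.ts: buildDockerExecArgs and runCkbJson are hypothetical names, and the 64 MB output buffer is an arbitrary choice. It uses Node's execFile, which passes arguments as an array and so avoids shell quoting problems with repository paths.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Build the argv for `docker exec`. The repository is selected with -w
// (working directory inside the container), not with a --repo flag.
function buildDockerExecArgs(
  container: string,
  repoDir: string,
  ckbArgs: string[],
): string[] {
  return ["exec", "-w", repoDir, container, "ckb", ...ckbArgs];
}

// Hypothetical helper: run one CKB subcommand and parse its JSON output.
async function runCkbJson(
  container: string,
  repoDir: string,
  subcommand: string,
): Promise<unknown> {
  const args = buildDockerExecArgs(container, repoDir, [subcommand, "--format", "json"]);
  // Raise the default output limit because analysis results can be large.
  const { stdout } = await execFileAsync("docker", args, {
    maxBuffer: 64 * 1024 * 1024,
  });
  return JSON.parse(stdout);
}
```

Calling runCkbJson("nyxcore-ckb-1", "/data/repos/xxx", "arch") would then issue the same command shown in the fix. The result is typed as unknown on purpose, so callers have to validate it before use.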

2. Docker-in-Docker (ish) Permissions

The Challenge: Our main application container needed to run docker exec against the CKB container. This led to "docker: executable not found in $PATH" and then "permission denied" errors on /var/run/docker.sock.
The Lesson: When your application container needs to talk to the Docker daemon, it requires both the docker-cli installed within the app container and permission to access the Docker socket. Simply mounting the socket isn't enough; the user running the app inside the container must also belong to a group whose GID matches the group that owns the socket on the host (usually the docker group).
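The Fix: We added docker-cli to our app's Dockerfile, mounted /var/run/docker.sock:ro, and, crucially, added group_add: "988" (the Docker GID on our host; it differs between machines) to the app container's configuration. Note that :ro only makes the socket file read-only on the mount; the app can still send any command to the Docker daemon through it.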

3. Robust GitHub Token Resolution

The Challenge: CKB needed to clone private GitHub repositories. Our initial resolveGitHubToken() function only queried the github_tokens table (for individual user tokens), failing for tenant-level API keys stored in api_keys.
The Lesson: Authentication logic needs to be comprehensive, accounting for all possible token sources and implementing intelligent fallbacks. Don't assume one token type fits all scenarios.
The Fix: We implemented a fallback mechanism: first, try github_tokens, and if that fails, query api_keys specifically for provider: "github".
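
The fallback order can be sketched as below. The two lookup functions stand in for the queries against github_tokens and api_keys; their names and signatures are assumptions for illustration, as is the helper name resolveTokenWithFallback.

```typescript
// Stand-ins for the two storage lookups. The real queries are not shown in
// this post, so these signatures are assumptions.
interface TokenSources {
  // Would query github_tokens for a per-user token.
  findUserToken(userId: string): Promise<string | null>;
  // Would query api_keys for a tenant-level key of the given provider.
  findTenantApiKey(tenantId: string, provider: string): Promise<string | null>;
}

async function resolveTokenWithFallback(
  sources: TokenSources,
  userId: string,
  tenantId: string,
): Promise<string | null> {
  // Prefer the individual user's token when one exists.
  const userToken = await sources.findUserToken(userId);
  if (userToken) return userToken;
  // Otherwise fall back to the tenant-level key for provider "github".
  return sources.findTenantApiKey(tenantId, "github");
}
```

Returning null when both sources come up empty lets the caller decide whether to fail or to attempt an unauthenticated clone of a public repository.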

4. Defensive Data Handling for External APIs

The Challenge: CKB's JSON output sometimes varied from our initial assumptions. For instance, hotspots were wrapped in an object { hotspots: [...] } instead of being a raw array, and field names like file, risk, severity were actually filePath, score, riskLevel. Also, numerical values sometimes came back as strings or null.
The Lesson: Always treat data from external APIs with suspicion. Use type guards (Array.isArray()), parse into Record<string, unknown> first, and apply nullish coalescing or explicit conversions with a fallback (Number(h.score) || 0, which maps null and non-numeric strings to 0) to ensure data integrity.
The Fix: We refactored our data parsing to defensively check for array types, correctly map field names, and cast values to their expected types, providing resilience against minor API output variations.
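
A defensive parser along these lines might look like the following sketch. The field names filePath, score, and riskLevel come from the CKB output described above; the Hotspot interface, the helper names, and the "unknown" default are assumptions for illustration.

```typescript
interface Hotspot {
  filePath: string;
  score: number;
  riskLevel: string;
}

// Narrow an unknown value to a plain object we can read keys from.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseHotspots(raw: unknown): Hotspot[] {
  // Accept either a bare array or the { hotspots: [...] } wrapper.
  const list: unknown[] = Array.isArray(raw)
    ? raw
    : isRecord(raw) && Array.isArray(raw.hotspots)
      ? raw.hotspots
      : [];

  // Drop entries that are not objects, then coerce each field.
  return list.filter(isRecord).map((h) => ({
    filePath: typeof h.filePath === "string" ? h.filePath : "",
    // Number() handles numeric strings; null, undefined and NaN become 0.
    score: Number(h.score) || 0,
    riskLevel: typeof h.riskLevel === "string" ? h.riskLevel : "unknown",
  }));
}
```

One trade-off to be aware of: coercing bad scores to 0 keeps the UI rendering, but it also hides malformed data, so logging the rejected or defaulted entries may be worthwhile.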

5. Conditional Polling in React Query

The Challenge: We attempted to use refetchInterval: (query) => { ... } for conditional polling with react-query, but encountered an "(intermediate value) is not iterable" error, suggesting a version incompatibility or misuse.
The Lesson: While powerful, advanced features of libraries can sometimes be tricky or have unexpected behaviors across versions. Sometimes, a simpler, more direct approach using React's core hooks is more reliable.
The Fix: We fell back to a plain boolean: a polling flag held in useState and toggled from a useEffect, passed to the query as refetchInterval: polling ? 2000 : false. This achieved the desired conditional polling without fighting the library.
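
The decision logic behind that fix can be isolated into two small pure functions, sketched below. The status values are hypothetical (the real index model may use different ones), and shouldPoll and refetchIntervalFor are illustrative names rather than code from code-intelligence-tab.tsx.

```typescript
// Hypothetical status values; the real index model may use different ones.
type IndexStatus = "pending" | "processing" | "done" | "failed";

const POLL_INTERVAL_MS = 2000;

// Poll only while the analysis is still in flight.
function shouldPoll(status: IndexStatus | undefined): boolean {
  return status === "pending" || status === "processing";
}

// Value to hand to the query's refetchInterval option: a number of
// milliseconds to keep polling, or false to stop.
function refetchIntervalFor(polling: boolean): number | false {
  return polling ? POLL_INTERVAL_MS : false;
}

// In the component (sketch only, not executed here):
//   const [polling, setPolling] = useState(false);
//   const query = useQuery({ ...options, refetchInterval: refetchIntervalFor(polling) });
//   useEffect(() => {
//     setPolling(shouldPoll(query.data?.status));
//   }, [query.data?.status]);
```

Keeping the decision in plain functions makes the polling rule easy to unit test without rendering a component.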

Current State & The Road Ahead

As of today, our production environment (root@46.225.232.35, commit 8af3329) is running strong. The CKB container (nyxcore-ckb-1) is live with CKB v8.1.0, built fresh on the server. The project_ckb_indexes table is populated, and even the background ckb-docs agent is hard at work generating technical and executive summaries.

Our immediate next steps include:

Waiting for the documentation agent to finish its work.
Committing and pushing the generated docs.
Phase 3: Implementing a webhook endpoint for automated re-indexing on GitHub pushes.
Updating our content-loader tests to reflect the actual CKB field names.
Extracting valuable attack scenarios from a specific workflow.
Exploring SCIP index support for even richer symbol analysis, addressing current "INDEX_MISSING" warnings.

This journey has been a testament to iterative development, problem-solving, and the power of a dedicated team. We're excited to see how CKB empowers our users with deeper insights into their codebases!