From Midnight Fixes to Future Architectures: A Full-Stack Sprint
A deep dive into a late-night development session, covering critical bug fixes, comprehensive documentation, and the architectural design of a new site crawler for our Axiom RAG system.
The clock had long passed midnight, but the hum of the server and the glow of the monitor were my only companions. These are the development sessions that truly test your mettle – starting with a handful of critical bugs and culminating in the design and planning of a brand new feature. Last night was one such sprint, a testament to focused problem-solving and meticulous planning.
By the time the early hours rolled around, all critical issues had been resolved and deployed, and a clear roadmap for our new Axiom RAG site crawler was laid out. Here's a look at what went down.
Squashing Bugs: The Midnight Oil Fixes
First, the immediate fires. Two particularly thorny bugs were causing headaches, one a UI crash and the other a production-only deployment issue.
Challenge 1: The Elusive Constellation Crash
Our "Neural Constellation" board, a visual representation of insights, was occasionally crashing when users clicked on points. The dreaded Cannot read properties of undefined (reading 'length') error pointed directly to point.tags.
The Deep Dive:
The `DetailPanel.tsx` component expected `point.tags` to always be an array. However, our SQL query in `memory.ts` was retrieving `workflow_insights.tags` directly from the database. A classic pitfall: when no tags were present, the database column was returning `NULL`, not an empty array. The frontend, blissfully unaware, tried to access `.length` on `null`, leading to a crash.
The Fix: A two-pronged attack was needed for robustness:
- Database Layer: Modified the SQL query to explicitly handle `NULL` values using `COALESCE`: `COALESCE(tags, ARRAY[]::text[])`. This ensures the database always returns an array, even if empty. (Commit `a305cad`)
- Frontend Guard: Added a defensive check in `DetailPanel.tsx` (`point.tags && point.tags.length`) to prevent issues if, by some unforeseen circumstance, `tags` still ended up being null or undefined (see the sketch below).
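To make the frontend guard concrete, here's a minimal sketch. The `InsightPoint` shape and `renderTagSummary` helper are illustrative assumptions, not the actual `DetailPanel.tsx` code:

```typescript
// Illustrative shape; the real DetailPanel props are richer.
type InsightPoint = { id: string; tags?: string[] | null };

// Hypothetical helper showing the defensive pattern from the fix.
function renderTagSummary(point: InsightPoint): string {
  // Guard against null/undefined before touching .length
  if (point.tags && point.tags.length) {
    return point.tags.join(", ");
  }
  return "No tags";
}
```

Combined with the `COALESCE` at the database layer, the null case should never reach the UI anymore, but the guard keeps the component safe either way.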
This small but critical fix ensures our constellation board remains stable and user-friendly.
Challenge 2: PDF Parsing in Docker - A Next.js Conundrum
Next up was a baffling production bug: our PDF worker, crucial for parsing documents, was failing in the Dockerized environment with `Cannot find module '/app/.next/server/chunks/pdf.worker.mjs'`. It worked fine locally, but not in production.
The Deep Dive:
This is a common headache with Next.js standalone builds and complex dependencies like `pdfjs-dist`. Next.js bundles server components, but it has specific rules about how it handles worker files and external packages. In our case, `pdf.mjs` was bundled, but the critical `pdf.worker.mjs` was being left behind, not copied into the final Docker image's server chunks. Module resolution was failing to find the worker file at runtime.
The Fix:
After some digging, the solution involved two key adjustments to how Next.js and Docker handle pdfjs-dist:
- Next.js Configuration: Explicitly told Next.js to treat `pdf-parse` and `pdfjs-dist` as external packages for server components. This prevents Next.js from trying to bundle them in a way that breaks their internal worker file resolution.

```javascript
// next.config.mjs
serverComponentsExternalPackages: ["pdf-parse", "pdfjs-dist"]
```

- Dockerfile Copy: Manually copied the entire `node_modules/pdfjs-dist` directory from the builder stage into the final production image. This ensures all necessary files, including the worker, are present at the expected path.

```dockerfile
# Dockerfile excerpt
COPY --from=builder /app/node_modules/pdfjs-dist ./node_modules/pdfjs-dist
```

This placed the worker file at `/app/node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs`, where `pdfjs-dist` could correctly locate it. (Commit `9c21488`)
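For context, that option isn't top-level: in Next.js 13 and 14, `serverComponentsExternalPackages` lives under the `experimental` key. Here's a fuller sketch of the config, including the standalone output mode the Docker build relies on:

```javascript
// next.config.mjs (sketch)
/** @type {import('next').NextConfig} */
const nextConfig = {
  output: "standalone", // the standalone build consumed by the Docker image
  experimental: {
    // Keep these packages external so their worker-file resolution survives bundling
    serverComponentsExternalPackages: ["pdf-parse", "pdfjs-dist"],
  },
};

export default nextConfig;
```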
With these fixes, our PDF parsing is now robust in production, a critical component for our document intelligence features.
Beyond the Code: Architecting the Future
With the immediate fires out, the focus shifted to a new, exciting feature: a site crawler to feed our Axiom RAG system with fresh, targeted web content.
Comprehensive Documentation: The Unsung Hero
Before diving into new code, I took a significant detour to write PhD-level documentation for our existing neural constellation and code review tools. This wasn't just a quick README – it covered:
- The underlying UMAP mathematics for dimensionality reduction.
- Mermaid charts visualizing the architecture and data flow.
- Detailed explanations of visual encoding choices.
- Full architectural breakdowns for both the Constellation Board and the Code Review Tool.
This document, `docs/neural-constellation-and-code-review.md`, is invaluable for onboarding new team members and ensuring a deep understanding of our complex systems. It's a critical investment in future maintainability and scalability.
Designing the Axiom Site Crawler
The core of the new work was designing the site crawler. This involved brainstorming, user approval, and defining key parameters to ensure it's effective and responsible:
- Scoping: Path-prefix based crawling to stay within defined boundaries (e.g., `https://example.com/blog/`).
- Rate Limiting: A conservative 1 request per second to be polite to target servers.
- Page Limit: Max 200 pages per crawl job to prevent runaway processes.
- Content Extraction: Utilizing `@mozilla/readability` for clean text extraction and CSS selectors for specific elements.
- Progress Tracking: Server-Sent Events (SSE) for real-time progress updates to the UI.
- Output: One document per crawled page, optimized for our RAG system (a crawl-loop sketch follows this list).
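To make these parameters concrete, here's a minimal sketch of the crawl loop, assuming `linkedom` for DOM parsing and `@mozilla/readability` for extraction as listed above. The `crawlSite` function and its return shape are illustrative, not the actual `site-crawler-service.ts`:

```typescript
// Minimal crawl-loop sketch under the design parameters above.
import { Readability } from "@mozilla/readability";
import { parseHTML } from "linkedom";

const RATE_LIMIT_MS = 1_000; // 1 request per second
const MAX_PAGES = 200;       // hard cap per crawl job

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

export async function crawlSite(startUrl: string) {
  const scope = new URL(startUrl).href; // path-prefix boundary
  const queue = [startUrl];
  const seen = new Set(queue);
  const pages: { url: string; title: string; text: string }[] = [];

  while (queue.length > 0 && pages.length < MAX_PAGES) {
    const url = queue.shift()!;
    const res = await fetch(url);
    if (!res.ok) continue;
    const { document } = parseHTML(await res.text());

    // Collect in-scope links first, since Readability mutates the document.
    for (const anchor of document.querySelectorAll("a[href]")) {
      try {
        const next = new URL(anchor.getAttribute("href")!, url).href;
        if (next.startsWith(scope) && !seen.has(next)) {
          seen.add(next);
          queue.push(next);
        }
      } catch {
        // Ignore unparseable hrefs.
      }
    }

    // Clean text extraction: one RAG document per crawled page.
    const article = new Readability(document as unknown as Document).parse();
    if (article?.textContent) {
      pages.push({ url, title: article.title ?? "", text: article.textContent });
    }

    await sleep(RATE_LIMIT_MS); // stay polite to the target server
  }
  return pages;
}
```

One ordering detail worth noting: links are harvested before `Readability#parse()` runs, because Readability modifies the document it's given and can strip the navigation elements the crawler needs.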
The Implementation Plan: 8 Steps to Success
Once the design was approved, I laid out a detailed implementation plan in `docs/plans/2026-03-11-site-crawler-implementation.md`. This breaks down the feature into manageable tasks, ensuring a structured approach:
1. Add `CrawlJob` model to Prisma schema.
2. Install `@mozilla/readability` and `linkedom` (for DOM parsing in Node.js).
3. Implement `site-crawler-service.ts` with core crawling logic and tests.
4. Create an SSE endpoint for crawl progress updates (sketched after this list).
5. Develop tRPC procedures (`startCrawl`, `getCrawlJob`) for API interaction (also sketched below).
6. Build the Crawl UI within the existing `AxiomTab` on the project page.
7. Update the Dockerfile with production dependencies for the new libraries.
8. Write integration tests and deploy.
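Two of these steps lend themselves to quick sketches. First, a hypothetical SSE endpoint for step 4, assuming a Next.js App Router route handler; the route path and the progress source are assumptions:

```typescript
// app/api/crawl/[jobId]/events/route.ts (hypothetical path)
export async function GET(
  _req: Request,
  { params }: { params: { jobId: string } }
) {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    start(controller) {
      // The real endpoint would subscribe to live progress for params.jobId;
      // this sketch emits one illustrative event and closes.
      const event = { jobId: params.jobId, pagesCrawled: 0, status: "running" };
      controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`));
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}
```

And for step 5, a hypothetical shape for the two tRPC procedures. The `router`/`protectedProcedure` helpers, the `ctx.prisma` context, and the `CrawlJob` fields are all assumptions about the project's setup:

```typescript
import { z } from "zod";
import { router, protectedProcedure } from "../trpc"; // assumed project helpers

export const crawlRouter = router({
  // Kick off a crawl job and return its id so the UI can subscribe to SSE updates.
  startCrawl: protectedProcedure
    .input(z.object({ projectId: z.string(), startUrl: z.string().url() }))
    .mutation(async ({ ctx, input }) => {
      const job = await ctx.prisma.crawlJob.create({
        data: {
          projectId: input.projectId,
          startUrl: input.startUrl,
          status: "running", // assumed status field
        },
      });
      return { jobId: job.id };
    }),

  // Poll a job's current state (status, page count, errors).
  getCrawlJob: protectedProcedure
    .input(z.object({ jobId: z.string() }))
    .query(({ ctx, input }) =>
      ctx.prisma.crawlJob.findUnique({ where: { id: input.jobId } })
    ),
});
```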
Both the design doc and the implementation plan are now committed (`079f65c`, `91a9308`), setting the stage for the next phase of development.
Conclusion
This session was a microcosm of full-stack development: troubleshooting frontend bugs, wrestling with Docker and Next.js bundling, meticulously documenting complex systems, and architecting new features from the ground up. All the fixes are now live in production, and the path forward for the site crawler is clear. It's a satisfying feeling to leave a session with both stability restored and innovation planned.
Now, for a bit of rest before Task 1 of the crawler plan begins!