From Crash to Crawl: A Full-Stack Dev Session's Journey Through Bugs, Docs, and New Features
Join me on a deep dive into a recent development session where we tackled critical production bugs, meticulously documented complex systems, and laid the groundwork for a powerful new web crawler for our Axiom RAG system.
It was late, or perhaps very early, when the last commit for the day—or rather, the session—landed. The clock read around 1:30 AM on March 11th, 2026. This wasn't just a bug-fixing sprint or a feature push; it was one of those all-encompassing sessions where you touch every part of the stack, from database queries to Docker configurations, from detailed documentation to high-level system design.
The mission: stabilize our constellation visualization, fix a sneaky PDF parsing bug in production, write some much-needed PhD-level documentation, and design the first iteration of a site crawler for our Axiom RAG system. By the time the session wrapped, all bugs were squashed, docs were comprehensive, and the crawler design was approved with an 8-task implementation plan ready to go.
Here's a look at the journey, the problems encountered, and the lessons learned.
The Bug Hunt: Squashing Critters in Production
Nothing gets the adrenaline flowing like a production crash. We had two critical ones to tackle.
1. The Constellation Detail Panel Crash: The Null Tags Menace
The Problem: Users were reporting crashes when clicking on points within our constellation visualization. This visualization helps us understand insights from workflows and code reviews, and a crash here was a major blocker.
The Diagnosis: Digging into the logs, the error pointed to point.tags.length in DetailPanel.tsx:123. The point.tags property was unexpectedly null. Our workflow_insights.tags column in the database, while conceptually an array of text, didn't have a DEFAULT '[]' constraint. This meant new entries could (and did) have NULL for tags, leading to a JavaScript error when trying to access .length on null.
The Fix & The Lesson:
This was a classic case of defensive programming missing a spot. We needed to ensure that tags was always an array, both at the database level and in our UI component.
First, the SQL query in memory.ts was updated:
```sql
SELECT
  -- ... other columns ...
  COALESCE(tags, ARRAY[]::text[]) AS tags
FROM workflow_insights;
```
COALESCE is a life-saver here, substituting NULL values with an empty text array (ARRAY[]::text[]).
Second, in the DetailPanel.tsx component, we added a null guard for extra safety:
```tsx
// DetailPanel.tsx:123 (conceptual)
{point.tags && point.tags.length > 0 && (
  <div>
    {/* Render tags */}
  </div>
)}
```
(Commit a305cad)
Lesson Learned: Always consider NULL values when querying databases, even for array-like columns. Defensive programming in the UI layer is also crucial, but ideally, your data model prevents such states. COALESCE is your friend.
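Taking that lesson to its conclusion means making the invalid state unrepresentable in the schema itself. A sketch of what such a migration could look like (the table and column names come from the post; the exact statements are illustrative, assuming a PostgreSQL text[] column):

```sql
-- Backfill existing NULLs, then forbid them going forward.
UPDATE workflow_insights SET tags = ARRAY[]::text[] WHERE tags IS NULL;

ALTER TABLE workflow_insights
  ALTER COLUMN tags SET DEFAULT ARRAY[]::text[],
  ALTER COLUMN tags SET NOT NULL;
```

With the column constrained like this, the COALESCE and the UI guard become belt-and-suspenders rather than the only line of defense.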
2. PDF Parsing in Docker: The Next.js Bundling Beast
The Problem: Our PDF parsing worker, critical for extracting content from documents, was failing in our production Docker environment with a cryptic error: Cannot find module '/app/.next/server/chunks/pdf.worker.mjs'. It worked fine locally, which is always a red flag for bundling or environment issues.
The Diagnosis: We use Next.js with a standalone Docker build. Next.js is smart about bundling, but sometimes too smart. While it bundled pdf.mjs (the main PDF.js library) into the server chunks, it completely ignored pdf.worker.mjs, which is dynamically loaded by the main library. The path /app/.next/server/chunks/pdf.worker.mjs was a hint that Next.js thought it should be there, but it wasn't copying it.
The Fix & The Lesson:
This required a two-pronged approach, modifying both next.config.mjs and our Dockerfile.
First, we told Next.js to treat pdf-parse and pdfjs-dist as external packages that shouldn't be bundled directly into server components. This forces Next.js to respect their original node_modules structure more closely.
```js
// next.config.mjs
const nextConfig = {
  // ... other config ...
  experimental: {
    serverComponentsExternalPackages: ["pdf-parse", "pdfjs-dist"],
  },
};

export default nextConfig;
```
Second, and crucially, we explicitly copied the entire pdfjs-dist package from the build stage into the final Docker image. This ensures the pdf.worker.mjs file, along with its necessary assets, is present at the expected path relative to node_modules.
```dockerfile
# Dockerfile
# ... existing build stage ...
FROM base AS runner
# ... other commands ...
# Copy external packages that Next.js might miss or misplace
COPY --from=builder /app/node_modules/pdfjs-dist ./node_modules/pdfjs-dist
# ... remaining commands ...
```
After these changes, the worker file was correctly located at /app/node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs, and PDF parsing started humming along.
(Commit 9c21488)
Lesson Learned: When dealing with complex third-party libraries that rely on dynamic imports (especially worker scripts) in a Next.js standalone Docker build, be prepared for bundling quirks. serverComponentsExternalPackages can help, but sometimes an explicit COPY in your Dockerfile is needed to ensure every required asset lands in the final image. Always verify file presence inside your Docker container!
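One cheap way to act on that "verify file presence" lesson is a fail-fast check when the server boots, so a missing worker asset surfaces immediately at container start rather than on the first PDF request. A minimal sketch (the helper name and the checked path are illustrative, not from the actual codebase):

```typescript
import { existsSync } from "node:fs";

// Throw at startup if a required runtime asset is missing, instead of
// failing later with a cryptic "Cannot find module" error mid-request.
export function assertAssetPresent(path: string): void {
  if (!existsSync(path)) {
    throw new Error(`Required asset missing from image: ${path}`);
  }
}

// e.g. at server startup:
// assertAssetPresent("/app/node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs");
```

A check like this turns a runtime 500 into a failed deploy, which is usually where you want this class of bug to show up.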
Building Bridges with Documentation (and Diagrams!)
With the critical bugs out of the way, it was time to address a different kind of technical debt: documentation. Our "Neural Constellation" and "Code Review Tool" are complex systems involving UMAP, vector embeddings, and intricate visual encodings. Ad-hoc explanations weren't cutting it anymore.
The goal was PhD-level documentation, and that's what was delivered at docs/neural-constellation-and-code-review.md. This wasn't just text; it included:
- UMAP math explanations: Demystifying the dimensionality reduction.
- Mermaid charts: Visualizing the data flow and architectural components.
- Visual encoding details: Explaining how data maps to visual elements.
- Full architecture diagrams: Providing a comprehensive overview of both tools.
This kind of documentation is invaluable for onboarding new team members, debugging complex issues, and ensuring long-term maintainability. It's an investment that always pays off.
(Commit 079f65c)
Charting New Territory: Designing the Axiom RAG Site Crawler
The final major piece of the session was looking forward: designing a new site crawler for our Axiom RAG (Retrieval Augmented Generation) system. The goal is to ingest relevant external web content for our AI agents.
The design process involved brainstorming and user approval, leading to a clear set of requirements and constraints:
- Path-prefix Scoping: Crawl only specific sections of a website (e.g., `/docs/` or `/blog/`).
- Rate Limiting: A polite 1 request/second to avoid overwhelming target servers.
- Page Limits: Max 200 pages per crawl to prevent runaway processes.
- Content Extraction: Utilize `@mozilla/readability` for clean article content, augmented with CSS selectors for specific data points.
- Progress Feedback: Server-Sent Events (SSE) to provide real-time updates to the UI.
- Data Model: One document per page, stored efficiently for RAG.
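The scoping rule above is easy to get wrong (subdomains, absolute vs. relative links), so it is worth isolating in a pure function. A hedged sketch of what a path-prefix check might look like (the function and parameter names are hypothetical, not from the approved design):

```typescript
// Decide whether a discovered link falls inside the crawl scope:
// same origin as the seed URL, and pathname under the configured prefix.
export function inScope(link: string, seed: string, prefix: string): boolean {
  try {
    const url = new URL(link, seed);   // also resolves relative links
    const seedUrl = new URL(seed);
    return url.origin === seedUrl.origin && url.pathname.startsWith(prefix);
  } catch {
    return false;                      // unparsable URL: never in scope
  }
}
```

Combined with a visited-set, a 200-page counter, and a one-second delay between fetches, a predicate like this covers the first three constraints on the list.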
This design was then translated into a detailed 8-task implementation plan, laid out in docs/plans/2026-03-11-site-crawler-implementation.md. This plan covers everything from Prisma schema changes and new dependencies (@mozilla/readability, linkedom) to service implementation, SSE endpoints, tRPC procedures, and UI integration.
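As an illustration of what Task 1 might involve, a CrawlJob model could look roughly like this (every field here is an assumption for illustration, not the actual schema from the plan):

```prisma
model CrawlJob {
  id           String   @id @default(cuid())
  seedUrl      String   // starting URL for the crawl
  pathPrefix   String   // e.g. "/docs/"
  status       String   @default("pending") // pending | running | done | failed
  pagesCrawled Int      @default(0)
  createdAt    DateTime @default(now())
}
```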
(Commit 91a9308)
The Path Forward: Immediate Next Steps
With the design approved and the plan written, the next phase is execution. The immediate tasks lined up are:
- [ ] Execute Task 1: Add `CrawlJob` model to Prisma schema.
- [ ] Execute Task 2: Install `@mozilla/readability` + `linkedom` dependencies.
- [ ] Execute Task 3: Implement `site-crawler-service.ts` and associated tests.
- [ ] Execute Task 4: Create an SSE endpoint for crawl progress updates.
- [ ] Execute Task 5: Implement `startCrawl` and `getCrawlJob` tRPC procedures.
- [ ] Execute Task 6: Build the Crawl UI within the existing `AxiomTab` component.
- [ ] Execute Task 7: Update `Dockerfile` for production dependencies.
- [ ] Execute Task 8: Write integration tests and deploy.
All changes, including the fixes and the documentation, are already deployed to production, ensuring stability as we move forward with this exciting new feature.
Conclusion
This session was a microcosm of full-stack development: tackling urgent production issues, investing in robust documentation for complex systems, and meticulously designing new features. Each challenge offered a learning opportunity, reinforcing best practices in database interaction, Docker deployment, and thoughtful system design.
The satisfaction of seeing critical bugs resolved, complex systems clearly documented, and a detailed plan for a powerful new feature in hand is immense. Now, it's time to build!
```json
{
  "thingsDone": [
    "Fixed constellation DetailPanel crash (null tags from DB)",
    "Fixed PDF worker module error in production Docker (Next.js bundling)",
    "Wrote comprehensive PhD-level documentation for Neural Constellation and Code Review Tool (UMAP, Mermaid, architecture)",
    "Designed site crawler for Axiom RAG (scoping, rate limits, content extraction, SSE)",
    "Wrote detailed 8-task implementation plan for site crawler",
    "All fixes and docs deployed to production"
  ],
  "pains": [
    "Constellation crash due to NULL `tags` array from PostgreSQL, causing `point.tags.length` to fail.",
    "PDF parsing failure in Next.js Docker due to `pdf.worker.mjs` not being bundled/copied correctly by Next.js standalone output."
  ],
  "successes": [
    "Implemented `COALESCE` in SQL and null guard in TSX for constellation tags.",
    "Used `serverComponentsExternalPackages` in `next.config.mjs` and explicit `COPY` in Dockerfile for `pdfjs-dist` to fix PDF worker.",
    "Created high-quality, detailed documentation with diagrams for complex systems.",
    "Developed a clear, user-approved design and implementation plan for a new site crawler.",
    "Successfully deployed all fixes and documentation to production."
  ],
  "techStack": [
    "Next.js",
    "Docker",
    "TypeScript",
    "PostgreSQL",
    "Prisma",
    "tRPC",
    "React",
    "SQL",
    "UMAP",
    "Mermaid.js",
    "pdf-parse",
    "pdfjs-dist",
    "@mozilla/readability",
    "linkedom",
    "Server-Sent Events (SSE)",
    "RAG (Retrieval Augmented Generation)"
  ]
}
```