Taming the Crawler, Crafting Mobile: A Day of Debugging & Design Overhaul
A deep dive into a recent development session, covering a critical HTML stripping bug fix, a full mobile-first CSS overhaul, and valuable lessons learned from production challenges.
Every once in a while, a development session comes along that feels like a mini-saga. You start with a clear goal, encounter unexpected twists, learn crucial lessons, and emerge with significant improvements to the codebase and user experience. This past Wednesday was one of those days. We tackled two major initiatives: squashing a stubborn HTML stripping bug in our site crawler and giving our entire dashboard a much-needed mobile-first CSS overhaul.
By the end of the day, both were not just complete, but deployed to production. All 318 tests green, typechecks pristine. Let's unpack the journey.
The Crawler Conundrum: From HTML Soup to Clean Text
Our application leverages a RAG (Retrieval-Augmented Generation) system, which means the quality of the documents we feed it is paramount. Recently, we noticed an insidious bug: our site crawler was occasionally ingesting documents with raw HTML tags embedded directly in the text. This led to noisy, unhelpful context for our AI models.
The culprit was twofold:
- The Fallback Fiasco (document-processor.ts): Our processDocument() function was designed to fetch content from a sourceUrl as a fallback. However, it wasn't reliably checking if a local file already existed before attempting this re-fetch. This meant that even if we had a clean local copy, it might be overwritten by a less-than-ideal remote fetch.
  - The Fix: We introduced a robust fs.access() check. Now, it first verifies the presence of a local file. Only if that fails does it proceed to re-fetch from the sourceUrl, ensuring we prioritize stable, local content.
- The MimeType Mix-up (site-crawler-service.ts): A subtle but critical error was found in our crawler's mimeType handling. It was incorrectly defaulting to text/html when it should have been text/plain for the parsed document content. This allowed HTML tags to slip through the cracks.
  - The Fix: A straightforward change from text/html to text/plain in site-crawler-service.ts ensures our parser correctly interprets and cleans the content before storage.
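To make the local-first fallback concrete, here is a minimal sketch of the idea. It is not the actual processDocument() code, which isn't shown here: the DocumentSource shape, the loadDocumentText name, and the injected fetchRemote callback are all assumptions made for illustration.

```typescript
import { constants } from "node:fs";
import { access, readFile } from "node:fs/promises";

// Hypothetical shape; the real document record is not shown in this post.
interface DocumentSource {
  localPath: string;
  sourceUrl?: string;
}

// Hypothetical: whatever performs the remote fetch is injected by the caller.
type Fetcher = (url: string) => Promise<string>;

// fs.access() rejects when the file is missing or unreadable,
// so a rejection is mapped to `false` here.
async function fileIsReadable(path: string): Promise<boolean> {
  try {
    await access(path, constants.R_OK);
    return true;
  } catch {
    return false;
  }
}

async function loadDocumentText(
  doc: DocumentSource,
  fetchRemote: Fetcher,
): Promise<string> {
  // Prefer the local copy whenever it is readable.
  if (await fileIsReadable(doc.localPath)) {
    return readFile(doc.localPath, "utf8");
  }
  // Only fall back to the network when there is no local file.
  if (!doc.sourceUrl) {
    throw new Error(`No local file and no sourceUrl for ${doc.localPath}`);
  }
  return fetchRemote(doc.sourceUrl);
}
```

Passing the fetcher in keeps the fallback order easy to test without touching the network. One caveat of any check-then-read approach: the file can still disappear between access() and readFile(), so the caller should be ready for readFile() to reject.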
Data Cleanup and Deployment
With the bugs identified and patched, the next step was crucial: cleaning up the tainted data. We identified 153 corrupted BetrVG documents in production that needed to be removed, which meant a brief but intense moment of manual SQL execution. We also needed a new crawl_jobs table to better manage our crawling processes, and it had to be created manually too, thanks to a separate lesson learned (more on that later!).
The fixes were committed (0957b1b) and swiftly deployed. Our RAG system can now breathe a sigh of relief, knowing it's feeding on clean, unadulterated text.
Embracing Mobile-First: A Dashboard Transformation
With the backend stability restored, it was time to shift gears to the frontend. Our dashboard, while functional on desktop, was showing its age on smaller screens. The goal: a complete mobile-first CSS overhaul.
This wasn't just a quick patch; it was a systematic redesign, starting with a dedicated design document (docs/plans/2026-03-11-mobile-first-css-design.md) and a detailed implementation plan (docs/plans/2026-03-11-mobile-first-css.md) outlining six key tasks. We executed these tasks using what we affectionately call "subagent-driven development" – a highly focused, iterative process where each change builds logically on the last.
Here's a glimpse at the transformative changes:
- Viewport Meta Refinements (4d43396): We fine-tuned the viewport meta tag, removing maximumScale:1 for better zoom flexibility and adding viewportFit:cover for edge-to-edge content on modern devices. Crucially, we introduced safe-area padding for mobile navigation and a pb-20 on the main content area to prevent content from being obscured by fixed footers/navs.
- Dynamic Sidebar Layout (7aee355): The traditional desktop sidebar transformed. On mobile, it's now a horizontally scrollable tab bar, freeing up vertical screen real estate. The SidebarPageLayout now fluidly switches between flex-col on mobile and md:flex-row on desktop.
- Responsive Dialogs (1fc31a3): Modal dialogs are now truly responsive, adapting their width and padding (w-[calc(100%-2rem)] md:w-full, p-4 md:p-6) and ensuring content is scrollable (max-h-[85vh] overflow-y-auto) on smaller screens.
- Adaptive Data Tables (97861ee): Large data tables are notorious for breaking mobile layouts. We tackled this by selectively hiding less critical columns (Cost, Calls, Duration, Energy) with hidden md:table-cell and introducing a tap-to-expand functionality for rows, revealing full details with a ChevronDown icon.
- Flexible Layout Elements (07817b5): Across several detail pages (like Ipcha and Projects), we adjusted layout elements to be flex-col gap-2 sm:flex-row, ensuring they stack vertically on mobile and horizontally on wider small screens.
- Mobile Navigation Enhancements (19ea037): Small but impactful tweaks, like replacing "Style" with "Memory" (represented by a Brain icon) for better clarity, and a critical fix for our sidebar sheet component to ensure it correctly overrides hidden styles on mobile with className="flex w-full".
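For the viewport change in particular, a small sketch may help show what ends up in the meta tag. The camelCase keys suggest a Next.js-style viewport object, but that is an assumption, as are the width and initialScale values. The local ViewportSketch type and the toMetaContent helper are hypothetical stand-ins so the example runs on its own.

```typescript
// Local stand-in for a framework-provided viewport type (assumption:
// something like the `Viewport` type from "next" in an App Router layout).
interface ViewportSketch {
  width: string;
  initialScale: number;
  maximumScale?: number;
  viewportFit?: "auto" | "contain" | "cover";
}

// Assumed values for width and initialScale; only the two changes
// described above (no maximumScale, viewportFit "cover") come from the post.
const dashboardViewport: ViewportSketch = {
  width: "device-width",
  initialScale: 1,
  // maximumScale: 1 was removed so users can pinch-zoom again.
  viewportFit: "cover",
};

// Hypothetical helper: renders the object as the `content` attribute
// of <meta name="viewport">, skipping keys that are not set.
function toMetaContent(v: ViewportSketch): string {
  const parts = [`width=${v.width}`, `initial-scale=${v.initialScale}`];
  if (v.maximumScale !== undefined) parts.push(`maximum-scale=${v.maximumScale}`);
  if (v.viewportFit !== undefined) parts.push(`viewport-fit=${v.viewportFit}`);
  return parts.join(", ");
}

console.log(toMetaContent(dashboardViewport));
// "width=device-width, initial-scale=1, viewport-fit=cover"
```

With viewport-fit=cover the page can extend under notches and home indicators, which is why it is paired with safe-area padding (the CSS env(safe-area-inset-bottom) family of values) on the fixed mobile navigation.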
Lessons Learned: Navigating the Production Minefield
No significant development session is complete without a few bumps in the road. These "pain points" are often the most valuable learning opportunities.
1. SSH Heredocs vs. Docker Exec: A Multi-Statement SQL Snafu
The Challenge: I attempted to pipe a multi-statement SQL script using an SSH heredoc directly into docker exec psql to manage our production database.
The Failure: The heredoc didn't reliably pass through the SSH and Docker exec chain, leading to syntax errors and incomplete commands.
The Workaround: For quick, multi-statement changes, the most reliable approach proved to be a single-line docker exec psql -c "SQL STATEMENT; ANOTHER SQL STATEMENT;" with carefully escaped quotes. For larger scripts, a temporary file copy would be safer.
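The fiddly part of that one-liner is the quoting: the command is parsed by the local shell, then again by the remote shell that SSH starts, and the SQL still has to reach psql as a single -c argument. Below is a minimal sketch of a helper that builds such a command string. The PsqlTarget fields and every host, container, user, and database name are hypothetical, and the helper only builds the string; it doesn't execute anything.

```typescript
// POSIX single-quote escaping: close the quote, emit an escaped quote, reopen.
function shellQuote(arg: string): string {
  return `'${arg.replace(/'/g, `'\\''`)}'`;
}

// All fields are hypothetical placeholders for this sketch.
interface PsqlTarget {
  sshHost: string;
  container: string;
  dbUser: string;
  dbName: string;
}

function buildRemotePsqlCommand(target: PsqlTarget, statements: string[]): string {
  // Join statements into one string for a single `psql -c` argument.
  const sql =
    statements.map((s) => s.trim().replace(/;+$/, "")).join("; ") + ";";

  // Quote each argument once for the remote shell...
  const remoteCommand = [
    "docker", "exec", target.container,
    "psql", "-U", target.dbUser, "-d", target.dbName,
    "-v", "ON_ERROR_STOP=1", // stop at the first failing statement
    "-c", sql,
  ].map(shellQuote).join(" ");

  // ...then quote the whole remote command again for the local shell.
  return `ssh ${shellQuote(target.sshHost)} ${shellQuote(remoteCommand)}`;
}

const command = buildRemotePsqlCommand(
  { sshHost: "deploy@db.example.invalid", container: "app-db", dbUser: "app", dbName: "app" },
  ["SELECT 1", "SELECT 2;"],
);
console.log(command);
```

According to the psql documentation, multiple statements passed in one -c string are sent as a single request and run as one transaction unless the string contains explicit transaction commands, which is a useful property for cleanup work like this.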
2. prisma db push on Production: A Dangerous Shortcut
The Challenge: After manually creating the crawl_jobs table, I considered using npx prisma@5.22.0 db push inside the production container to align the schema.
The Failure (and Critical Lesson): NEVER use prisma db push on a production database with existing data, especially if you're using extensions like pgvector! db push is designed for rapid prototyping: it forces the database to match the Prisma schema without any migration history, and it can drop tables or columns to get there, losing data along the way. That risk extends to columns with custom types that Prisma does not model natively, such as the embedding column that pgvector relies on.
The Workaround: Manual CREATE TABLE and ALTER TABLE statements executed via docker exec psql are the safest way to apply schema changes to a production database outside of a robust migration system (which we'll be improving for future changes).
3. CSS Specificity and Mobile Sheet Visibility
The Challenge: When integrating the responsive sidebar component into a mobile Sheet (a slide-out panel), the sidebar became invisible.
The Failure: The sidebar component itself had a hidden md:flex Tailwind CSS class, making it invisible below the md: breakpoint. The Sheet component, by default, wraps its children without overriding their display properties.
The Workaround: We added a className prop to the Sidebar component, allowing the Sheet to explicitly pass className="flex w-full" to override the hidden property when the sidebar is rendered within the mobile sheet. This is a classic example of CSS specificity and component composition gotchas.
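One detail worth spelling out: hidden and flex are both single-class utilities with equal specificity, so simply appending flex to the class list does not by itself guarantee that it wins; stylesheet order decides. Class-merging helpers such as tailwind-merge sidestep this by dropping the earlier conflicting class. Whether our Sidebar uses such a helper is not shown here, so the sketch below is a deliberately tiny, hypothetical re-implementation of that idea for display utilities only.

```typescript
// A small hand-picked subset of display utilities that conflict with each other.
const DISPLAY_UTILITIES = new Set([
  "hidden", "block", "inline", "inline-block", "flex", "inline-flex", "grid",
]);

// Split "md:flex" into its variant prefix ("md") and utility ("flex").
function splitVariant(cls: string): { variant: string; utility: string } {
  const idx = cls.lastIndexOf(":");
  return idx === -1
    ? { variant: "", utility: cls }
    : { variant: cls.slice(0, idx), utility: cls.slice(idx + 1) };
}

// Hypothetical helper: for each variant, only the LAST display utility
// survives; all other classes pass through in their original order.
function mergeDisplayClasses(...inputs: Array<string | undefined>): string {
  const classes = inputs.flatMap((s) => (s ?? "").split(/\s+/).filter(Boolean));
  const kept: string[] = [];
  const seenVariants = new Set<string>();

  // Walk right to left so that later classes take precedence.
  for (const cls of [...classes].reverse()) {
    const { variant, utility } = splitVariant(cls);
    if (DISPLAY_UTILITIES.has(utility)) {
      if (seenVariants.has(variant)) continue; // an override already exists
      seenVariants.add(variant);
    }
    kept.unshift(cls);
  }
  return kept.join(" ");
}

// The Sidebar's own classes, then the className passed in by the Sheet.
console.log(mergeDisplayClasses("hidden md:flex", "flex w-full"));
// "md:flex flex w-full"
```

Here the unprefixed hidden is removed because the Sheet's flex comes later, while md:flex is kept because it belongs to a different variant.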
The Path Forward
With these significant changes deployed, our application is more robust and user-friendly than ever. The immediate next steps involve:
- Verification: A user re-crawling the BetrVG documents to confirm clean text extraction.
- Real Device Testing: Thoroughly testing the new mobile UI on physical devices to verify the horizontal tab bar, dialogs, table expand, and sidebar sheet behavior.
- Security Review: Considering RLS (Row Level Security) policies for the new crawl_jobs table.
- Code Cleanup: Consolidating duplicate scrollbar-hide utilities in globals.css.
- Polish Pass: A follow-up touch target polish pass to ensure all interactive elements meet the 44px minimum standard for mobile usability.
This session was a testament to the iterative nature of development, where critical bug fixes and major feature enhancements often go hand-in-hand with valuable lessons learned. It's rewarding to see the immediate impact on both the data quality our AI consumes and the user experience our customers enjoy.