nyxcore-systems

From White Screens to Gemini Streams: A Day in the Life of an LLM App Dev

Join me as I recount a recent development session, squashing elusive bugs, integrating Google Gemini, and learning invaluable lessons about robust error handling and API quirks.

Next.js · LLM · Gemini · Error Handling · SSE · Debugging · Frontend · Backend · TypeScript · Adapter Pattern

Ever have one of those development sessions where you tackle a mix of frustrating bugs and exciting new features? That was my morning. Kicking off around 06:30 UTC, the goal was clear: squash a particularly nasty white screen bug, get our discussion retry flow working flawlessly, and finally bring Google Gemini into our growing family of LLM providers. After a few hours of focused work, the dev server's purring, typechecks are clean, and a satisfying list of 'done' items is staring back at me. Let's dive into the trenches of what went down.

The Bug Hunt: Taming the White Screen of Death

The dreaded white screen. It's every developer's nightmare, especially when it's elusive. Our persona selection in the workflow (workflows/[id]/page.tsx) was occasionally throwing a blank page, offering no clues, no console errors – just emptiness. My initial attempts at static analysis and code inspection felt like searching for a needle in a haystack made of other needles. I knew roughly where the issue was, but the exact cause remained hidden.

Lesson Learned: Embrace the ErrorBoundary (and don't trust stale caches!)

When direct inspection fails, you need a safety net. My first, and most crucial, step was to implement a global ErrorBoundary component (src/components/error-boundary.tsx) wrapping our main dashboard content in src/app/(dashboard)/layout.tsx. This immediately turned the 'white screen of death' into a 'graceful error message with a stack trace,' which is infinitely more useful for debugging.

tsx
// src/components/error-boundary.tsx (simplified for illustration)
import React, { Component, ErrorInfo, ReactNode } from 'react';

interface Props {
  children?: ReactNode;
}

interface State {
  hasError: boolean;
  error?: Error;
}

class ErrorBoundary extends Component<Props, State> {
  public state: State = { hasError: false };

  public static getDerivedStateFromError(error: Error): State {
    // Update state so the next render will show the fallback UI.
    return { hasError: true, error };
  }

  public componentDidCatch(error: Error, errorInfo: ErrorInfo) {
    console.error("Uncaught error:", error, errorInfo);
    // In a real app, you'd log this to a service like Sentry or Bugsnag
  }

  public render() {
    if (this.state.hasError) {
      return (
        <div className="flex flex-col items-center justify-center h-full p-4 text-center">
          <h1 className="text-2xl font-bold text-red-600">Oops! Something went wrong.</h1>
          <p className="mt-2 text-gray-700">Please try refreshing. If the issue persists, contact support.</p>
          {this.state.error && (
            <pre className="mt-4 p-2 bg-gray-100 rounded text-sm text-left max-w-lg overflow-auto">
              {this.state.error.message}
            </pre>
          )}
        </div>
      );
    }
    return this.props.children;
  }
}

export default ErrorBoundary;
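
Wiring it in is then a one-line change to the layout. A minimal sketch, assuming the dashboard layout simply renders its children (the real src/app/(dashboard)/layout.tsx has more structure, and the '@/' import alias is assumed):

tsx
// src/app/(dashboard)/layout.tsx (hypothetical simplified version)
import { type ReactNode } from 'react';
import ErrorBoundary from '@/components/error-boundary'; // path alias assumed

export default function DashboardLayout({ children }: { children: ReactNode }) {
  return (
    <ErrorBoundary>
      {/* sidebar, header, and the rest of the dashboard chrome */}
      {children}
    </ErrorBoundary>
  );
}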

With the ErrorBoundary in place, the problem became clearer. It wasn't one single issue, but a confluence of factors:

  • A duplicate refetch() call in an inline onSuccess handler (around line 937 in workflows/[id]/page.tsx) was likely creating a race condition, as a global onSuccess (line 182) already handled it. Removing the redundant call helped stabilize the component.
  • Missing null guards (?? []) on step.compareProviders and step.comparePersonas were potential culprits for undefined access when data wasn't fully loaded or was unexpectedly empty (see the snippet after this list).
  • Crucially, a stale .next cache was hiding an ambiguous SQL column error from a previous session. Clearing the cache and restarting the dev server was key to resolving this underlying issue, which the ErrorBoundary then surfaced.
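
The null-guard change itself is tiny. A hypothetical before/after, where renderProvider and renderPersona stand in for the real render logic:

tsx
// Before: throws if the field is undefined while data is still loading
step.compareProviders.map(renderProvider);

// After: degrades to an empty list instead of a white screen
(step.compareProviders ?? []).map(renderProvider);
(step.comparePersonas ?? []).map(renderPersona);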

Takeaway: Never underestimate the power of a good ErrorBoundary for debugging. It transforms silent failures into actionable errors. Also, when chasing phantom bugs, remember to clear your build caches (.next in Next.js) – sometimes the problem isn't in your current code, but in artifacts from previous runs or even database schema changes that haven't propagated correctly.

Rethinking Retry: The SSE Auto-Pilot Problem

Next up was a seemingly simple feature: 'retry discussion with another provider.' Users could switch LLMs mid-conversation if they weren't happy with the output. However, this flow was silently failing. The discussion would immediately mark itself as 'done' without generating new content.

Lesson Learned: Pay Attention to API Flags and State Management

The root cause here was subtle but critical. Our Server-Sent Events (SSE) connection, used for streaming LLM responses, was being re-established without a crucial auto=1 flag. Our backend processDiscussion() function expects the last message in a new round to be from the user to initiate a fresh turn. Without auto=1, the service saw the previous assistant messages, assumed the round was complete, and returned 'done' immediately, effectively short-circuiting the retry.
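
In rough pseudocode, the short-circuit behaves like this (a sketch of the behavior described above, with a hypothetical loadMessages helper, not the actual processDiscussion() source):

typescript
// Conceptual sketch of the early return inside processDiscussion()
async function processDiscussion(discussionId: string, isAutoRound: boolean) {
  const messages = await loadMessages(discussionId); // hypothetical helper
  const last = messages[messages.length - 1];

  // Without auto=1, a transcript ending in assistant messages reads as "round complete"
  if (!isAutoRound && last?.role !== 'user') {
    return { status: 'done' }; // the silent retry failure
  }

  // ...otherwise generate and stream the next assistant turn...
}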

The fix was straightforward: ensure setIsAutoRound(true) is called before incrementing the sseKey (which triggers the SSE reconnection) in the retry onSuccess callback. This injects the auto=1 flag into the SSE URL, signaling the backend to start a new, 'auto-generated' round properly.

typescript
// discussions/[id]/page.tsx (conceptual snippet around line 584)
const retryDiscussionMutation = useMutation({
  // ... configuration for the mutation ...
  onSuccess: () => {
    // CRITICAL: Ensure we signal an auto-round before reconnecting SSE
    setIsAutoRound(true); 
    setSseKey(prev => prev + 1); // Triggers SSE reconnection with updated state
  },
  // ... other handlers ...
});
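
For completeness, here is roughly how the flag reaches the server. A hypothetical sketch of the SSE wiring, assuming an EventSource that is recreated whenever sseKey changes (the endpoint path and appendToken handler are illustrative):

typescript
// Inside the discussion page component: reconnect the stream whenever sseKey is bumped
useEffect(() => {
  const url = `/api/discussions/${discussionId}/stream${isAutoRound ? '?auto=1' : ''}`;
  const source = new EventSource(url);
  source.onmessage = (event) => appendToken(JSON.parse(event.data)); // hypothetical handler
  return () => source.close(); // tear down the old connection before reconnecting
}, [sseKey]); // bumping the key forces a reconnect that reads the fresh isAutoRound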

Takeaway: When dealing with stateful APIs and streaming protocols like SSE, every parameter matters. A missing flag can completely alter the server's interpretation of your request, leading to unexpected behavior. Meticulous attention to detail in state management and API contracts is paramount. Always verify that all necessary parameters are being passed, especially when re-establishing connections or re-triggering processes.

Feature Spotlight: Welcoming Google Gemini to the LLM Family

With the bugs tamed, it was time for some feature fun: integrating Google Gemini. We use an adapter pattern for our LLM providers, which makes adding new models relatively straightforward, but each LLM has its quirks.

Our src/server/services/llm/adapters/google.ts file went from a stub to a full-fledged implementation:

  • Completion: Handled via Gemini's generateContent endpoint for single-turn responses.
  • Streaming: Utilized streamGenerateContent?alt=sse for real-time, token-by-token responses, providing a dynamic user experience.
  • Role Mapping: Gemini uses model for assistant turns, so we mapped our internal assistant role accordingly to maintain consistency across providers.
  • Message Merging: A unique Gemini requirement is merging consecutive messages from the same role into a single parts array. Our adapter now handles this automatically (see the sketch after this list).
  • User-First: Ensured the first message in any conversation always originates from the user, another Gemini best practice.
  • System Prompt: Passed via systemInstruction, which is separate from the main contents array, allowing for clear separation of general instructions.
  • Token Usage: Extracted from usageMetadata for accurate billing, rate limiting, and analytics.
  • Default Model: Set to gemini-2.0-flash for a good balance of speed and capability, with the option to configure other models.
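
To make the role mapping, merging, and user-first rules concrete, here is a condensed sketch of the request-shaping logic, with hypothetical internal types (the real adapter layers streaming and error handling on top of this):

typescript
// Hypothetical internal message shape; GeminiContent mirrors the real wire format.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };
type GeminiContent = { role: 'user' | 'model'; parts: { text: string }[] };

function toGeminiContents(messages: ChatMessage[]): GeminiContent[] {
  const contents: GeminiContent[] = [];
  for (const msg of messages) {
    if (msg.role === 'system') continue; // system prompt travels separately via systemInstruction
    const role = msg.role === 'assistant' ? 'model' : 'user'; // role mapping
    const last = contents[contents.length - 1];
    if (last?.role === role) {
      last.parts.push({ text: msg.content }); // merge consecutive same-role messages
    } else {
      contents.push({ role, parts: [{ text: msg.content }] });
    }
  }
  // Gemini expects the conversation to open with a user turn
  if (contents[0]?.role === 'model') {
    contents.unshift({ role: 'user', parts: [{ text: '(continue)' }] }); // hypothetical placeholder
  }
  return contents;
}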

This integration highlights the value of a well-designed adapter pattern. While each LLM API has its nuances (message formats, streaming methods, role conventions), the adapter abstracts these away, allowing the rest of our application to interact with a consistent interface, simplifying future integrations.
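
The contract each adapter implements might look something like this (a simplified sketch; the actual interface under src/server/services/llm/adapters/ likely carries more options):

typescript
// ChatMessage as in the previous sketch
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Simplified sketch of a provider-agnostic adapter contract
interface LLMAdapter {
  complete(
    messages: ChatMessage[],
    options?: { model?: string },
  ): Promise<{ text: string; usage?: { promptTokens: number; completionTokens: number } }>;

  stream(
    messages: ChatMessage[],
    onToken: (token: string) => void,
    options?: { model?: string },
  ): Promise<void>;
}

Adding a provider like Gemini (or, next, Ollama) then becomes a matter of registering one more implementation of this interface rather than touching call sites.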

The "Done" List & QA Confirmations

Beyond these core tasks, several other items were confirmed fixed and working correctly by QA, bringing a great sense of accomplishment:

  • Persona portraits now display as expected on both overview and detail pages.
  • Project Notes tab CRUD operations are fully functional.
  • The sidebar heartbeat animation is pulsing beautifully.
  • The Analytics dashboard Memory Intelligence panel is populating data.
  • Mermaid diagrams in the Docs tab are rendering perfectly.
  • Workflow creation with the project selector and compare persona labels is smooth.

Looking Ahead: What's Next on the Horizon?

While this session wraps up a significant chunk of work, the journey continues. Immediate next steps involve thoroughly testing the new Google Gemini integration (users need to configure their BYOK API key in the admin vault), validating the discussion retry fix, and confirming the persona selection white screen is permanently banished. After that, it's onto implementing the Ollama provider (src/server/services/llm/adapters/ollama.ts is still a stub!), refining RLS policies for our project_notes table, and a general cleanup of our .gitignore and safeEnqueue audits.

Conclusion

It's moments like these – wrestling with complex bugs, architecting new features, and seeing everything click into place – that make software development so rewarding. Each challenge is a learning opportunity, reinforcing the importance of robust error handling, meticulous API interaction, and a well-structured codebase. Here's to shipping more great software!
