nyxcore-systems

From 18 Plans to Production-Ready: Building a Smart Project Onboarding Workflow

We tackled a monumental project: implementing a comprehensive, LLM-powered onboarding workflow for new projects, integrating code analysis, GitHub, and real-time feedback. Here's the story of the build, the hurdles we overcame, and the crucial lessons learned.

Prisma, tRPC, Next.js, LLM, GitHub API, Project Onboarding, Software Architecture, Developer Productivity, Lessons Learned

Building complex features is rarely a straight line. It's a journey of architectural decisions, integration challenges, and the occasional head-scratching moment. Recently, I wrapped up a significant chunk of work for Nyxcore Systems: bringing to life a robust, intelligent project onboarding experience. Our goal was ambitious: take 18 distinct project-onboarding plans – ranging from database schema to real-time UI – and weave them into a seamless, automated workflow.

After a focused sprint, I'm thrilled to report that all 18 plans are implemented, committed across three logical batches, and currently awaiting review in PR #134. This post dives into the "how," the "what," and crucially, the invaluable "lessons learned" along the way.

The Mission: A Smarter Project Onboarding

Imagine a system where you point it to a GitHub repository, and it automatically detects the tech stack, scans the files, generates a project summary, and even drafts an initial README.md. That's the vision behind our new project onboarding workflow. It's about reducing friction, providing immediate value, and setting the stage for deeper analysis.

This wasn't just about building a UI; it involved deep backend integrations, real-time communication, and leveraging the power of Large Language Models (LLMs) to understand and summarize codebases.

The Journey: Building It Piece by Piece

To manage the complexity, I broke the work down into three distinct batches, each building upon the last.

Batch 1: Laying the Foundation (Commit 2277151)

The first step was establishing the core data structures and initial UI hooks.

  • Data Models & Security: We introduced ProjectOnboarding and ProjectFile Prisma models, immediately securing them with Row-Level Security (RLS) policies. This ensures that project data is isolated and accessible only to authorized users from day one.
  • UI Integration Point: A new "Letters" tab on the project detail page provided an early home for onboarding-related information.
  • Demo Data: A DEMO_PROJECTS catalog was crucial for local development and testing, populating the UI with sample data quickly.
  • RAG Sanitizer: With LLMs involved, security is paramount. We implemented a src/lib/security/rag-sanitizer.ts with 9 dedicated tests to ensure that any data fed into or returned from the LLM is clean and safe.
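The actual rag-sanitizer.ts isn't reproduced in this post, but the idea is simple enough to sketch. Here's a minimal, hypothetical version (the function name, injection patterns, and length cap are illustrative assumptions, not our real rules):

```typescript
// Hypothetical sketch of an input sanitizer for a RAG pipeline.
// The real src/lib/security/rag-sanitizer.ts is more thorough;
// the patterns and limits below are illustrative only.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all )?previous instructions/gi,
  /disregard (the )?system prompt/gi,
];

function sanitizeForRag(input: string, maxLength = 8_000): string {
  // Strip control characters that can confuse downstream parsers.
  let out = input.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, "");
  // Neutralize common prompt-injection phrases.
  for (const pattern of INJECTION_PATTERNS) {
    out = out.replace(pattern, "[removed]");
  }
  // Cap length so a single file can't blow the LLM context window.
  return out.slice(0, maxLength);
}
```

The same function runs on data going into the LLM and on anything coming back, which is why it earned its own test suite.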

Batch 2: The Core Workflow Engine (Commit 23b6680)

This batch brought the onboarding process to life, integrating external services and setting up the analysis pipeline.

  • The Onboarding Wizard: The user-facing heart of the system, located at /dashboard/projects/new, guides users through three intuitive steps: Source (connecting to GitHub), Configure (setting up analysis options), and Analyze (triggering the pipeline).
  • GitHub Integration: We implemented mutations for GitHub repository creation and branch listing, allowing users to connect their code directly. This involved careful handling of GitHub API tokens.
  • File Scanning & Tech Detection: Two critical services emerged: project-scanner-service.ts to crawl repository files, and tech-detector.ts (with 18 tests) to identify the underlying technologies of a project (e.g., React, Node.js, Python).
  • Real-time Feedback: An SSE (Server-Sent Events) endpoint at /api/v1/events/onboarding/[projectId] provides real-time updates to the user as their project progresses through the analysis pipeline.
  • Analysis Orchestration: The analysis pipeline orchestrator now coordinates the scanner, tech detector, and future analysis steps, complete with an analysis watchdog to prevent runaway processes (15-minute timeout).
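To give a feel for the tech detector, here's a deliberately simplified sketch. The real tech-detector.ts also inspects file contents and dependency manifests; the marker table and function below are hypothetical:

```typescript
// Hypothetical filename-based tech detection; the real tech-detector.ts
// goes deeper (dependency parsing, content heuristics, 18 tests' worth).
const MARKERS: Record<string, string> = {
  "package.json": "Node.js",
  "tsconfig.json": "TypeScript",
  "next.config.js": "Next.js",
  "requirements.txt": "Python",
  "Cargo.toml": "Rust",
  "go.mod": "Go",
};

function detectTech(filePaths: string[]): string[] {
  const found = new Set<string>();
  for (const path of filePaths) {
    // Match on the basename so nested manifests still count.
    const name = path.split("/").pop() ?? path;
    const tech = MARKERS[name];
    if (tech) found.add(tech);
  }
  return [...found].sort();
}
```

The scanner feeds its file list into this step, and the results drive which downstream analyses the orchestrator schedules.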

Batch 3: User Experience & LLM Magic (Commit e44f5d7)

The final batch focused on enhancing the user experience and integrating the LLM's generative capabilities.

  • Code Exploration: getFileTree and the FilesTab component provide a navigable file explorer within the UI, giving users insight into their project's structure.
  • LLM-Powered Summaries: This is where the "smart" part comes in. We integrated services for LLM project summary generation and LLM README skeleton generation, leveraging the LLM to understand and articulate the essence of a codebase.
  • UI Components: A custom markdown editor component with templates and a branch picker component were added for user interaction and configuration.
  • Project Configuration: Mutations like setDefaultBranch and a new summary field on the Project model allow users to fine-tune their project settings and store LLM-generated content.
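For the file explorer, the core trick is turning flat paths into a nested tree. A minimal sketch of the idea (the real getFileTree reads from the scanned ProjectFile records; this shape and helper name are assumptions):

```typescript
// Hypothetical sketch: build a nested tree from flat file paths.
interface TreeNode {
  name: string;
  children?: TreeNode[]; // present for directories only
}

function buildFileTree(paths: string[]): TreeNode[] {
  const root: TreeNode[] = [];
  for (const path of paths) {
    let level = root;
    const parts = path.split("/");
    parts.forEach((part, i) => {
      let node = level.find((n) => n.name === part);
      if (!node) {
        // Intermediate segments are directories; the last one is a file.
        node = i < parts.length - 1 ? { name: part, children: [] } : { name: part };
        level.push(node);
      }
      level = node.children ?? [];
    });
  }
  return root;
}
```

The FilesTab component then just renders this structure recursively.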

The entire feature lives on the feat/project-onboarding branch, culminating in PR #134. All 27 new tests (9 for the RAG sanitizer, 18 for the tech detector) are passing, and TypeScript compiles cleanly – a testament to a robust development process.

Navigating the Minefield: Lessons from the "Pain Log"

No significant feature build goes without its challenges. These moments of friction often provide the most valuable lessons.

Lesson 1: LLM-Generated Plans are a Starting Point, Not Gospel

The Problem: I experimented with using LLM-generated plan file paths and patterns directly. The results were... educational. The LLM suggested paths that didn't match our actual src/server/trpc/routers/ structure (it preferred src/server/api/routers/), used incorrect authentication patterns (ctx.session.user.id instead of our ctx.user.id), and even proposed recreating existing infrastructure like auth middleware, SSE hooks, and rate limiting.

The Insight: LLMs are phenomenal for generating ideas, boilerplate, or even high-level architectural concepts. However, they lack the deep, contextual understanding of your specific codebase's architecture, conventions, and existing utilities. They can't "read" your package.json or your trpc.ts file to understand the nuances of your setup.

The Takeaway: Always validate LLM-generated plans against your actual codebase before implementing. Treat them as intelligent suggestions, not executable blueprints. My workaround involved reading the actual codebase first and adapting every plan to extend existing patterns. This meant correctly using ctx.user.id, leveraging resolveGitHubToken() for BYOK tokens, integrating with our tabs-based InPageSidebar, and chaining existing protectedProcedure, mutationProcedure, and llmMutationProcedure tRPC handlers.

Actionable Tip: Before writing a single line of LLM-suggested code, mentally (or physically) map it to your existing directory structure, authentication flow, and utility functions. If it suggests rebuilding something you already have, ask the LLM to integrate with your existing solution instead.
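The ctx.session.user.id mix-up is easiest to see side by side. These context shapes are minimal stand-ins for illustration, not our actual tRPC types:

```typescript
// Hypothetical, stripped-down context shapes to show the mismatch.
// What the LLM-generated plans assumed our tRPC context looked like:
interface LlmAssumedCtx { session: { user: { id: string } } }
// What our trpc.ts actually provides to protected procedures:
interface ActualCtx { user: { id: string } }

// The adapted code is written against the real shape, not the plan's.
function currentUserId(ctx: ActualCtx): string {
  return ctx.user.id; // plan suggested ctx.session.user.id, which doesn't exist here
}
```

Five minutes reading trpc.ts up front saves an hour of chasing type errors through eighteen plans.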

Lesson 2: Keep Up with Framework Migrations (tRPC v11 / React Query v5)

The Problem: I tried using keepPreviousData in tRPC v11's useQuery options, a common pattern from earlier versions.

The Insight: Frameworks evolve rapidly. What was valid in one version might be deprecated, renamed, or handled implicitly in a newer one. In this case, keepPreviousData is not a valid option in React Query v5 (which tRPC v11 uses internally).

The Takeaway: When upgrading major framework versions, always consult the migration guides and release notes. Trust your linter and TypeScript compiler – they're often the first to tell you when something's amiss. The solution was simply to remove the option, as tRPC v11 handles data caching differently.
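Worth noting: React Query v5 didn't drop the behavior entirely, it moved it. If you still want previous-page data during refetches, the v5 replacement is the keepPreviousData function passed to placeholderData. A self-contained sketch (in real app code you'd import keepPreviousData from @tanstack/react-query; here it's stubbed locally so the snippet stands alone):

```typescript
// React Query v5 replaces the boolean `keepPreviousData` option with a
// function passed to `placeholderData`. In app code:
//   import { keepPreviousData } from "@tanstack/react-query";
// Stubbed here so the sketch is self-contained: it returns the previous data.
const keepPreviousData = <T>(previousData: T | undefined): T | undefined =>
  previousData;

// v4-era options (invalid in v5):  { keepPreviousData: true }
// v5 / tRPC v11 equivalent:
const queryOptions = {
  placeholderData: keepPreviousData,
};
```

In our case simply removing the option was enough, but it's good to know the migration path exists when the UX depends on it.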

Lesson 3: Database Management & Local Dev Setup

The Problem: Running npx prisma db push locally failed with a P1001: Can't reach database server error.

The Insight: This is a classic "facepalm" moment. Database operations require a running database! My local Docker setup for PostgreSQL wasn't active.

The Takeaway: Ensure your local development environment is fully initialized. For database-reliant projects, this often means running npm run docker:up (or its equivalent) before attempting schema migrations. While the schema changes were committed, db push will need to be run on deployment. Document these steps clearly for new team members (and for your future self!).

Looking Ahead: The Road to Production

With PR #134 open, the immediate next steps are clear:

  1. Review and Merge: Get that PR reviewed and merged into main!
  2. Production Deployment: On production, we'll need to run npx prisma db push, npx prisma generate, and apply the updated RLS policies with psql -f prisma/rls.sql.
  3. Wire the Pipeline: Connect the currently stubbed analysis pipeline components (scanner → tech detector → summary generation) to their actual services.
  4. E2E Testing: Implement end-to-end tests for the entire onboarding wizard flow to ensure a robust user experience.
  5. UI Integration: Connect the new MarkdownEditor and BranchPicker components to the project settings UI for full configurability.
  6. BYOK Key Handling: Refine the userId propagation through resolveProvider() calls in summary/readme services to correctly handle personal "Bring Your Own Key" (BYOK) scenarios for LLM interactions.

This project was a fantastic challenge, blending full-stack development with the cutting edge of LLM integration. The lessons learned, especially regarding the nuanced use of LLM-generated code, will undoubtedly shape our future development practices. Onwards to a smarter, more automated future for Nyxcore Systems!

