nyxcore-systems
4 min read

Navigating the Superadmin Labyrinth: E2E Testing, Safe Migrations, and pgvector Gotchas

A deep dive into setting up robust E2E tests for critical superadmin flows, tackling a tricky Prisma/pgvector migration challenge, and ensuring production safety.

E2E TestingPlaywrightPrismapgvectorNext.jsSuperadminDatabase MigrationsTypeScript

Building robust applications often means confronting complex challenges head-on. Recently, our focus was squarely on fortifying our superadmin flows with end-to-end (E2E) tests, ensuring a seamless tenant switching experience, and — perhaps most critically — devising a bulletproof strategy for database migrations involving pgvector. This session was a deep dive into laying the groundwork for these crucial features, moving from design proposals to concrete implementation plans.

The Mission: Fortifying Critical Flows

Our primary goals for this sprint were clear:

  1. Implement E2E tests for superadmin flows: These are the most powerful and potentially destructive paths in our application, demanding rigorous testing.
  2. Create a safe production Prisma migration script for pgvector: Our use of pgvector with Prisma introduced a unique challenge during schema updates.
  3. Test the tenant switcher UX: Ensuring superadmins can seamlessly switch between tenant contexts and that data isolation remains perfect.

We'd just wrapped up the design phase, and with the user's final review pending, it was time to translate blueprints into actionable code.

Peeking Under the Hood: Our E2E & Superadmin Landscape

Before diving into new features, a thorough exploration of existing structures was essential. We delved into:

  • E2E Test Setup: Our existing Playwright configuration, the tests/e2e/helpers/auth.ts helper for injecting authenticated sessions, and the structure of our four existing spec files provided a solid foundation. The goal was to extend this for superadmin scenarios.
  • Superadmin & Tenant Switching: We examined src/components/layout/tenant-switcher.tsx for the UI, and src/server/trpc/routers/superadmin.ts and src/server/trpc/routers/teams.ts for the backend logic. Understanding the invitation service and how our auth.ts JWT callback handles roles was key to simulating superadmin access.

The Database Dragon: Prisma and pgvector Migrations

This is where things got particularly interesting, and frankly, a bit hairy. Our application heavily leverages pgvector for embedding storage, specifically in workflow_insights and document_chunks. While pgvector works beautifully for storing vector embeddings and even supports HNSW indexes for efficient similarity search, its interaction with Prisma's migration system presented a significant hurdle.

The Problem: prisma db push and the "Unsupported" Vector Type

Our development workflow traditionally relies on prisma db push for schema updates. It's fast, convenient, and usually gets the job done. However, when pgvector columns like embedding vector(1536) are present in the database, prisma db push sees them as Unsupported types in the Prisma schema (because Prisma doesn't natively recognize vector as a scalar type for migrations).

The terrifying consequence? prisma db push on production, when it detects these "unsupported" columns, wants to DROP them.

sql
ALTER TABLE "workflow_insights" DROP COLUMN "embedding";
ALTER TABLE "document_chunks" DROP COLUMN "embedding";

CRITICAL WARNING: Running prisma db push --accept-data-loss on production in this scenario would be catastrophic, irrevocably destroying all our valuable vector embedding data. This is a hard "never do."

The Solution: A Surgical Migration Script

To safely manage schema changes without losing vector data, we designed a custom, multi-step migration strategy:

  1. Generate SQL Diff: Use prisma migrate diff to generate the raw SQL needed to bring the database schema in sync with the Prisma schema.
    bash
    prisma migrate diff \
      --from-schema-datasource prisma-schema-prod.prisma \
      --to-schema-datasource prisma-schema-dev.prisma \
      --script > migration.sql
    
  2. Filter Out Dangerous Operations: Crucially, we'll parse migration.sql to identify and remove any DROP COLUMN "embedding" statements. This is the surgical strike that prevents data loss.
  3. Apply Safe SQL: Execute the filtered SQL script using psql directly against the production database.
  4. Re-run RLS: Our Row-Level Security (RLS) policies are defined in rls.sql. Since prisma db push bypasses Prisma's migration engine, it doesn't automatically reapply RLS. After manual schema changes, we must explicitly re-run rls.sql to ensure security policies are intact.

This scripts/db-migrate-safe.sh will become our new standard for production database updates involving pgvector. It's a bit more manual, but it guarantees data integrity.

Lessons Learned: Edge Runtime Gotchas

A quick but important reminder surfaced from a previous commit (54933e0): our src/server/auth.ts file runs in the Edge Runtime. This is fantastic for performance, but it comes with a strict limitation: never import Node.js-only modules (like crypto, fs, path) into this file. The Edge Runtime environment doesn't have access to these Node-specific APIs, leading to runtime errors. Keeping this in mind helps prevent subtle production bugs.

The Path Forward: Bringing It All Together

With the design approved and the migration strategy ironed out, our immediate next steps are clear:

  1. Get final user approval on the 3-task design proposal.
  2. Create tests/e2e/superadmin.spec.ts: Our dedicated E2E test suite for superadmin functionality, including tRPC mocking for isolated testing.
  3. Extend tests/e2e/helpers/auth.ts: Add an injectSuperAdminCookie() helper to easily simulate superadmin logins in our tests.
  4. Develop scripts/db-migrate-safe.sh: Implement the pgvector-safe migration script using prisma migrate diff and SQL filtering.
  5. Run E2E tests: Verify the new superadmin flows and ensure our tenant switcher UX is robust (npm run test:e2e).
  6. Confirm Tenant Switcher UX coverage: Ensure our superadmin E2E tests adequately cover the switching mechanism and data isolation across tenants.

This session was a crucial step in enhancing the reliability and safety of our application. By proactively addressing potential pitfalls like the pgvector migration challenge and implementing comprehensive E2E tests, we're building a more resilient and trustworthy system. Onwards to implementation!