The Midnight Marathon to Production: Bringing Nyxcore Live
A late-night push to get Nyxcore's production environment fully live, dealing with database quirks, user data migrations, and critical environment variable gotchas.
It was ~3:30 AM UTC. The kind of hour where the only sounds are the hum of your machine and the frantic tapping of your keyboard. The goal for this session was ambitious: take Nyxcore, my AI workflow orchestration platform, from a collection of local components to a fully live, production-ready service at https://nyxcore.cloud. This wasn't just about flipping a switch; it involved a full database import, fixing authentication issues, and verifying a major new feature: dual-provider AI model integration.
And I'm thrilled to say: Nyxcore is live. All data, all features, dual-provider code complete and deployed. But getting there was a journey through some classic production deployment minefields.
The Mission Brief: From Local to Live
The core objective was clear:
- Production Deployment: Get the entire application stack running stably on the target server.
- Database Import: Migrate all existing development data, including user accounts, workflows, and AI interactions, to the new production PostgreSQL instance.
- Authentication Fixes: Ensure user login and data ownership were seamless and correct.
- Dual-Provider Verification: Confirm that the newly integrated feature, allowing users to compare and select between multiple AI providers (like OpenAI, Anthropic, etc.), was fully operational.
This was the final sprint to make Nyxcore accessible to the world.
The "Done" List: Ticking Off the Milestones
The list of completed tasks tells a story of meticulous effort and problem-solving:
- Initial Data Seeding: Kicked off the production database seeding via a `nohup` process within a temporary Docker container. This ensures the foundational data is in place without blocking my terminal.
- Authentication Resurrected: Configured `RESEND_API_KEY` and `EMAIL_FROM=noreply@nyxcore.cloud` to get email-based authentication fully operational. Without this, no one gets in!
- Full Database Import: Executed a complete `pg_dump` from my local development environment and imported it into the production PostgreSQL (a sketch of the flow appears after this list). This included all tables and their associated Row-Level Security (RLS) policies – critical for multi-tenant data isolation.
- `pgvector` Recovery: After a `--clean` import (more on this pain later), the `embedding vector(1536)` column and the `pgvector` extension itself were wiped. I had to manually re-add the extension and the embedding column, then rebuild the HNSW index. A crucial step for AI-powered features.
- Persistent Avatars: Ensured all 89 persona avatars were correctly stored and mounted in the `nyxcore_persona_avatars` Docker volume. Small detail, big impact on user experience.
- User Data Reassignment: This was a big one. Due to how different auth providers handle user IDs, my GitHub-created user (`4891fd9e`) owned all the data locally, but the production Resend login created a new user (`49ff65a2`). I had to `UPDATE` the `userId` column across 30 tables to correctly assign all existing data to the now-primary email user.
- Critical Key Synchronization: Synced the `ENCRYPTION_KEY` across environments. This was vital to ensure all 5 stored API keys (Anthropic, GitHub, Google, Kimi, OpenAI) could be correctly decrypted and used by the application.
- Docker Volume Commit: Committed and deployed the `docker-compose.production.yml` changes for the persistent avatar volume (commit 3ddc9ea).
- Dual-Provider Verification: The star of the show! Verified that the dual-provider implementation was complete, integrated across 18+ files, and type-check clean.
- Final Production Build & Deploy: The ultimate green light.
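For reference, the export/import flow looked roughly like the sketch below. This is not my exact shell history: the connection-string variables and file paths are assumptions, and the real import used `--clean` (which caused the `pgvector` pain described later).

```bash
# Sketch only: $DEV_DATABASE_URL, $PROD_DATABASE_URL, and paths are assumptions.
# 1. Export everything from local development (custom format keeps it compact).
pg_dump --no-owner --format=custom "$DEV_DATABASE_URL" -f nyxcore_dev.dump

# 2. Ship the dump to the server.
scp nyxcore_dev.dump nyxcore.cloud:/opt/nyxcore/

# 3. On the server, restore into the production database.
pg_restore --no-owner --dbname="$PROD_DATABASE_URL" /opt/nyxcore/nyxcore_dev.dump
```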
Lessons from the Trenches: The "Pain Log" Transformed
Not everything went smoothly. Production deployments rarely do. Here are the key lessons learned, often the hard way:
1. Docker Environment Variable Reloading (The Classic)
- The Pitfall: Making changes to `.env.production` and expecting `docker compose restart app` to pick them up.
- The Reality: `docker compose restart` just restarts the container with its existing environment. It doesn't re-read the `.env` file.
- The Takeaway: When environment variables change, you need a full recreation:

```bash
docker compose up -d --force-recreate app
```

This forces Docker to tear down and recreate the container, ensuring it picks up the latest env file.
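To confirm the recreation actually took, print the variables from inside the running container; a quick check along these lines (service name `app` as in the compose commands above):

```bash
# Verify the recreated container sees the new values.
docker compose exec app printenv | grep -E 'RESEND_API_KEY|EMAIL_FROM'
```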
2. `pg_dump --clean` and Custom Schema Elements
- The Pitfall: Using `pg_dump --clean` for a full database import, thinking it's a safe "reset."
- The Reality: `--clean` drops tables before recreating them. While useful for a pristine import, it wipes out manually added extensions or columns that weren't part of the original schema definition when the tables were created. In my case, it deleted the `pgvector` extension and the `embedding vector(1536)` column.
- The Takeaway: Be extremely cautious with `--clean` if your database has custom extensions, specific column types, or indexes that aren't part of your standard schema migration (e.g., if you added `pgvector` manually post-migration). For such cases, it's often safer to:
  - Import without `--clean` if you're sure the schema is compatible.
  - Or, be prepared to re-run specific DDL (Data Definition Language) commands after the import:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
ALTER TABLE your_table ADD COLUMN IF NOT EXISTS embedding vector(1536);
-- Re-add HNSW index if wiped
CREATE INDEX IF NOT EXISTS your_index_name ON your_table USING hnsw (embedding vector_l2_ops);
```
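After an import like this, a few catalog queries confirm whether the extension, column, and index survived; a minimal sanity check, reusing the `your_table`/`your_index_name` placeholders from above:

```sql
-- Did the extension survive the import?
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';

-- Is the embedding column present with the right type?
SELECT column_name, udt_name FROM information_schema.columns
WHERE table_name = 'your_table' AND column_name = 'embedding';

-- Does the HNSW index still exist?
SELECT indexname, indexdef FROM pg_indexes WHERE indexname = 'your_index_name';
```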
3. User ID Mismatch Across Auth Providers
- The Pitfall: Assuming user IDs will magically align when migrating data and switching/adding authentication providers.
- The Reality: My local development user, created via GitHub OAuth, had one UUID (`4891fd9e`). On production, using Resend for email authentication, a new user was created with a different UUID (`49ff65a2`), even though it was the "same" me. All my local data was tied to the GitHub ID.
- The Takeaway: When dealing with multiple authentication providers or migrating user data, always verify how user IDs are generated and matched. If there's a mismatch, you'll need to perform data reassignment. This involved a series of `UPDATE` statements across 30 tables:

```sql
-- Example for one table
UPDATE "Workflow" SET "userId" = '49ff65a2' WHERE "userId" = '4891fd9e';
-- Repeat for all tables that have a "userId" column
```

This is a critical step for data integrity and ensuring users see their data.
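Hand-writing thirty `UPDATE` statements is error-prone, so it's worth letting the catalog generate them. A sketch of that approach (the shortened IDs from above stand in for the full UUIDs; review the generated statements before executing anything):

```sql
-- Generate one UPDATE per public table that has a "userId" column.
-- Inspect the output, then run it inside a transaction.
SELECT format(
  'UPDATE %I SET "userId" = %L WHERE "userId" = %L;',
  table_name, '49ff65a2', '4891fd9e'
)
FROM information_schema.columns
WHERE table_schema = 'public' AND column_name = 'userId';
```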
4. Missing Production Dependencies (The "Works on My Machine" Problem)
- The Pitfall: Assuming all local development tools are present in the production Docker container.
- The Reality: My `blog_gen.py` endpoint failed on production with `spawn /app/.venv/bin/python3 ENOENT` (no such file or directory). The production Docker image was optimized and didn't include Python, as the core application is built with TypeScript/Node.js.
- The Takeaway: Production Docker images should be lean and only include what's absolutely necessary for the application to run. If specific scripts or tools are needed (even for secondary endpoints), they must be explicitly added to the Dockerfile. For now, `blog_gen.py` remains a local/CI-only tool, or I'll need to add Python to the production container in a future iteration.
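A cheap guard against this class of surprise is to probe the image for the interpreter before wiring an endpoint to it; something like (service name `app` assumed):

```bash
# Fail fast if the production image lacks the interpreter the endpoint spawns.
docker compose exec app sh -c 'command -v python3 || { echo "python3 missing"; exit 1; }'
```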
Active State & The Road Ahead
Nyxcore is now live at https://nyxcore.cloud, running on 46.225.232.35 in /opt/nyxcore. All critical environment variables, including `ENCRYPTION_KEY` and `RESEND_API_KEY`, are correctly configured. The dual-provider feature is fully implemented and awaiting end-to-end testing.
Immediate next steps include:
- End-to-End Dual-Provider Testing: Create a workflow with two providers using `compareProviders` and `dualProviderAutoSelect` to fully test the new feature on production.
- Blog Gen Endpoint Decision: Either add Python to the Dockerfile or disable the `blog_gen` endpoint in production.
- Certbot Auto-Renewal: Set up a cron job for SSL certificate auto-renewal (a sketch follows this list).
- SSHD Configuration: Fix `sshd`'s `MaxStartups` on the server to prevent frustrating connection drops.
- CI/CD Setup: Implement GitHub Actions or a post-push deploy hook for automated deployments.
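For the Certbot item, the crontab entry will likely look something like the sketch below; it assumes certbot runs on the host and that nginx is a compose service named `nginx` (both assumptions):

```bash
# /etc/cron.d/certbot-renew (sketch): try twice daily; certbot only renews
# certificates that are actually close to expiry.
0 3,15 * * * root certbot renew --quiet --deploy-hook "docker compose -f /opt/nyxcore/docker-compose.production.yml restart nginx"
```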
Reflection
This session was a testament to the fact that getting a complex application fully live is rarely a linear path. It's a dance of technical execution, meticulous data handling, and persistent problem-solving. Each "pain" point was a learning opportunity, hardening the system and my understanding of its intricacies.
Now, with Nyxcore truly alive, the real fun begins: building, iterating, and watching it grow.
```json
{
"thingsDone": [
"Production deployment of Nyxcore at https://nyxcore.cloud",
"Full database import from local development to production PostgreSQL",
"Enabled email authentication via Resend",
"Re-added pgvector extension and embedding columns after import",
"Ensured persistent storage for persona avatars via Docker volume",
"Reassigned all user data across 30 tables to the correct production user ID",
"Synchronized critical API encryption keys across environments",
"Verified full integration and type-check cleanliness of dual-provider feature",
"Successful final production build and deploy"
],
"pains": [
"Docker environment variables not reloading with `docker compose restart`",
"`pg_dump --clean` wiping custom `pgvector` columns and extension",
"User ID mismatch between local GitHub OAuth and production email authentication requiring data reassignment",
"Production Docker container missing Python for `blog_gen.py` endpoint"
],
"successes": [
"Production environment fully operational and live",
"All development data successfully migrated and accessible",
"Authentication system working correctly",
"Dual-provider feature code fully deployed and awaiting final test",
"Learned critical lessons about Docker, PostgreSQL imports, and user data management in production"
],
"techStack": [
"Docker",
"Docker Compose",
"PostgreSQL",
"pgvector",
"TypeScript",
"Node.js",
"Resend (Email API)",
"GitHub (OAuth)",
"OpenAI (API)",
"Anthropic (API)",
"Google (API)",
"Kimi (API)",
"Python (for specific scripts)",
"Linux (Ubuntu server)",
"Nginx (implied for web server)"
]
}
```