nyxcore-systems
4 min read

Beyond Stubs: Our AI Assistant Now Writes Gold-Standard TypeScript Implementations

We tackled the challenge of getting an AI to generate complete, production-ready TypeScript code, moving past generic stubs to match and even exceed our hand-crafted gold standard for a 'rent-a-persona' feature.

AI, LLM, Code Generation, TypeScript, Development Workflow, Gemini, Engineering, Prompt Engineering

The dream of AI-assisted development often conjures images of perfectly formed, production-ready code springing forth from a simple prompt. The reality, however, can sometimes be a little less magical: stubs, partial implementations, or even completely off-topic suggestions. Our recent development sprint was all about closing this gap, pushing our internal AI workflow to generate not just code, but gold-standard, complete TypeScript implementations that truly understood the bigger picture.

The Challenge: Bridging the Vision Gap

Our internal development workflow leverages powerful LLMs to auto-generate implementation prompts: detailed plans our developers use to build specific features. The goal was ambitious: make these AI-generated prompts indistinguishable from (or even better than) a meticulously hand-crafted one. Specifically, we aimed for complete TypeScript code, with codebase-grounded paths and no frustrating stubs.

However, we hit a roadblock. Previous workflow runs for our new "rent-a-persona" feature kept producing implementation prompts for a billing system. While billing is crucial, it definitely wasn't the core feature we were trying to build!

The root cause became clear: our implementation prompt generator, while excellent at analyzing individual workflow steps, was suffering from a kind of LLM myopia. It saw the trees (individual tasks like "handle authentication" or "manage data"), but it couldn't see the forest: the overarching feature goal of "build a rent-a-persona API." It would analyze supporting infrastructure (like security or audit logs) and infer that the primary task was something generic, rather than the specific, user-facing feature.

The Breakthrough: Injecting Intent

The solution, once identified, felt elegantly simple: we needed to explicitly inject the workflow's goal directly into the implementation prompt generator.

Here's how we did it:

  1. Model Upgrade: First, we ensured our system had access to the most capable models for this task. We added gemini-2.5-pro to our MODEL_CATALOG, recognizing its advanced reasoning capabilities would be crucial.
  2. Explicit Goal Injection: We introduced a new workflowGoal field into our PromptInputParams. This field takes the workflow.name and workflow.description and renders them prominently at the top of the user message to the LLM as a dedicated # FEATURE GOAL section.
  3. Workflow Integration: We hooked this up in our workflow-engine.ts, ensuring that the overall workflow's name and description were passed down to the implementation prompt builder.
  4. Testing & Verification: Naturally, all relevant unit tests were updated to reflect this new workflowGoal parameter, ensuring robustness.

This change meant the LLM now received a clear, unambiguous directive: "Synthesize ALL steps into a single cohesive plan for this specific feature."
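The steps above can be sketched in TypeScript. This is an illustrative reconstruction, not our actual code: the `PromptInputParams` and `workflowGoal` names come from the change described, but the surrounding types and the `buildImplementationPrompt` helper are assumptions for the sake of a self-contained example.

```typescript
// Hypothetical sketch of the goal-injection change. PromptInputParams and
// workflowGoal match the names in the text; everything else is illustrative.
interface WorkflowInfo {
  name: string;
  description: string;
}

interface PromptInputParams {
  stepDigests: string[];
  // New field: carries workflow.name and workflow.description down
  // from the workflow engine to the prompt builder.
  workflowGoal?: WorkflowInfo;
}

function buildImplementationPrompt(params: PromptInputParams): string {
  const sections: string[] = [];

  // Render the overall goal first, so the LLM sees the forest before the trees.
  if (params.workflowGoal) {
    sections.push(
      `# FEATURE GOAL\n${params.workflowGoal.name}\n${params.workflowGoal.description}`
    );
  }

  sections.push("# WORKFLOW STEPS", ...params.stepDigests);
  sections.push(
    "Synthesize ALL steps into a single cohesive plan for this specific feature."
  );
  return sections.join("\n\n");
}
```

The key design choice is placement: the goal section leads the user message, so every downstream step digest is interpreted in its light rather than in isolation.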

The Results: Gold Standard (and Beyond!)

The impact was immediate and dramatic. With workflow ddd599a1, the auto-generated implementation prompt clocked in at a staggering 610 lines of complete, high-quality TypeScript code. This wasn't just a plan; it was a near-complete blueprint covering:

  • A full persona completion API
  • Robust authentication middleware
  • Streaming capabilities
  • Prompt injection defense mechanisms
  • tRPC router integration
  • Dashboard UI considerations

This output not only matched our hand-crafted gold standard reference (which was 477 lines) but exceeded it in completeness and detail. We had moved beyond generic stubs to a truly codebase-grounded, production-ready implementation plan.

The primary provider behind this result was google (gemini-2.5-pro), leveraging its large context window (maxTokens: 16384 for the API call, MAX_TOTAL_CONTEXT: 60000 for input assembly) to process the extensive workflow details.
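For concreteness, the two budgets play different roles: one caps the model's output per call, the other caps the assembled input context. A minimal sketch, assuming a `MODEL_CATALOG` shape and a `fitContext` helper that are illustrative rather than our real implementation:

```typescript
// Illustrative config sketch; the MODEL_CATALOG entry shape is assumed.
const MAX_TOTAL_CONTEXT = 60_000; // budget for assembled input context

const MODEL_CATALOG = {
  "gemini-2.5-pro": {
    provider: "google",
    maxTokens: 16_384, // per-call output budget passed to the API
  },
} as const;

// Trim assembled workflow context to the global budget before the call (sketch).
function fitContext(input: string): string {
  return input.length > MAX_TOTAL_CONTEXT
    ? input.slice(0, MAX_TOTAL_CONTEXT)
    : input;
}
```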

Lessons Learned: Navigating the Bumps

Even with a breakthrough, development always comes with its share of practical challenges:

  • Context is King: The most significant lesson was the critical importance of providing explicit, high-level context to LLMs. Without the workflowGoal, even the most advanced models can get lost in the weeds. This reinforced the power of thoughtful prompt engineering.
  • Operational Hiccups: During deployment, we encountered a docker container name conflict. A quick docker rm -f followed by docker compose up -d app resolved it, reminding us that even sophisticated AI systems rely on solid DevOps fundamentals.
  • Security First: A critical reminder came when a user's Google API key was inadvertently exposed in chat. This highlighted the continuous need for vigilance and user education on securing sensitive credentials. (Action: the user was reminded to rotate the key immediately.)
  • Resource Management: Our Anthropic credits were depleted, causing some ancillary services like step digest compression and consistency checks to fail. This underscored the dependency on a balanced multi-model strategy and consistent resource provisioning for all workflow components.
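The container-name conflict above was cleared with the two commands mentioned in the list. As a sketch (the container name `app` is assumed for illustration):

```shell
# Force-remove the conflicting container by name (name assumed here),
# then recreate the app service in detached mode.
docker rm -f app
docker compose up -d app
```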

What's Next?

With this major milestone achieved, our immediate next steps include:

  1. Security Action: The user needs to rotate their exposed Google API key.
  2. Real-World Application: We're excited to use the auto-generated implementation prompt from workflow ddd599a1 to actually build out the "rent-a-persona" feature.
  3. Restore Services: Top up Anthropic credits to restore full functionality for our step digest and consistency checks.
  4. Model Expansion (Optional): Explore adding more cutting-edge models like gemini-3-pro-preview and gemini-3.1-pro-preview to our MODEL_CATALOG to further enhance capabilities.

This journey has been a testament to the power of targeted prompt engineering and the continuous refinement of our AI-assisted development workflows. We're excited to continue pushing the boundaries of what's possible, generating not just code, but truly intelligent and complete solutions.