nyxcore-systems

From Gender-Neutral Agents to Real-time UI: Deep Dive into AI Workflow Development

Join us as we refine our AI workflow system, tackling agent templating, enhancing UI feedback, and debugging critical integration points – a candid look at building robust LLM applications.

LLM · AI Agents · Next.js · tRPC · SSE · Developer Experience · UI/UX · Workflow Automation

Welcome back to the dev trenches! This post pulls back the curtain on our third major development session, where we're pushing hard to refine the core of our AI workflow system. The goal? To make our LLM-powered agent teams smarter, more robust, and incredibly user-friendly. Today's focus was a blend of crucial backend logic, UI/UX enhancements, and, of course, the inevitable debugging dance that every developer knows too well.

We're building a system that allows users to define complex, multi-step AI workflows, where each step can leverage a specialized "expert team" assembled on the fly by an LLM. This session was all about making those expert teams truly professional and providing a seamless experience for interacting with the LLM's outputs.

Evolving Our AI Agent Templates: Professionalism Over Personalization

One of the most significant decisions this session revolved around our LLM agent templates. Initially, we experimented with more creative, sometimes gendered, titles for our AI "experts." We quickly realized that while fun, this approach could introduce unintended biases and wasn't aligned with the professional, robust nature we envision for our system.

The Decision: Switch to entirely gender-neutral agents with standard professional titles.

Why it matters:

  • Neutrality & Professionalism: Ensures our AI system maintains a professional demeanor and avoids any implicit biases.
  • Scalability: Standardized titles make it easier to manage and extend our agent ecosystem.
  • User Preference: Feedback indicated a preference for clear, professional roles.

These template updates are happening in src/lib/constants.ts, ensuring consistency across our extensionPrompt, deepPrompt, and secPrompts. Now, every implementation prompt explicitly includes an "Assigns Expert(s)" field, guiding the LLM to assemble the right team for the job.
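
To make this concrete, here's a minimal sketch of the shape these templates take; the structure and wording below are illustrative, not a copy of our actual prompt text:

typescript
// src/lib/constants.ts (illustrative sketch, not the real prompt text)
// Every implementation prompt opens with expert assembly and carries an
// explicit "Assigns Expert(s)" field using neutral, professional titles.
export const extensionPrompt = `
## Step 0: Assemble the Expert Team
Assigns Expert(s): list the specialists for this task by professional title
(e.g. "Senior Backend Engineer", "Security Analyst"); no names, no gendered roles.

## Step 1: ...
`;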

Elevating the User Experience: Seeing is Believing (and Downloading!)

A major chunk of this session was dedicated to improving how users interact with the LLM's outputs. Raw JSON or plain text isn't ideal, especially for multi-step workflows.

1. Full Markdown Rendering for LLM Outputs: We've replaced our custom parsePromptSections() and PromptSectionCard components with a full inline MarkdownRenderer. This means every output from an LLM step is now beautifully formatted, making it much easier to read and understand complex responses.

typescript
// Before (conceptual, simplified for clarity)
// <PromptSectionCard content={parsePromptSections(output)} />

// After (conceptual, simplified)
import { MarkdownRenderer } from '@/components/MarkdownRenderer'; 
// ...
<MarkdownRenderer content={stepOutput.content} />

2. A Robust Output Toolbar: To complement the improved rendering, we've added a comprehensive toolbar to every completed step output. Users can now:

  • Download as .md: Get a clean markdown file of the output.
  • Copy: Quickly copy the content to their clipboard.
  • Edit: Modify the output if needed (for iterative refinement).
  • Retry: Rerun the specific step if the output isn't satisfactory.

This was enabled by a new downloadMarkdown(content, filename) helper, which makes it trivial to export generated content and significantly improves how easily users can iterate on workflow outputs.
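
The helper itself can be as small as the standard Blob-and-anchor trick; here's a minimal browser-side sketch (our actual implementation may differ in the details):

typescript
// Minimal sketch of downloadMarkdown; implementation details are illustrative.
export function downloadMarkdown(content: string, filename: string): void {
  // Wrap the markdown text in a Blob and trigger a download via a temporary link.
  const blob = new Blob([content], { type: 'text/markdown' });
  const url = URL.createObjectURL(blob);
  const anchor = document.createElement('a');
  anchor.href = url;
  anchor.download = filename.endsWith('.md') ? filename : `${filename}.md`;
  document.body.appendChild(anchor);
  anchor.click();
  anchor.remove();
  URL.revokeObjectURL(url); // release the object URL once the download has started
}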

3. Expert Team Generation - Verified! We put our new expert team generation logic to the test with a dedicated "Expert Team Test" workflow. The results were fantastic: the LLM successfully produced 5 domain experts perfectly matched to our project's real-time collaborative editor tech stack. This confirms our structured prompting for expert assignment is working as intended.

Navigating the Minefield: Lessons Learned from the "Pain Log"

Every development session has its share of head-scratching moments. Here’s what we learned the hard way, so you don't have to:

Lesson 1: Playwright Pathing Pitfalls

  • The Problem: Playwright tests were failing when run from a subdirectory like /tmp/.
  • The Attempt: Trying various relative paths or temporary workarounds.
  • The Solution: Playwright (and many Node.js tools) often expect to be run from the project root to correctly resolve node_modules and other project dependencies.
  • Takeaway: Always run your test suite from the project's root directory unless explicitly configured otherwise. It saves a lot of dependency resolution headaches.
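
For a visual anchor on why this bites, consider a typical config (simplified sketch):

typescript
// playwright.config.ts (simplified sketch)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Relative paths resolve against the config file, but `npx playwright test`
  // locates the locally installed binary by walking up node_modules from the
  // current working directory. Invoked from /tmp, that lookup never reaches
  // the project at all.
  testDir: './tests/e2e',
});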

Lesson 2: Environment Variable Gotchas: AUTH_SECRET vs. NEXTAUTH_SECRET

  • The Problem: Our authentication setup wasn't working, despite setting process.env.NEXTAUTH_SECRET.
  • The Attempt: Debugging auth library configurations.
  • The Solution: The environment variable required by our Auth.js (the successor to NextAuth.js) setup was actually AUTH_SECRET, not NEXTAUTH_SECRET; the latter is the older NextAuth.js v4 name.
  • Takeaway: Double-check the exact environment variable names required by your authentication library. A quick glance at the official documentation can save hours of debugging.
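
One cheap safeguard is a fail-fast check at startup. Auth.js v5 reads AUTH_SECRET by default, while NEXTAUTH_SECRET was the v4 name, so a guard like this sketch catches the mix-up immediately:

typescript
// Startup guard (sketch): fail fast if the secret is missing or misnamed.
if (!process.env.AUTH_SECRET) {
  throw new Error(
    process.env.NEXTAUTH_SECRET
      ? 'Found NEXTAUTH_SECRET, but Auth.js v5 expects AUTH_SECRET. Rename the variable.'
      : 'AUTH_SECRET is not set.',
  );
}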

Lesson 3: The Asynchronous Dance: Triggering Workflows with SSE

  • The Problem: We were calling a tRPC start mutation for a workflow, but it would get stuck in a "running" state without actually processing any steps.
  • The Attempt: Polling the workflow status, checking server logs.
  • The Solution: Our workflow engine is built using an AsyncGenerator. This means it only executes when its Server-Sent Events (SSE) endpoint (/api/v1/events/workflows/[id]) is actively being consumed by a client. Calling start merely initializes it; the actual work begins when a client connects to listen for events.
  • Takeaway: If your LLM workflow engine leverages AsyncGenerator and SSE, remember that the workflow won't genuinely progress until a client establishes an SSE connection to consume its events. Don't just start, listen.
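
To illustrate the mechanics (the names and handler below are a conceptual sketch, not our exact code): an async generator's body doesn't execute until something iterates it, and in this architecture that something is the SSE route handler.

typescript
// The generator only advances when the SSE route pulls events from it.
async function* runWorkflow(id: string) {
  // Real LLM calls happen here, lazily, one step per yield.
  yield { step: 1, status: 'running' };
  yield { step: 1, status: 'complete' };
}

// Conceptual handler for /api/v1/events/workflows/[id]
export async function GET(req: Request, { params }: { params: { id: string } }) {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      // Iterating the generator here is what actually drives the workflow.
      for await (const event of runWorkflow(params.id)) {
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`));
      }
      controller.close();
    },
  });
  return new Response(stream, {
    headers: { 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache' },
  });
}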

Lesson 4: Type Safety Tangles: tRPC Mutation Input

  • The Problem: Passing a raw string input to a tRPC create mutation resulted in a Zod validation error.
  • The Attempt: Assuming tRPC would implicitly handle string inputs.
  • The Solution: Our Zod schema for the mutation expected z.record(z.string()) – an object where keys and values are strings. The correct input format was { text: "..." }, not just "...".
  • Takeaway: Always match your tRPC mutation input to the exact Zod schema definition. If it expects an object, provide an object, even if it only contains one field.
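
Concretely, the mismatch looks like this:

typescript
import { z } from 'zod';

// The mutation's input schema expects string keys mapped to string values:
const createInput = z.record(z.string());

const bad = createInput.safeParse('just a string');            // bad.success === false
const good = createInput.safeParse({ text: 'just a string' }); // good.success === true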

Immediate Next Steps

With these refinements and lessons under our belt, our immediate focus shifts to:

  1. Finalizing the expert team templates to ensure full gender-neutrality and standard professional titles.
  2. Running a new end-to-end test workflow to verify these updated templates.
  3. Testing the "alternatives selection flow" (where generateCount: 3 produces multiple options) from start to finish.
  4. Updating our estimateWorkflowCost function to accurately account for the generateCount multiplier (see the sketch after this list).
  5. Cleaning up any stale workflows (like db096fa7) that might be lingering from our SSE debugging.
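
For step 4, the core change is simple arithmetic; here's a back-of-the-envelope sketch (the types and pricing fields are hypothetical, not our real signature):

typescript
// Hypothetical sketch of the generateCount-aware cost estimate.
interface StepEstimate {
  inputTokens: number;
  outputTokens: number;
  generateCount?: number; // alternatives requested for this step; defaults to 1
}

function estimateWorkflowCost(
  steps: StepEstimate[],
  costPerInputToken: number,
  costPerOutputToken: number,
): number {
  return steps.reduce((total, step) => {
    const n = step.generateCount ?? 1;
    // Each alternative is a separate generation call, so both the prompt
    // (input) and completion (output) costs scale with n.
    return (
      total +
      n * (step.inputTokens * costPerInputToken + step.outputTokens * costPerOutputToken)
    );
  }, 0);
}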

Conclusion

This session was a testament to the iterative nature of building complex AI systems. From refining our LLM agent personas to streamlining the user interface and battling tricky integration issues, every step brings us closer to a robust, intuitive, and powerful workflow automation platform. The journey continues, and we're excited to share more insights as we build it out!

json
{
  "thingsDone": [
    "Replaced prompt section parsing with full Markdown rendering for LLM outputs.",
    "Added .md download, copy, edit, and retry toolbar to all completed workflow steps.",
    "Integrated 'Step 0: Assemble the Expert Team' into all core prompt templates.",
    "Added 'Assigns Expert(s)' field to implementation prompts.",
    "Verified end-to-end expert team generation with a specific test workflow.",
    "Initiated switch from gendered/creative agent titles to gender-neutral/professional titles."
  ],
  "pains": [
    "Playwright failing due to incorrect working directory.",
    "Incorrect environment variable name for Auth.js (`NEXTAUTH_SECRET` vs `AUTH_SECRET`).",
    "Workflow engine (AsyncGenerator) not executing without an active SSE consumer.",
    "tRPC mutation input type mismatch (string vs. z.record(z.string())).",
    "Pre-existing TypeScript error in `discussions/[id]/page.tsx`."
  ],
  "successes": [
    "Full markdown rendering significantly improves UI/UX.",
    "Output toolbar provides critical user interaction/iteration capabilities.",
    "Expert team generation works as expected, producing domain-specific experts.",
    "Clear lessons learned from debugging critical integration points."
  ],
  "techStack": [
    "Next.js",
    "tRPC",
    "PostgreSQL",
    "LLMs (Large Language Models)",
    "SSE (Server-Sent Events)",
    "Zod",
    "Playwright",
    "TypeScript",
    "MarkdownRenderer",
    "Auth.js (NextAuth.js)"
  ]
}