Building Smarter AI Workflows: Expert Teams, Polished UI, and Key Learnings from Our Latest Dev Sprint
Dive into our latest development session where we refined our AI workflow engine, introduced dynamic expert teams, enhanced the user interface, and tackled critical technical challenges head-on.
In the fast-evolving world of AI-powered applications, building robust, intuitive, and truly intelligent systems is a continuous journey. Our recent development sprint focused on several critical areas of our workflow engine: enhancing the power of "expert teams," significantly upgrading our user interface, and ironing out some tricky architectural kinks.
This session was all about refinement and user experience, ensuring our AI workflows aren't just powerful, but also accessible and delightful to interact with.
Assembling the Dream Team: The Evolution of Expert Agents
One of the most exciting features we've been developing is the concept of "expert teams." Imagine not just one large language model tackling a problem, but a team of specialized AI agents, each contributing their unique expertise. This mirrors real-world collaboration, leading to more nuanced, accurate, and comprehensive outputs.
Our primary goal for this session was to finalize the expert team templates. This involved a crucial decision: switching from gendered to gender-neutral agents with standard professional titles. Why? Because inclusivity and broad applicability are paramount. While creative titles can be fun, standard professional roles ensure clarity and consistency, allowing the LLM to better understand the expected contribution of each "expert." This change reflects our commitment to building universally effective and unbiased AI tools.
We integrated "Step 0: Assemble the Expert Team" into our core prompt templates (`extensionPrompt`, `deepPrompt`, `secPrompts`), and added an "Assigns Expert(s)" field to each implementation prompt. The end-to-end verification was a success: we ran an "Expert Team Test" workflow, and the LLM successfully generated a team of five domain experts perfectly matched to a real-time collaboration editor tech stack. This confirms the core mechanic is solid!
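Mechanically, the integration boils down to prepending the "Step 0" instructions to each base template. Here's a minimal sketch of that idea; the wording of `step0` and the `withExpertTeam` helper are illustrative assumptions, not the actual template contents:

```typescript
// Illustrative only: the real extensionPrompt/deepPrompt/secPrompts
// contents are not shown in this post.
const step0 = [
  "Step 0: Assemble the Expert Team.",
  "Identify 3-5 domain experts (gender-neutral, standard professional titles)",
  "suited to the project's tech stack, and state what each contributes.",
].join("\n");

/** Prepend the expert-team step to an existing prompt template. */
function withExpertTeam(basePrompt: string): string {
  return `${step0}\n\n${basePrompt}`;
}

const prompt = withExpertTeam("Implement the real-time editor feature.");
console.log(prompt.split("\n")[0]); // "Step 0: Assemble the Expert Team."
```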
A Polished Experience: Elevating the Workflow UI
Beyond the AI's internal intelligence, the user's interaction with the system is vital. We made significant strides in improving the workflow UI:
- Full Markdown Rendering: We replaced our `PromptSectionCard` component with a full inline `MarkdownRenderer`. This means users now see the AI's output exactly as intended, with rich formatting, code blocks, and lists, making the information far more digestible and professional.
- Enhanced Output Toolbar: For every completed step output, we added a toolbar featuring `.md` download, `Copy`, `Edit`, and `Retry` options. This gives users fine-grained control over the workflow's progression and outputs, enabling easy iteration, saving, and integration into other tools. To facilitate this, we introduced a `downloadMarkdown(content, filename)` helper.
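A helper like `downloadMarkdown` typically wraps the standard Blob-plus-anchor download pattern. The sketch below is an assumption about its shape, not our exact implementation; the filename handling is split out so it can be tested outside a browser:

```typescript
// Hypothetical sketch of a downloadMarkdown(content, filename) helper.

/** Append a .md extension if the filename does not already have one. */
function withMarkdownExtension(filename: string): string {
  return /\.md$/i.test(filename) ? filename : `${filename}.md`;
}

/** Trigger a client-side download of markdown content (browser only). */
function downloadMarkdown(content: string, filename: string): void {
  const blob = new Blob([content], { type: "text/markdown" });
  const url = URL.createObjectURL(blob);
  const anchor = document.createElement("a");
  anchor.href = url;
  anchor.download = withMarkdownExtension(filename);
  anchor.click();
  URL.revokeObjectURL(url); // release the object URL once the download starts
}
```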
These UI enhancements are small but mighty, drastically improving the usability and flexibility of our workflow engine.
Navigating the Development Labyrinth: Lessons Learned
No development sprint is without its challenges. Overcoming these hurdles often provides the most valuable insights. Here are a few "gotchas" and lessons we picked up:
- Playwright Environment Setup: We learned the hard way that Playwright tests, particularly when dealing with `node_modules`, must be run from the project root. Attempting to execute them from a temporary directory (`/tmp/`) leads to dependency resolution issues. A simple but critical reminder about environment contexts!
- Environment Variable Naming: A classic developer pitfall! We initially tried to use `process.env.NEXTAUTH_SECRET` for our authentication secret, only to discover the correct environment variable name was `AUTH_SECRET`. Always double-check library-specific environment variable conventions.
- Asynchronous Workflow Engine Architecture: This was a significant learning curve. We initially tried to start a workflow via a tRPC `start` mutation and then poll for its status. However, our workflow engine, being an `AsyncGenerator`, only truly runs when its Server-Sent Events (SSE) endpoint (`/api/v1/events/workflows/[id]`) is actively consumed. The `start` mutation merely initializes it; the SSE connection drives its execution. This architectural detail is crucial for anyone building real-time, event-driven systems.
- tRPC Input Type Mismatch: When passing input to a tRPC `create` mutation, we initially tried a plain string. Zod, however, expected `z.record(z.string())`. The solution was to pass the input as an object: `{ text: "..." }`. This highlights the importance of adhering to precise type definitions, especially with robust schema validation libraries like Zod.
These challenges, though frustrating in the moment, have strengthened our understanding of the system and will lead to more resilient development practices.
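The generator-driven execution model is the subtlest of these lessons, so here is a self-contained sketch of it (all names hypothetical): creating the generator does nothing, and the body only advances while a consumer, like our SSE endpoint, pulls events from it.

```typescript
type StepEvent = { step: number; status: string };

let bodyRan = false;

async function* runWorkflow(): AsyncGenerator<StepEvent> {
  // Nothing in this body executes until the first .next() call.
  bodyRan = true;
  yield { step: 1, status: "planning" };
  yield { step: 2, status: "implementing" };
}

async function demo(): Promise<StepEvent[]> {
  const workflow = runWorkflow(); // like the tRPC start mutation: initialized, not running
  console.log("ran after init?", bodyRan); // false
  const events: StepEvent[] = [];
  // The SSE connection plays this consumer role in our engine.
  for await (const event of workflow) {
    events.push(event);
  }
  console.log("ran after consumption?", bodyRan); // true
  return events;
}
```

Polling a status store after `start` therefore shows nothing, because no consumer has driven the generator forward yet.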
What's Next on Our Roadmap?
With the core expert team logic refined and the UI significantly improved, our immediate next steps include:
- Template Verification: Running new test workflows to ensure our gender-neutral, professional expert team templates are working perfectly.
- Alternatives Selection Flow: Testing the end-to-end flow for generating multiple alternatives (`generateCount: 3`) to provide users with more choices.
- Cost Estimation Update: Updating our `estimateWorkflowCost` function to accurately account for the `generateCount` multiplier.
- Cleanup: Addressing any lingering issues, including a stale workflow instance (`db096fa7`) that was left running without an SSE consumer.
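For the cost estimation update, the core change is multiplying each step's token estimate by its alternative count. The field names below (`estimatedTokens`, `generateCount`) are illustrative guesses, not the engine's actual API:

```typescript
// Assumed shape; the real estimateWorkflowCost may differ.
interface WorkflowStep {
  estimatedTokens: number;
  /** Number of alternatives generated for this step; defaults to 1. */
  generateCount?: number;
}

function estimateWorkflowCost(steps: WorkflowStep[], costPerToken: number): number {
  return steps.reduce(
    (total, step) =>
      total + step.estimatedTokens * (step.generateCount ?? 1) * costPerToken,
    0,
  );
}

const cost = estimateWorkflowCost(
  [{ estimatedTokens: 1000 }, { estimatedTokens: 500, generateCount: 3 }],
  0.002,
);
console.log(cost); // 1000*1*0.002 + 500*3*0.002 = 5
```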
This session marked a significant leap forward in making our AI workflow engine more intelligent, user-friendly, and robust. We're excited about the possibilities these enhancements unlock and look forward to sharing more updates soon!