Building Smarter, More Inclusive AI Workflows: A Deep Dive into Expert Teams
We've reached a significant milestone in refining our AI workflow engine, focusing on creating more professional, gender-neutral 'expert teams' and ensuring robust end-to-end execution. Discover how we're making AI collaboration more precise and inclusive.
In the rapidly evolving landscape of AI-powered development, creating systems that are not only powerful but also precise, robust, and inclusive is paramount. Our latest development sprint centered on a crucial aspect of our AI workflow engine: the "expert teams" that guide our Large Language Models (LLMs) through complex tasks. This session marked the culmination of several weeks of focused effort, bringing us to a complete and verified solution.
The core idea behind our "expert teams" is to simulate a collaborative environment for the LLM. Instead of a single, monolithic prompt, we define a team of specialized "agents" with distinct roles and expertise. The LLM then leverages this team, assigning parts of a problem to the most relevant experts, leading to more structured, accurate, and context-aware outputs.
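To make the idea concrete, here is a minimal sketch of what an expert team definition and task routing could look like. The `Expert` interface, the `focus` field, and the `assignExpert` helper are illustrative assumptions, not the project's actual types; only the names and roles come from our templates.

```typescript
// Hypothetical sketch of an "expert team" definition; types and helper
// names are illustrative, not the project's actual code.
interface Expert {
  name: string;  // gender-neutral name
  role: string;  // standard professional role
  focus: string; // the slice of the problem this expert owns
}

const exampleTeam: Expert[] = [
  { name: "Alex Chen", role: "Senior Plugin Architect", focus: "plugin architecture and extension points" },
  { name: "Jordan Rivera", role: "API Integration Lead", focus: "external API integration" },
  { name: "Sam Nakamura", role: "Test Automation Engineer", focus: "test automation and verification" },
];

// Route a sub-task to the first expert whose focus mentions a keyword.
function assignExpert(team: Expert[], keyword: string): Expert | undefined {
  const k = keyword.toLowerCase();
  return team.find((e) => e.focus.toLowerCase().includes(k));
}
```

In practice the LLM does this routing itself from the prompt text; the sketch just shows the shape of the information each "team member" carries.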
The Evolution of Our Expert Teams: Precision Meets Inclusivity
Our primary goal for this final session was to refine these expert team templates, moving towards a more professional and inclusive standard. Previously, our templates included some creative, female-coded titles. While imaginative, we recognized the importance of promoting gender neutrality and standard professional roles to enhance clarity, reduce potential biases, and better align with real-world project teams.
Here's how we transformed our expert team definitions:
- **Gender-Neutrality and Standard Professional Roles:** We meticulously updated the expert examples in `src/lib/constants.ts`. Instead of creative titles, we now feature a diverse set of gender-neutral names paired with standard, recognizable professional roles. This not only fosters inclusivity but also provides clearer guidance to the LLM on the specific expertise each "team member" brings. For instance, in our `extensionPrompt` template, you'll now find:

  ```typescript
  // src/lib/constants.ts (excerpt from extensionPrompt examples)
  // ...
  // - Alex Chen — Senior Plugin Architect
  // - Jordan Rivera — API Integration Lead
  // - Sam Nakamura — Test Automation Engineer
  // ...
  ```

  Similarly, our `deepPrompt` and `secPrompts` (for security-focused tasks) received similar updates:

  ```typescript
  // src/lib/constants.ts (excerpt from deepPrompt examples)
  // ...
  // - Taylor Kim — Senior Full-Stack Engineer
  // - Robin Andersen — Database Architect
  // - Jamie Okafor — UX Engineer
  // - Quinn Reyes — DevOps Lead
  // ...

  // src/lib/constants.ts (excerpt from secPrompts examples)
  // ...
  // - Morgan Lee — Application Security Engineer
  // - Riley Tanaka — Cryptography Specialist
  // - Casey Okoye — Auth & IAM Lead
  // ...
  ```

- **Refined Instructions:** We updated the field label from `**Name & Title**` to `**Name & Role**` across all templates, reinforcing the focus on functional expertise. Crucially, we removed the instruction "Give each expert a creative, unconventional title" from all templates. This subtle but significant change ensures the LLM adheres to the professional role guidelines we've established.
End-to-End Verification: Putting the New System to the Test
With the templates updated, the next critical step was to verify everything end-to-end with a new workflow. We ran a dedicated "Expert Team v2" workflow (ID 8cf4402b-e450-4624-a5d2-fb7c33ad1c79) designed for a Kubernetes CLI project.
The results were precisely what we aimed for:
- The LLM successfully generated a diverse and relevant expert team for the Kubernetes CLI project, including roles like Taylor Kim (Go/CLI), Jordan Chen (K8s Platform), Alex Rivera (TUI/Systems), Sam Okafor (AI Integration), and Morgan Liu (DevOps Reliability).
- Each prompt within the workflow was properly assigned to the relevant expert(s), demonstrating the system's ability to interpret context and distribute tasks effectively based on the new, refined roles.
- The entire 3-step workflow completed successfully in just 208.2 seconds, confirming the efficiency of our updated engine.
This successful run, alongside another "Expert Team Test" workflow, validated that our changes not only improved the quality and inclusivity of our expert team definitions but also seamlessly integrated into our existing workflow execution engine.
Broader Enhancements: A Holistic Improvement
This final session also capped off a series of broader improvements across the application, enhancing the overall developer and user experience:
- **Streamlined Output Display:** We replaced our previous `parsePromptSections()` and `PromptSectionCard` components with a full inline `MarkdownRenderer`. Now, all completed step outputs are beautifully rendered in markdown, making them much easier to read and digest.
- **Enhanced Output Control:** A new toolbar for all completed step outputs provides convenient options: `downloadMarkdown()`, `Copy`, `Edit`, and `Retry`. This significantly improves the usability of our workflow results.
- **Explicit Expert Team Assembly:** To ensure the LLM always correctly initializes the expert team, we added "Step 0: Assemble the Expert Team" to our `extensionPrompt`, `deepPrompt`, and `secPrompts`. This explicit instruction at the beginning of each workflow guides the LLM to set up its internal "team" before tackling the core problem.
- **Consistent System Prompts:** All three `systemPrompt` fields were updated to consistently mention expert team assembly, reinforcing this crucial first step.
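Mechanically, the Step 0 change amounts to prepending a fixed preamble to each prompt template. A rough sketch (the `STEP_ZERO` constant and `withExpertTeamStep` helper are hypothetical names, not the actual template code in `src/lib/constants.ts`):

```typescript
// Illustrative sketch only: the real templates live in src/lib/constants.ts;
// STEP_ZERO and withExpertTeamStep are hypothetical names.
const STEP_ZERO = [
  "Step 0: Assemble the Expert Team",
  "Before tackling the problem, list your experts (Name & Role)",
  "and state which part of the task each expert will own.",
].join("\n");

function withExpertTeamStep(template: string): string {
  return `${STEP_ZERO}\n\n${template}`;
}

// e.g. const extensionPrompt = withExpertTeamStep(baseExtensionPrompt);
```

Keeping the preamble in one place means all three templates stay consistent when the wording of Step 0 evolves.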
Lessons Learned: Navigating the Development Path
Development sessions are rarely without their challenges. Overcoming these "pains" provides invaluable lessons that strengthen our understanding and our codebase. Here are some key insights from this sprint:
- **Playwright Execution Context:** We initially tried running Playwright tests from a `/tmp/` directory, only to find that Playwright must be executed from the project root to correctly resolve `node_modules`. A classic pathing gotcha!
- **Environment Variable Nuances:** A small but significant detail: our project uses `AUTH_SECRET` for authentication, not `NEXTAUTH_SECRET`. Double-checking environment variable names specific to the project's configuration is always a good practice.
- **Asynchronous Workflow Execution with SSE:** Early attempts to poll for workflow status after a `start` mutation proved inefficient. We quickly realized that our engine leverages an `AsyncGenerator`, meaning the correct approach is to consume the Server-Sent Events (SSE) endpoint (`/api/v1/events/workflows/[id]`) to drive and monitor execution in real time. This provides a much more responsive and efficient user experience.
- **Zod Schema Validation for API Inputs:** When creating a mutation, we initially passed `input` as a simple string. However, our Zod schema expected `z.record(z.string())`, requiring the input to be an object like `{ text: "..." }`. Adhering strictly to API schema definitions is vital for robust data handling.
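To illustrate the SSE lesson, here's a minimal sketch of consuming the events endpoint with `fetch` plus a tiny `data:` line parser. Only the endpoint path is from our engine; the helper names and payload handling are assumptions, and a production consumer would also buffer partial lines across chunk boundaries.

```typescript
// Hypothetical sketch: consume the workflow SSE endpoint instead of polling.
// Only the endpoint path is real; parseSseData and watchWorkflow are
// illustrative assumptions.
function parseSseData(chunk: string): string[] {
  return chunk
    .split("\n")
    .filter((line) => line.startsWith("data:"))
    .map((line) => line.slice("data:".length).trim());
}

async function watchWorkflow(id: string, onEvent: (data: string) => void): Promise<void> {
  const res = await fetch(`/api/v1/events/workflows/${id}`);
  if (!res.body) throw new Error("no SSE body");
  const reader = res.body.pipeThrough(new TextDecoderStream()).getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    // NOTE: a real implementation buffers partial lines between chunks.
    for (const data of parseSseData(value)) onEvent(data);
  }
}
```

Driving the UI from these events, rather than polling a status endpoint, is what lets step outputs stream in as the `AsyncGenerator` yields them.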
It's also worth noting a pre-existing TypeScript error in `discussions/[id]/page.tsx:139` regarding a badge variant (`"outline"` not assignable). While not introduced by our changes, it's on the radar for future cleanup.
The Road Ahead
With these significant improvements deployed, our journey continues. We've identified several immediate next steps to further enhance the system:
- Clean up any stale workflows that might be stuck in a running state.
- Test the alternatives selection flow end-to-end, ensuring users can generate multiple solutions and select the best one.
- Update our `estimateWorkflowCost` to accurately account for the `generateCount` multiplier, providing transparent cost estimates.
- Test prompt editing on a pending workflow, empowering users to refine their instructions before execution.
- Consider implementing a Table of Contents (TOC) or navigation aid for extremely long implementation prompt outputs (10k+ tokens), improving readability and scannability.
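For the `estimateWorkflowCost` item above, the intended fix is roughly this shape. The field names, rates, and billing model here are illustrative assumptions, not the project's actual implementation:

```typescript
// Hypothetical sketch of folding a generateCount multiplier into a cost
// estimate; field names and the billing model are assumptions.
interface CostInput {
  promptTokens: number;
  estimatedOutputTokens: number;
  generateCount: number; // number of alternative generations requested
}

function estimateWorkflowCost(
  { promptTokens, estimatedOutputTokens, generateCount }: CostInput,
  inputRatePer1k: number,
  outputRatePer1k: number,
): number {
  // Each alternative is a separate generation call, so both the prompt
  // and the output are billed once per alternative in this sketch.
  const perRun =
    (promptTokens * inputRatePer1k + estimatedOutputTokens * outputRatePer1k) / 1000;
  return perRun * generateCount;
}
```

The key point is simply that the estimate scales with `generateCount` instead of quoting the single-generation cost.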
Conclusion
This development session marks a significant step forward in our mission to build sophisticated, user-friendly, and ethically responsible AI development tools. By refining our expert team templates for gender neutrality and professional roles, we're not only enhancing the precision of our LLM outputs but also fostering a more inclusive environment. Coupled with robust workflow verification and continuous improvements to the user experience, we're excited about the future of AI-powered collaboration.