Taming the Monolith: Architecting Finer-Grained LLM Workflows with Fan-Out
We tackled two critical issues in our AI-driven workflow engine: how to break down monolithic LLM outputs into granular, actionable steps, and how to ensure the AI targets the right project context. Dive into our fan-out solution and prompt engineering insights.
It was late, the kind of late where the lines of code start to blur but the satisfaction of a deep dive keeps you going. The goal for this session was ambitious: wrestle our workflow engine into generating truly actionable, specific implementation prompts from a high-level plan. We'd been running into a wall where our AI, for all its brilliance, was producing single, monolithic outputs for complex, multi-point tasks, often targeting the wrong codebase entirely.
The Journey So Far: Context and Recent Wins
Our system is designed to take a set of action points, analyze them, synthesize a plan, and then generate an "implementation prompt" – essentially, a detailed request for an LLM to write code or perform a specific task based on that plan. We recently rolled out some infrastructure updates, including migrating our UI to a ProviderModelPicker (commit a25f6b5), which was a smooth win, proving our underlying architecture could handle significant changes.
We then ran a test workflow (b29285b4-401b-4f50-a1d6-e739ca89b1ef) with 13 steps, including 10 LLM calls and a human review. It completed successfully, but a deep dive into its output revealed some critical flaws in how the final implementation prompt was being generated.
The Core Problem: A Monolithic Monster
Imagine you've given an AI a list of 10 distinct tasks, each requiring specific code changes. You expect 10 separate, detailed instructions for each. What we got instead was a single, sprawling implementation prompt. And to make matters worse, the LLM, when faced with this giant blob, arbitrarily picked one of the 10 action points (in this case, #8, "NLI Metric") and ignored the other nine.
Pain Point 1: One Prompt to Rule Them All (and fail)
- Problem: Our group workflow, with 10 action points, produced a single implementation prompt. The LLM then focused on only one action point.
- Root Cause: The workflow-engine.ts logic (specifically lines 2482-2607) was designed to append a single implementation prompt step. This works perfectly for single-item workflows but breaks down for groups.
Our existing system already had a powerful mechanism called fanOutConfig, relying on section-splitter.ts to break down large text blobs based on a regex pattern. This was our "aha!" moment. The solution wasn't to reinvent the wheel, but to apply an existing pattern: fan-out the implementation prompt generation. Instead of one monolithic prompt, we'd split the synthesis output by action point section and generate one implementation prompt per section. This means our workflow-engine.ts needs to be smarter about group workflows and leverage fanOutConfig to point to the Synthesis step.
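To make the plan concrete, here's a minimal sketch of what that fan-out step might look like. The fanOutConfig field names, the step ids, and the section-heading regex are all assumptions for illustration; only the overall mechanism (split the Synthesis output via section-splitter.ts and instantiate one step per section) comes from the system described above.

```typescript
// Hypothetical sketch: what the fan-out step appended by workflow-engine.ts
// for group workflows might look like. Field names (sourceStep, splitPattern)
// are illustrative, not the actual fanOutConfig schema.

interface FanOutConfig {
  sourceStep: string;   // step whose output gets split
  splitPattern: RegExp; // handed to section-splitter.ts
}

interface WorkflowStep {
  id: string;
  kind: "llm" | "human-review";
  promptTemplate: string;
  fanOutConfig?: FanOutConfig;
}

// One implementation-prompt step per action point section, instead of a
// single monolithic step: the engine splits the Synthesis output and
// instantiates this step once per matched section.
const implementationPromptStep: WorkflowStep = {
  id: "implementation-prompt",
  kind: "llm",
  promptTemplate:
    "Write a detailed implementation prompt for the following action point:\n\n{{section}}",
  fanOutConfig: {
    sourceStep: "synthesis",
    // Assumed section delimiter: headings like "### Action Point 3: ..."
    splitPattern: /^###\s*Action Point\s+\d+/m,
  },
};
```

The design choice here is deliberate reuse: because section-splitter.ts already turns one blob into N sections, the engine only needs to branch on "group vs. single" when appending the step, not grow a new splitting mechanism.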
The Language Barrier: Speaking Go When We Meant Python
As if the monolithic output wasn't enough, the prompt was also generating code in the wrong language. Our action points explicitly referenced Python files like ipcha/score.py and ipcha/sanitize.py. Yet, the implementation prompt was producing Go code, complete with references to internal/audit/ and cmd/ckb/ – patterns from our internal CodeMCP Go project.
Pain Point 2: The Project Context Conundrum
- Problem: Action points implied Python, but the LLM generated Go code.
- Root Cause: Our project.wisdom (a context injection mechanism) was providing strong Go-specific patterns. The LLM, given conflicting signals, prioritized the stronger, more explicit code patterns over the implicit language in the action point descriptions.
The Fix: We need to teach the LLM to derive its target language and stack directly from the action point's file references. This means modifying implementation-prompt-generator.ts to scan for .py, .go, .ts (and similar) file extensions in the action point descriptions and inject an explicit target-context instruction into the system prompt. "Generate code for a Python project," not just "generate code."
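A minimal sketch of that detection logic follows, assuming nothing about the real implementation-prompt-generator.ts beyond what's described above; the extension map, function names, and instruction wording are illustrative:

```typescript
// Illustrative sketch of the planned change to implementation-prompt-generator.ts:
// infer the target stack from file extensions mentioned in the action point
// and inject an explicit target-context instruction into the system prompt.

const EXTENSION_TO_LANGUAGE: Record<string, string> = {
  ".py": "Python",
  ".go": "Go",
  ".ts": "TypeScript",
  ".tsx": "TypeScript",
  ".rs": "Rust",
};

function detectTargetLanguages(actionPointText: string): string[] {
  // Pull anything that looks like a file extension out of the description.
  const matches = actionPointText.match(/\.[a-z]+\b/g) ?? [];
  const languages = matches
    .map((ext) => EXTENSION_TO_LANGUAGE[ext])
    .filter((lang): lang is string => Boolean(lang));
  return [...new Set(languages)];
}

function buildTargetContextInstruction(actionPointText: string): string {
  const languages = detectTargetLanguages(actionPointText);
  if (languages.length === 0) return "";
  // An explicit instruction outranks the Go patterns injected by project.wisdom.
  return (
    `Target stack: generate code for a ${languages.join(" and ")} project. ` +
    `Ignore code patterns from other languages in the surrounding context.`
  );
}

// detectTargetLanguages("Update ipcha/score.py and ipcha/sanitize.py")
//   -> ["Python"]
```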
Workflow Ironies: Merges, Dependencies, and Pragmatism
Beyond these critical issues, a deeper look at the workflow revealed some interesting quirks and opportunities for refinement.
Pain Point 3: Analysis Says Merge, Steps Say Separate
- Problem: Our Group Analysis correctly identified that action points #4 and #7 should be merged, but the per-item plans (Steps 4 and 7) were still generated separately.
- Root Cause: The group-prompt-builder.ts was generating one step per action point, irrespective of the analysis output.
- Impact: Redundant token usage and potentially conflicting plans, though the Synthesis step did a good job of correcting and merging them.
- Lesson Learned: By fanning out from the synthesis step (which already handles merges), we auto-resolve this issue. The synthesis output is the authoritative, merged plan.
Pain Point 4: Dependency Ordering Ignored
- Problem: Group Analysis clearly stated "action point #8 before #1," but the steps executed sequentially 1 through 10.
- Impact: Low. Because per-item plans are largely independent documents and the Synthesis step ultimately handles the correct ordering, this was deemed an acceptable "quirk." We'll accept this for now; synthesis is our authoritative ordering.
Pain Point 5: Overestimated Resources
- Problem: The Synthesis step estimated 22-26 engineer-weeks for what looked like a solo researcher's TODO list.
- Impact: Cosmetic. It doesn't affect the quality of the implementation prompt or the generated code.
- Lesson Learned: This is likely an issue of persona tuning within the LLM. While not critical, it's something to keep in mind for future prompt refinement.
The Path Forward: Immediate Next Steps
The session concluded with a clear roadmap. We know exactly where to apply our efforts:
- Refactor Implementation Prompt Generation: Modify src/server/services/workflow-engine.ts to create a fan-out implementation prompt step for group workflows. This step will leverage fanOutConfig to point to the Synthesis step and split its output by action point sections.
- Smart Language Detection: Update src/server/services/implementation-prompt-generator.ts to detect target languages (e.g., .py, .go, .ts) from action point descriptions and inject explicit language context into the system prompt.
- Validate: Create and test a new group workflow to confirm that fan-out produces N distinct implementation prompts, each targeting the correct language (see the test sketch after this list).
- Cleanup: Remove the deprecated discussions.availableProviders tRPC procedure.
- Housekeeping: Address a minor TypeScript diagnostic (_ExpandedPreview is declared but never read).
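For the validation step, a test along these lines would pin down the expected behavior. This is a hypothetical vitest-style sketch; runWorkflow, tenPythonActionPoints, and the step/result shapes are stand-ins for whatever the real test harness exposes:

```typescript
import { describe, expect, it } from "vitest";

// Assumed shapes and helpers; the real engine's API will differ.
type StepResult = { id: string; output: string };
declare function runWorkflow(input: {
  actionPoints: string[];
}): Promise<{ steps: StepResult[] }>;

// Ten action points that all reference Python files (e.g., ipcha/score.py).
declare const tenPythonActionPoints: string[];

describe("group workflow implementation-prompt fan-out", () => {
  it("produces one implementation prompt per action point", async () => {
    const result = await runWorkflow({ actionPoints: tenPythonActionPoints });

    const prompts = result.steps.filter((s) =>
      s.id.startsWith("implementation-prompt"),
    );
    expect(prompts).toHaveLength(10);

    // Each prompt should target the language implied by its action point,
    // not the Go patterns injected by project.wisdom.
    for (const prompt of prompts) {
      expect(prompt.output).toMatch(/Python/i);
      expect(prompt.output).not.toMatch(/internal\/audit\//);
    }
  });
});
```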
This late-night deep dive was incredibly fruitful. We've identified critical architectural gaps in our LLM workflow, understood their root causes, and, most importantly, mapped out precise, actionable solutions. Leveraging existing fan-out mechanisms and injecting explicit context into our prompts will be key to unlocking truly granular and accurate AI-driven development. The journey to a perfectly orchestrated AI workflow continues!
{"thingsDone":["Migrated ProviderModelPicker","Verified workflow completion","Performed deep consistency analysis of workflow input->steps->output","Explored full implementation prompt generation pipeline and fan-out mechanism"],"pains":["Implementation prompt covered only 1/10 action points","Implementation prompt targeted wrong project language","Group Analysis merges not enforced in steps","Dependency ordering ignored in step execution","Resource estimates overdimensioned"],"successes":["Identified critical issues and root causes","Mapped out precise solutions leveraging existing fan-out infrastructure","Gained full understanding of implementation prompt generation pipeline","Confirmed Synthesis step corrects some issues (merges, dependencies)"],"techStack":["TypeScript","Node.js","Prisma","LLM (Gemini-2.5-pro, Claude-Sonnet-4-6)","Workflow Automation","Prompt Engineering"]}