nyxcore-systems

From Silent Errors to Slashed Tokens: A Workflow Engine Compression Saga

Battling rising AI token costs and sluggish workflow performance? We dove deep into our workflow engine to debug and implement step digest compression, transforming verbose history into concise context.

AI · LLM · Debugging · Workflow Engine · TypeScript · Prisma · NextAuth · Optimization

The world of AI-powered applications is exciting, but it comes with its own set of challenges. One of the biggest? Managing the ever-growing context window for Large Language Models (LLMs) and, by extension, the escalating token costs. Our workflow engine, designed to automate complex development tasks, was starting to feel the pinch.

Our solution on paper was elegant: step digest compression. The idea is simple – instead of feeding the entire verbose output of every previous workflow step into the next prompt, we generate a concise "digest" (a summary) for each completed step. This drastically reduces the prompt size, saves tokens, and speeds up inference.
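Conceptually, the chain-context builder prefers a step's digest and falls back to the full output only when no digest exists. A minimal sketch of that idea (the types and signature here are illustrative, not our engine's actual API):

```typescript
// Illustrative step shape; the real records carry more fields.
interface StepRecord {
  name: string;
  output: string;        // full, verbose step output
  digest: string | null; // concise summary, or null if not yet generated
}

// Assemble prompt context, preferring digests over full outputs.
function buildChainContext(steps: StepRecord[]): string {
  return steps
    .map((s) => `## ${s.name}\n${s.digest ?? s.output}`)
    .join("\n\n");
}
```

The fallback is what made the bug invisible at first: when digests are missing, the prompt still builds, it's just several times larger than it needs to be.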

The only problem? It wasn't working. Digests weren't being generated, leading to prompts that were far larger than they needed to be. This past session was all about diving in, fixing the silent failures, and proving end-to-end compression works.

Unmasking the Silent Killers: Where Did Our Digests Go?

The first clue that something was amiss was the sheer size of our prompts. If digests were working, we should have seen a significant reduction. A quick inspection revealed that the digest column in our database was suspiciously empty for most completed steps.

The initial investigation led us to src/server/services/step-digest.ts. It turned out we had a classic case of silent error swallowing. A try...catch block was present, which is good, but the catch block was completely empty. Any failure during digest generation (which typically involves another LLM call) was simply disappearing into the ether, leaving no trace.

typescript
// Before: A silent killer
try {
  // ... digest generation logic ...
} catch (error) {
  // Nothing here! 👻
}

// After: Bringing errors to light
try {
  // ... digest generation logic ...
} catch (error) {
  console.error("Failed to generate step digest:", error);
  // Optional: Add telemetry or more robust error handling
}

Adding a simple console.error immediately brought the underlying Haiku API failures to light. With visibility restored, we could start addressing the actual issues.

Plugging the Gaps: Ensuring All Paths Lead to Compression

Once the errors were visible, it became clear that digest generation was being bypassed in several critical paths within workflow-engine.ts:

  1. Backfilling on Resume: What happens if a workflow is paused and resumed, and some steps completed before our digest fix? They'd still have missing digests. We introduced a backfill loop right after buildChainContext():

    typescript
    // src/server/services/workflow-engine.ts (~line 585)
    // ... after building initial chain context ...
    
    // Backfill digests for any completed steps that might be missing one
    for (const step of workflow.steps) {
      if (step.status === 'COMPLETED' && !step.digest) {
        await generateAndSaveStepDigest(step.id); // Our new helper
      }
    }
    
    // ... continue with next step execution ...
    

    This ensures that even legacy steps get their digests generated before a new prompt is constructed.
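For reference, the helper's shape can be sketched like this, with the Haiku call and the database write injected so the control flow is visible. The real `generateAndSaveStepDigest` takes only a step ID and calls Haiku and Prisma directly; the parameter names below are ours, for illustration:

```typescript
type Summarize = (text: string) => Promise<string>;
type Persist = (stepId: string, digest: string) => Promise<void>;

// Sketch: generate a digest for one step's output and save it.
async function generateAndSaveStepDigest(
  stepId: string,
  output: string,
  summarize: Summarize, // e.g. a Haiku summarization call
  persist: Persist,     // e.g. a Prisma update on the digest column
): Promise<string | null> {
  try {
    const digest = await summarize(output);
    await persist(stepId, digest);
    return digest;
  } catch (error) {
    // The fix from earlier: never swallow failures silently.
    console.error("Failed to generate step digest:", error);
    return null;
  }
}
```

Returning `null` instead of throwing keeps the backfill loop resilient: one failed digest shouldn't block the whole workflow from resuming.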

  2. The Alternatives Selection Path: Our engine allows for multiple alternative outputs for certain steps (generateCount > 1). We discovered that steps taking this path were completely bypassing digest generation. This was a significant oversight, as these steps often produce substantial output. We integrated digest generation into this selection logic.

    typescript
    // src/server/services/workflow-engine.ts (~line 652-668)
    if (step.generateCount > 1 && selectedAlternativeIndex !== undefined) {
      // ... logic to select an alternative ...
      // Now, generate digest for the selected alternative
      await generateAndSaveStepDigest(step.id);
    }
    
  3. General Completion Path: Even in the "normal" completion path, the digest generation's catch block was silent. We added console.error there too, just to be safe.

The Payoff: Tangible Token Savings

With all the fixes in place, it was time for the ultimate test: a full end-to-end workflow run with prompt size measurements. The results were incredibly satisfying:

  • Backfilled Digests: We manually ran a temporary script to backfill digests for two existing steps:
    • Analyze Target Repo: Reduced from 8KB to 4KB
    • Design Features: Reduced from 9KB to 4KB
  • Full Workflow Run (4e369e42):
    • Design Features prompt: 15.4KB (This step used a compressed Analyze Target Repo digest, which went from 10.7KB full output down to 3.6KB digest).
    • Review prompt: 3.7KB (Already small, but benefiting from earlier compression).
    • Extend & Improve prompt: 15.4KB
    • Implementation Prompts: 32.2KB (This step specifically requested the .full output for Extend & Improve at 22.9KB, but still benefited from digests for earlier steps, keeping the overall context manageable).

The numbers speak for themselves. We're effectively halving (or more!) the size of many prompt inputs, directly translating to lower API costs and faster response times.
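For anyone reproducing these measurements: the figures above are just UTF-8 byte counts in KB. Our instrumentation was equivalent to a helper along these lines (not necessarily this exact code):

```typescript
// Size of a prompt string in KB (UTF-8 bytes), rounded to one decimal.
function promptSizeKB(prompt: string): number {
  return Math.round((Buffer.byteLength(prompt, "utf8") / 1024) * 10) / 10;
}
```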

Lessons from the Trenches: Debugging and Workarounds

No debugging session is complete without hitting a few snags. Here's what we learned along the way:

Prisma's db execute vs. Raw SQL

Problem: We initially tried using npx prisma db execute to query workflow steps directly. However, it failed to recognize table names. It seems Prisma's db execute expects model names, not actual table names, and the mapping wasn't consistent in our case.

Workaround & Lesson: Sometimes, the simplest tool is the best. We resorted to psql directly.

bash
PGPASSWORD=nyxcore_dev psql -h localhost -U nyxcore -d nyxcore

Crucial Note: Prisma, by default, maps camelCase model fields to camelCase column names in the database, NOT snake_case. When writing raw SQL queries, you must quote these identifiers.

sql
-- Fails: Postgres folds unquoted identifiers to lowercase,
-- so this looks for column "workflowid" in table "workflowstep"
SELECT workflowId FROM WorkflowStep;

-- This is correct
SELECT "workflowId", "selectedIndex" FROM "WorkflowStep";

Always quote your camelCase column names in raw SQL when working with Prisma-generated schemas.

Triggering Workflows from the CLI

Problem: We needed a way to repeatedly run workflows for testing without going through the UI. Our SSE workflow endpoint uses authenticateRequest() which relies on auth() from NextAuth, reading browser session cookies. This made direct CLI calls impossible.

Workaround & Lesson: For internal testing and scripting, bypass the UI/auth layer by directly importing and calling the core logic. We created a temporary scripts/run-workflow.ts that imported runWorkflow() directly.

typescript
// scripts/run-workflow.ts (simplified)
import { runWorkflow } from '../src/server/services/workflow-engine';
import { prisma } from '../src/server/db'; // Assuming Prisma client import

async function main() {
  const workflowId = 'YOUR_TEST_WORKFLOW_ID';
  // ... potentially set up some mock context or select alternatives programmatically ...

  console.log(`Running workflow ${workflowId} directly...`);
  await runWorkflow(workflowId, prisma); // Pass necessary dependencies
  console.log('Workflow execution complete.');
}

main().catch(console.error);

This allowed us to iterate quickly on workflow execution logic without the overhead of a full browser interaction.

What's Next?

With the core digest compression system verified and working, our immediate next steps involve further optimization and integration:

  1. Test {{project.wisdom}}: Verify our new project-level consolidation data can be effectively injected into prompts.
  2. Compare Token Costs: Conduct a more rigorous comparison of total token costs before and after digest compression across multiple workflow runs to quantify the exact savings.
  3. Optional Backfill: Consider making the digest backfill loop an optional feature (via environment variable or workflow setting) to avoid unnecessary Haiku calls on every resume for already optimized workflows.
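That last item could be as simple as an environment-variable gate around the backfill loop. A sketch (the flag name is hypothetical, not something that exists in our codebase yet):

```typescript
// Hypothetical flag: disable resume-time digest backfill to skip
// extra Haiku calls on workflows that are already fully digested.
function shouldBackfillDigests(
  env: Record<string, string | undefined> = process.env,
): boolean {
  return env.DIGEST_BACKFILL_ON_RESUME !== "false"; // default: enabled
}
```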

This session was a deep dive into the heart of our workflow engine, tackling a critical performance and cost challenge. By diligently tracking down silent errors and ensuring comprehensive coverage, we've made significant strides in optimizing our AI-powered workflows. The battle against ever-growing context windows continues, but for now, we've won a major skirmish.


json
{"thingsDone":[
  "Fixed silent error swallowing in step digest generation",
  "Implemented digest backfill loop for resumed workflows",
  "Added digest generation for alternative selection paths",
  "Added logging to digest catch blocks across all paths",
  "Ran backfill script for existing steps, demonstrating 50% size reduction",
  "Verified end-to-end compression with a full workflow run, observing significant prompt size reductions",
  "Cleaned up temporary testing scripts",
  "Restored production flags on test workflows"
],"pains":[
  "Prisma `db execute` failing to recognize table names (model vs. table name inconsistency)",
  "Prisma-generated column names are camelCase and require quoting in raw SQL",
  "NextAuth's session-based authentication preventing direct CLI execution of SSE workflow endpoints"
],"successes":[
  "Successfully fixed digest generation, leading to substantial prompt size reduction (e.g., 10.7KB -> 3.6KB for one step)",
  "Verified full end-to-end workflow compression",
  "Developed effective workarounds for Prisma raw SQL and CLI workflow execution challenges"
],"techStack":[
  "TypeScript",
  "Node.js",
  "Prisma",
  "PostgreSQL",
  "Next.js",
  "NextAuth",
  "LLMs (Haiku)",
  "Workflow Engine"
]}