Giving Our AI a Brain: Shipping the Workflow Memory System End-to-End
We just hit a major milestone: completing our project-workflow memory system. From UI components to LLM injection and tricky bug fixes, here's the story of how we gave our AI a persistent, project-aware brain.
Just past 2 AM, the commit bb7b1c8 landed, marking the completion of a significant chunk of work: our end-to-end project-workflow memory system. This wasn't just about adding a feature; it was about fundamentally enhancing how our AI-powered workflows learn, retain, and apply knowledge across projects. For any developer building intelligent systems, the challenge of giving an AI "memory" is a familiar one, and today, we shipped a solid iteration.
The goal was clear: implement and verify the entire memory flow. This included the UI for selecting memories (MemoryPicker), the dialog for saving new insights (SaveInsightsDialog), the critical {{memory}} template injection into LLM prompts, better context management, and even a dedicated script for pulling memory data. I'm thrilled to report: it's all done, and fully verified.
Let's dive into the details of what we built, the inevitable bumps along the road, and what's next.
The Journey to a Smarter Workflow: What We Built
Bringing this system to life involved several interconnected pieces, each playing a vital role in creating a cohesive memory experience.
1. The MemoryPicker: Your Gateway to Past Insights
At the heart of the system is the MemoryPicker component (src/components/workflow/memory-picker.tsx). This is where users can browse, search, and filter through previously saved insights relevant to their projects. We built it with:
- Search functionality: Quickly find specific memories.
- Category filters: Organize insights by type or theme.
- Severity badges: Visually flag critical or high-impact insights.
- Expandable details: Dive deeper into an insight without cluttering the view.
- {{memory}} preview: See exactly how the selected insight will be formatted when injected into an LLM prompt. This is crucial for prompt engineering.
This component queries our memory.listInsights tRPC endpoint, making it fast and responsive. Its integration into new/page.tsx now allows us to wire selected memoryIds directly into our create mutation for new workflows.
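The search-and-filter behavior described above can be sketched as a pure helper over the results of memory.listInsights. This is a hypothetical illustration, not the shipped component: field names (`title`, `category`, `severity`) are assumptions about the insight shape.

```typescript
// Hypothetical sketch of MemoryPicker's client-side filtering over
// insights returned by memory.listInsights. Field names are illustrative.
type Severity = "low" | "medium" | "high" | "critical";

interface Insight {
  id: string;
  title: string;
  category: string;
  severity: Severity;
}

// Case-insensitive title search, optionally narrowed to a single category.
function filterInsights(
  insights: Insight[],
  search: string,
  category?: string
): Insight[] {
  const q = search.trim().toLowerCase();
  return insights.filter(
    (i) =>
      (!category || i.category === category) &&
      (!q || i.title.toLowerCase().includes(q))
  );
}
```

The ids of whatever the user selects from this filtered list are what get wired into the create mutation as memoryIds.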
2. Saving the Wisdom: The SaveInsightsDialog
Capturing new knowledge is just as important as retrieving old insights. We consolidated our saveInsights logic, moving from duplicated logic in the workflows.ts router to a single, robust memory.saveInsights endpoint. The workflows/[id]/page.tsx now explicitly calls trpc.memory.saveInsights with stepLabel and projectId, ensuring every insight is properly attributed and contextualized.
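As a rough sketch of what the consolidated endpoint does with its input: stepLabel and projectId come from the post, but the remaining field names and the row shape are assumptions for illustration.

```typescript
// Hypothetical input shape for memory.saveInsights and the rows it
// produces; only stepLabel and projectId are confirmed by the post.
interface SaveInsightsInput {
  projectId: string;
  stepLabel: string;
  insights: { title: string; body: string; severity: string }[];
}

interface InsightRow {
  projectId: string;
  stepLabel: string;
  title: string;
  body: string;
  severity: string;
  createdAt: string;
}

// Attributes every insight to its project and workflow step before
// insert, so nothing lands in the table without context.
function toInsightRows(input: SaveInsightsInput, now = new Date()): InsightRow[] {
  return input.insights.map((i) => ({
    projectId: input.projectId,
    stepLabel: input.stepLabel,
    ...i,
    createdAt: now.toISOString(),
  }));
}
```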
3. Injecting Intelligence: The {{memory}} Template
This is where the magic happens. The {{memory}} placeholder is the bridge between our curated insights and the LLM. When a workflow step uses this template, our system dynamically injects the selected memories directly into the LLM's prompt.
E2E Verification: We put this to the test with a dedicated "Memory Injection Test" workflow. We selected 5 insights, created a single step using {{memory}}, and let the LLM do its thing. The result? The LLM received the injected insights and responded with an accurate, severity-tagged summary. All this in a snappy 3.6 seconds, costing us a mere $0.0035. This was the ultimate proof point for the entire system.
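The injection step itself can be sketched as a simple placeholder substitution. This is a minimal illustration assuming a plain string-replace scheme and a severity-tagged bullet format; the real prompt formatting may differ.

```typescript
// Minimal sketch of {{memory}} injection: render the selected insights
// as a severity-tagged list and substitute them into the prompt template.
// The formatting here is assumed, not the actual implementation.
interface Memory {
  title: string;
  severity: string;
  body: string;
}

function injectMemories(template: string, memories: Memory[]): string {
  const block = memories
    .map((m) => `- [${m.severity.toUpperCase()}] ${m.title}: ${m.body}`)
    .join("\n");
  // split/join replaces every occurrence of the placeholder, not just the first
  return template.split("{{memory}}").join(block);
}
```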
4. Context on Demand: Collapsible Sections
To keep the workflow UI from becoming overwhelming, we introduced a CollapsibleSection component. Now, Consolidations, Personas, Docs, and Memory sections are all collapsed by default, with a badge indicating the number of selected items within each. This significantly improves the user experience, allowing focus when needed and easy access to detailed context.
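The state this renders boils down to something like the following. The real CollapsibleSection is a React component; this just models the collapsed-by-default behavior and badge counts it displays (names are illustrative).

```typescript
// Illustrative model of the collapsible-section state: every section
// starts collapsed and carries a badge with its selected-item count.
type SectionName = "Consolidations" | "Personas" | "Docs" | "Memory";

interface SectionState {
  collapsed: boolean;
  badgeCount: number;
}

function initSections(
  selectedCounts: Partial<Record<SectionName, number>>
): Record<SectionName, SectionState> {
  const names: SectionName[] = ["Consolidations", "Personas", "Docs", "Memory"];
  const state = {} as Record<SectionName, SectionState>;
  for (const n of names) {
    state[n] = { collapsed: true, badgeCount: selectedCounts[n] ?? 0 };
  }
  return state;
}
```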
5. Developer Experience: Session Checkpoints & Memory Pull
For developers working with this system, managing memory data is key. We've committed 15 .memory/letter_*.md files as session checkpoints, essentially snapshots of our "memory" database at different points. To make working with this data easier, we built a new scripts/memory-pull.sh script. This handy tool fetches the .memory/ directory from our remote repository without a full merge, and even includes a --watch mode for polling. It's a small but mighty improvement for local development and data synchronization.
Lessons from the Trenches: The "Pain Log" Transformed
No complex system is built without its share of head-scratching moments. Our "Pain Log" quickly became a "Lessons Learned" section.
Lesson 1: Beware the Undefined Field (SaveInsightsDialog Bug)
- The Problem: After a review step, clicking "Approve & Continue" should have brought up the `SaveInsightsDialog`. Instead, it skipped right to the next step, leaving valuable insights unsaved.
- The Root Cause: Our `extractKeyPoints()` function, which parses LLM responses into potential insights, didn't explicitly set an `action` field on these key points by default. Our `SaveInsightsDialog` filter was checking for `kp.action === "keep"`, which, for an `undefined` `action` field, always evaluated to `false`. All key points were being filtered out!
- The Fix & Takeaway: We updated the filter to be more robust: `!kp.action || kp.action === "keep" || kp.action === "edit"`. This ensures that key points without an explicit `action` are still considered for saving.

```typescript
// Before (buggy)
// keyPoints.filter(kp => kp.action === "keep")

// After (fixed) - in workflows/[id]/page.tsx and save-insights-dialog.tsx
keyPoints.filter(kp => !kp.action || kp.action === "keep" || kp.action === "edit")
```

Takeaway: Always validate your data structures and defensively handle `undefined` or `null` values, especially when filtering or processing user-generated (or AI-generated) content. Assume nothing about default field values.
Lesson 2: The Double-Click Dilemma (Duplicate Saves)
- The Problem: During testing, I clicked the "Save insights" button multiple times in quick succession.
- The Result: Our `workflow_insights` table ended up with 30 records when there should have been only 10 (3x dupes!). I had to clean these up manually via SQL.
- The Fix & Takeaway: This is a classic race condition. Immediate next step: add a duplicate-save guard, either by disabling the button immediately after the first click or by implementing mutation-level deduplication. Takeaway: For any action that modifies data, especially one involving network requests, implement client-side safeguards (button disabling/loading states) and consider server-side idempotency to prevent accidental duplicate submissions.
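One possible shape for that guard is a wrapper that ignores repeat calls while a save is in flight. This is a sketch of the planned fix, not the shipped code; the name and approach are hypothetical.

```typescript
// Hypothetical duplicate-save guard: wraps an async save so that calls
// made while one is already in flight are dropped instead of re-sent.
function withInFlightGuard<T>(
  save: () => Promise<T>
): () => Promise<T | undefined> {
  let inFlight = false;
  return async () => {
    if (inFlight) return undefined; // swallow the duplicate click
    inFlight = true;
    try {
      return await save();
    } finally {
      inFlight = false;
    }
  };
}
```

In a React UI the same flag would typically drive the button's disabled/loading state, and a server-side idempotency key on the mutation covers retries the client can't see.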
Lesson 3: Zsh's Reserved Variables (Bash Scripting Fun)
- The Problem: While writing the `memory-pull.sh` script, I tried to use `status` as a variable name in a `while true; do status=$(...)` loop.
- The Failure: Zsh, my shell of choice, threw a `read-only variable: status` error.
- The Workaround & Takeaway: Simply changing the variable name to `step_status` resolved the issue. Takeaway: Be aware of shell-specific reserved keywords and variables. What works in Bash might not work in Zsh, and vice versa. A quick search can save you a lot of head-scratching.
What's Next: The Road Ahead
With the core memory system complete, our immediate focus shifts to refinement and expansion:
- Duplicate-Save Guard: Implement the aforementioned safeguard on the `SaveInsightsDialog` to prevent multiple submissions.
- Phase 2, Vector Search with pgvector: This is a big one. We'll recreate our Docker Postgres instance with the `pgvector/pgvector:pg16` image, install the extension, add an `embedding` column to our `workflow_insights` table, and enable true vector similarity search. This will unlock much more intelligent memory retrieval.
- Project-Scoped Filtering: Enhance the `MemoryPicker` to filter insights specifically by `projectId`, ensuring users only see memories relevant to their current work context.
- Template Integration: Consider adding `{{memory}}` to built-in step templates where it makes sense, streamlining the creation of memory-aware workflows.
- Cleanup: Tidy up stale `.log` files from previous mini-RAG experiments.
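For the pgvector item, the migration could look roughly like this. The table name comes from the post; the embedding dimension of 1536 is an assumption about the embedding model and would need to match whatever we actually use.

```sql
-- Sketch of the planned Phase 2 migration (dimension 1536 is assumed).
CREATE EXTENSION IF NOT EXISTS vector;

ALTER TABLE workflow_insights
  ADD COLUMN embedding vector(1536);

-- Retrieval would then rank insights by cosine distance (pgvector's <=>):
-- SELECT id, title
-- FROM workflow_insights
-- ORDER BY embedding <=> $1
-- LIMIT 5;
```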
Conclusion
Shipping the end-to-end project-workflow memory system is a huge win. We've laid the foundation for an AI that doesn't just process information, but learns and adapts over time, making our workflows significantly smarter and more efficient. The journey was filled with challenges, but each one provided valuable lessons that will make our system more robust.
I'm incredibly excited about the potential this unlocks, especially as we integrate pgvector for advanced semantic search. The future of intelligent, memory-aware workflows is here, and we're just getting started.
{
"thingsDone": [
"MemoryPicker UI with search, filters, badges, and preview",
"MemoryPicker integration into new workflow creation",
"Consolidated saveInsights endpoint (trpc.memory.saveInsights)",
"SaveInsightsDialog bug fix for `action` field handling",
"Collapsible context sections (Consolidations, Personas, Docs, Memory)",
"Session checkpoints and memory-pull script (`scripts/memory-pull.sh`)",
"End-to-end verification of `{{memory}}` template injection into LLM prompts"
],
"pains": [
"SaveInsightsDialog not appearing due to incorrect filter logic (undefined `action` field)",
"Duplicate records created when clicking 'Save insights' multiple times",
"Zsh `read-only variable: status` error when using `status` in bash scripts"
],
"successes": [
"Achieved full E2E flow verification for the memory system",
"Successful LLM injection test with accurate, severity-tagged summary",
"Improved UX with collapsible context sections",
"Created useful developer tooling for memory management (`memory-pull.sh`)",
"Learned valuable lessons about data validation and mutation handling"
],
"techStack": [
"TypeScript",
"Next.js",
"tRPC",
"React",
"PostgreSQL",
"Docker",
"Bash/Zsh",
"LLMs (AI)"
]
}