Giving Your Workflows a Brain: Building a Self-Improving Memory System
Dive into the technical journey of building a workflow memory system, enabling LLM-powered insights to persist and inform future tasks. From vector embeddings to SQL injection fixes, here's how we're making workflows smarter.
In the fast-paced world of AI-driven development, we're constantly pushing the boundaries of what our systems can do. One recurring challenge with LLM-powered workflows is their inherent statelessness: brilliant insights generated in one step often fade into oblivion once the task is complete. What if our workflows could learn, remember, and apply past lessons to future challenges?
That's precisely the problem we set out to solve with our new project-workflow memory system. Our goal: capture valuable insights generated during a workflow, persist them, and then make them selectable as context for future workflows via a simple `{{memory}}` template variable. Imagine a system that gets smarter with every completed task.
This post details our journey, the architectural decisions, the features we've shipped, and the inevitable bumps (and critical fixes!) along the way.
The Vision: A Self-Learning Workflow
The core idea is simple yet powerful. When an LLM-driven review step identifies a "pain point" or a "strength," we want to capture that specific piece of knowledge. This isn't just about logging; it's about structuring that knowledge so it can be retrieved intelligently later. This "memory" will then serve as a rich context, guiding subsequent LLM prompts and improving the quality and relevance of future outputs.
We've just wrapped up a significant chunk of the backend and the UI for saving these insights. The system is taking shape, and the potential is immense.
Building Blocks of Memory: What We've Shipped
Bringing this vision to life required touching almost every part of our stack. Here's a rundown of the key components we've developed:
1. The WorkflowInsight Model: Structuring Knowledge
At the heart of our memory system is the `WorkflowInsight` Prisma model. This schema is designed to capture granular details about each insight:
- Traceability: `workflowId`, `stepId`, `stepLabel`, and `projectId` ensure we know exactly where an insight came from.
- Content: `title`, `detail`, `suggestion`, `category`, `insightType` (e.g., `PAIN`, `STRENGTH`), and `severity`.
- Pairing: `pairedInsightId` lets us link pain points directly to their suggested solutions, or vice versa.
- Searchability: a `tsvector` column for full-text search and a `pgvector` embedding column (more on this later!) for semantic search.
This model, defined in our Prisma schema, is the foundation upon which all other memory features are built.
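Pulling those fields together, the model looks roughly like this. The field names and the `PAIN`/`STRENGTH` enum come straight from the description above; the `id`/`createdAt` fields, types, and optionality are our sketch of the shape, not the exact schema:

```prisma
model WorkflowInsight {
  id              String      @id @default(cuid())
  workflowId      String
  stepId          String
  stepLabel       String
  projectId       String
  title           String
  detail          String
  suggestion      String?
  category        String
  insightType     InsightType
  severity        String?
  pairedInsightId String?
  createdAt       DateTime    @default(now())

  // The tsvector column and the pgvector embedding column are managed via
  // raw SQL, since Prisma has no native type for either.
  @@map("workflow_insights")
}

enum InsightType {
  PAIN
  STRENGTH
}
```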
2. The Embedding Service: Giving Insights Semantic Meaning
To enable intelligent, semantic search, we need to convert our insights into numerical vectors. Our embedding-service.ts handles this:
- It leverages OpenAI's `text-embedding-3-small` model (1536-dimensional vectors).
- It supports batch processing for efficiency.
- Crucially, `buildEmbeddingText()` structures the input text with prefixes (e.g., "title: [title] detail: [detail]") to give the embedding model better context, ensuring the relevant parts of each insight are weighted appropriately during vector generation.
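As a sketch of that prefixing step (the field set and the exact separator are illustrative, not the service's actual format):

```typescript
// Hypothetical minimal shape of an insight's embeddable fields.
interface InsightText {
  title: string;
  detail: string;
  suggestion?: string;
  category?: string;
}

// Prefix each field with its name so the embedding model sees which part
// of the insight each span of text belongs to.
function buildEmbeddingText(insight: InsightText): string {
  const parts = [`title: ${insight.title}`, `detail: ${insight.detail}`];
  if (insight.suggestion) parts.push(`suggestion: ${insight.suggestion}`);
  if (insight.category) parts.push(`category: ${insight.category}`);
  return parts.join(" ");
}
```

Without the prefixes, a short title and a long detail would blur together into one undifferentiated string; labeling each span gives the model a cheap structural hint.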
3. Insight Persistence: Capturing the Moment
The `insight-persistence.ts` service is responsible for transforming the raw output of a review step into structured `WorkflowInsight` records. `persistReviewInsights()` intelligently maps `ReviewKeyPoint[]` (the output from our LLM review agent) to new `WorkflowInsight` entries. It even auto-pairs pain points with strengths when they share a category and originate from the same workflow step, creating those valuable `pairedInsightId` links.
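The auto-pairing rule can be sketched like this, assuming a minimal `ReviewKeyPoint` shape (the real type carries more fields, and the real service may match differently):

```typescript
type InsightType = "PAIN" | "STRENGTH";

// Hypothetical minimal shape of a review key point.
interface ReviewKeyPoint {
  id: string;
  insightType: InsightType;
  category: string;
  stepId: string;
}

// Pair each PAIN with the first unused STRENGTH that shares its category
// and originates from the same workflow step.
function pairInsights(points: ReviewKeyPoint[]): Map<string, string> {
  const pairs = new Map<string, string>(); // painId -> strengthId
  const used = new Set<string>();
  for (const pain of points.filter((p) => p.insightType === "PAIN")) {
    const match = points.find(
      (p) =>
        p.insightType === "STRENGTH" &&
        p.category === pain.category &&
        p.stepId === pain.stepId &&
        !used.has(p.id)
    );
    if (match) {
      pairs.set(pain.id, match.id);
      used.add(match.id);
    }
  }
  return pairs;
}
```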
4. Hybrid Insight Search: Finding What You Need
Retrieving relevant insights is paramount. Our `insight-search.ts` service implements a powerful hybrid search strategy:
- 70% vector search: semantic relevance, finding insights similar in meaning.
- 30% text search: exact keyword matches using PostgreSQL's `tsvector` capabilities.
This blend ensures both conceptual and literal relevance. A critical security review during this phase identified and fixed a severe SQL injection vulnerability, replacing `$queryRawUnsafe` calls with parameterized `Prisma.sql` queries. This was a vital lesson about raw SQL and input validation.
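The 70/30 blend itself reduces to a weighted sum. This sketch assumes both signals have already been normalized to [0, 1]; how the real service normalizes `pgvector` distances and `ts_rank` scores is not shown here:

```typescript
// Weighted blend of the two retrieval signals.
const VECTOR_WEIGHT = 0.7; // semantic similarity
const TEXT_WEIGHT = 0.3;   // keyword rank

function hybridScore(vectorSimilarity: number, textRank: number): number {
  return VECTOR_WEIGHT * vectorSimilarity + TEXT_WEIGHT * textRank;
}

// An insight that is semantically close but shares no keywords still ranks
// well, and vice versa:
hybridScore(0.9, 0.0); // ≈ 0.63
hybridScore(0.2, 1.0); // ≈ 0.44
```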
5. Workflow Insights Loader: Context for LLMs
The `workflow-insights.ts` service provides the formatted content that eventually gets injected into LLM prompts:
- `loadMemoryContent(memoryIds)` fetches specific insights.
- `loadProjectInsights(projectId)` retrieves all insights for a given project.
These insights are then formatted as grouped markdown, ready to be consumed as context.
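A minimal sketch of that grouped-markdown formatting; the heading and bullet shapes here are our illustration, not the loader's exact output:

```typescript
type InsightType = "PAIN" | "STRENGTH";

interface Insight {
  title: string;
  detail: string;
  insightType: InsightType;
  category: string;
}

// Group insights by category and render each group as a markdown section,
// roughly the shape the loader hands to the LLM as context.
function formatInsightsAsMarkdown(insights: Insight[]): string {
  const byCategory = new Map<string, Insight[]>();
  for (const insight of insights) {
    const group = byCategory.get(insight.category) ?? [];
    group.push(insight);
    byCategory.set(insight.category, group);
  }
  const sections: string[] = [];
  for (const [category, group] of byCategory) {
    const lines = group.map((i) => `- [${i.insightType}] ${i.title}: ${i.detail}`);
    sections.push(`## ${category}\n${lines.join("\n")}`);
  }
  return sections.join("\n\n");
}
```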
6. The {{memory}} Template Variable: Seamless Integration
Our `workflow-engine.ts` now understands the `{{memory}}` template variable. When it appears in a prompt, the engine loads the selected memory content in parallel and resolves it during `resolvePrompt()`, seamlessly injecting the gathered insights into the LLM's context.
7. User Interface: Saving What Matters
The `SaveInsightsDialog` is the user-facing component that lets users curate which insights are saved. After a review step, a modal presents a checkbox list of candidate `ReviewKeyPoint`s. Users can select or deselect all, edit individual insights, and then "Save & Continue" or "Skip." This gives users control over what enters the system's memory.
8. Infrastructure & Security Hardening
- Docker Upgrade: Switched from `postgres:16-alpine` to `pgvector/pgvector:pg16` to natively support vector embeddings.
- Schema & RLS: The `workflow_insights` table is created, complete with Row-Level Security (RLS) and a `tsvector` trigger for automatic full-text indexing.
- Security Review: Beyond the `insight-search.ts` fix, tRPC input validation was tightened with enums and regex, ensuring robust data integrity and preventing malicious inputs.
Challenges Faced & Lessons Learned
No complex system is built without its share of hurdles. These were some of our key "pain points" that turned into valuable lessons:
1. The pgvector Conundrum
- The Problem: Initially, trying to add a `vector(1536)` column directly to our Prisma schema resulted in an `Unsupported("vector(1536)")` error. Even after removing it and running `db:push`, we hit `ERROR: type "vector" does not exist` in our Dockerized PostgreSQL.
- The Diagnosis: Our standard `postgres:16-alpine` Docker image simply didn't include the `pgvector` extension.
- The Workaround & Solution: We temporarily removed the `embedding` column from the Prisma schema (planning to add it via raw SQL later). The long-term fix was upgrading our `docker-compose.yml` to the `pgvector/pgvector:pg16` image, which ships with the `pgvector` extension pre-bundled. The next container recreation will bring this into effect, letting us install the extension and add the column.
- Lesson: When integrating novel database features like vector extensions, always verify that your underlying database image or installation includes the necessary components. Don't assume standard images have everything.
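The `docker-compose.yml` change amounts to swapping the image; the service and volume names here are illustrative, not our actual compose file:

```yaml
services:
  postgres:
    # Was: postgres:16-alpine, which does not ship the pgvector extension.
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
```

Note that bundling the extension in the image isn't enough on its own: after recreating the container, `CREATE EXTENSION IF NOT EXISTS vector;` still has to be run against the database before a `vector(1536)` column can be added.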
2. Inline tsx Execution Quirks
- The Problem: Attempting to run `npx tsx -e '...'` with inline code containing `!` characters led to unexpected Zsh glob expansion and esbuild syntax errors.
- The Workaround: Instead of inline execution, we found it more reliable to write the script to a temporary file and execute it via `npx tsx /path/to/file.ts`.
- Lesson: Inline script execution can be tricky with shell special characters. For anything beyond trivial commands, a dedicated script file is often safer and more readable.
3. The Critical SQL Injection Vulnerability
- The Problem: Our initial implementation of `insight-search.ts` used `$queryRawUnsafe()` with string interpolation of user-supplied filter arrays, opening a critical SQL injection vector that would let a malicious actor manipulate database queries.
- The Resolution: A thorough security review agent identified the vulnerability. We immediately refactored the code to use `Prisma.sql` tagged templates, which parameterize queries and eliminate direct string interpolation of user input. We also tightened tRPC input validation with enums, regex, and max lengths as additional layers of defense.
- Lesson: Never trust user input, especially when constructing raw SQL queries. Always use parameterized queries or ORM methods that handle escaping for you. Regular security reviews are invaluable for catching such issues before they become critical.
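To see why tagged templates close the hole: the tag function receives the literal SQL fragments separately from the interpolated values, so the values can be shipped to the database as bind parameters rather than spliced into the SQL text. Here's a toy `sql` tag illustrating that mechanism; it is not Prisma's actual implementation:

```typescript
interface ParameterizedQuery {
  text: string;      // SQL with $1, $2, ... placeholders
  values: unknown[]; // bind parameters, sent separately from the SQL
}

// A tagged template gets the literal fragments and interpolated values as
// separate arguments, so user input never touches the SQL text.
function sql(strings: TemplateStringsArray, ...values: unknown[]): ParameterizedQuery {
  const text = strings.reduce(
    (acc, part, i) => acc + (i > 0 ? `$${i}` : "") + part,
    ""
  );
  return { text, values };
}

// Even a hostile value is inert: it only ever appears in `values`.
const category = "perf'; DROP TABLE workflow_insights; --";
const query = sql`SELECT * FROM workflow_insights WHERE category = ${category}`;
// query.text keeps the placeholder: ... WHERE category = $1
```

Contrast this with `$queryRawUnsafe(...)` fed by string concatenation, where the hostile value above would terminate the string literal and execute the `DROP TABLE`.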
What's Next: Bringing It All Together
We're incredibly excited about the potential of this system. Here's what's on the immediate horizon:
- Test `SaveInsightsDialog`: Verify the full flow from review approval through insight saving and database persistence.
- Build `MemoryPicker`: Develop a UI component for the workflow creation page that lets users search, filter, and select specific insights to inject into their new workflows.
- Integrate & Test Full Flow: Connect the `MemoryPicker` to the workflow creation page and run end-to-end tests to ensure `{{memory}}` resolves correctly in prompts.
- Phase 2, Full `pgvector` Activation: Recreate our Docker Postgres with the `pgvector` image, install the extension, add the embedding column, and enable full vector search capabilities.
- Router Consolidation: Review and consolidate the duplicate `saveInsights` procedures across our tRPC routers for better maintainability.
This memory system is a significant step towards making our LLM-powered workflows truly intelligent and self-improving. By learning from every interaction, we're building a more robust, efficient, and context-aware platform. Stay tuned for more updates as we continue to evolve!