Building a Memory System for AI Workflows: From Architecture to Implementation
A deep dive into implementing a project-workflow memory system that allows AI agents to learn from past insights and apply them to new tasks through vector search and intelligent context injection.
Late-night development sessions often produce the most interesting breakthroughs. Last night was one of those sessions where everything clicked—we successfully designed and began implementing a sophisticated memory system that allows our AI workflow platform to learn from past project insights and intelligently apply them to new tasks.
The Challenge: Making AI Workflows Smarter
Our AI workflow platform was generating valuable insights during project reviews, but these insights were getting lost in the void. Each new workflow started from scratch, unable to benefit from lessons learned in previous projects. We needed a way to:
- Capture key insights from completed workflow reviews
- Store them with semantic search capabilities
- Make them easily accessible when creating new workflows
- Inject relevant context automatically via template variables
Architecture Decision: Team-Based Problem Solving
Rather than tackling this complex system alone, we spawned a team of three specialized agents:
- Architect: Research existing patterns and design the overall system
- ML Expert: Design the schema and vector search strategy
- UX Developer: Create intuitive interfaces for saving and selecting insights
This collaborative approach proved invaluable—each agent brought domain expertise that shaped the final design.
The Technical Foundation
Database Schema Design
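Our ML expert proposed a dedicated WorkflowInsight model rather than extending our existing MemoryEntry system. The reasoning was that workflow memories have different requirements than general system memories: they belong to a project, they pair pain points with solutions, and they carry an embedding for semantic search. The proposed Prisma model: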
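model WorkflowInsight {
  id              String   @id @default(cuid())
  projectId       String
  title           String
  content         String
  tags            String[]
  painPoint       String?  // What problem this insight addresses
  solution        String?  // How it solves the problem
  pairedInsightId String?  // Link pain points to solutions
  // Prisma has no native vector or tsvector type, so both columns use Unsupported(...)
  embedding       Unsupported("vector(1536)")? // pgvector for semantic search
  searchVector    Unsupported("tsvector")?     // tsvector for full-text search
  createdAt       DateTime @default(now())
  project         Project  @relation(fields: [projectId], references: [id])
  // A self-relation needs explicit fields/references plus a back-relation field
  pairedInsight   WorkflowInsight?  @relation("InsightPairing", fields: [pairedInsightId], references: [id])
  pairedWith      WorkflowInsight[] @relation("InsightPairing")
  @@index([projectId])
  // The HNSW index (vector_cosine_ops) on embedding cannot be declared here;
  // it has to be created in a raw SQL migration
}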
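Because the vector index and vector queries live outside what the Prisma schema can express, they end up as raw SQL. The following is a minimal sketch of what that SQL and a small helper might look like, not the code we shipped. It assumes pgvector 0.5.0 or later (for HNSW), Prisma's default table name "WorkflowInsight", and hypothetical names: createVectorIndexSql, toVectorLiteral, and nearestInsightsSql.

```typescript
// Sketch only: SQL that a hand-written migration might contain.
// Assumes pgvector >= 0.5.0 and Prisma's default table name.
const createVectorIndexSql = `
  CREATE EXTENSION IF NOT EXISTS vector;
  CREATE INDEX IF NOT EXISTS workflow_insight_embedding_hnsw
    ON "WorkflowInsight" USING hnsw (embedding vector_cosine_ops);
`;

// pgvector accepts vectors as text literals such as '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding contains a non-finite value");
  }
  return `[${embedding.join(",")}]`;
}

// <=> is pgvector's cosine distance operator; smaller means more similar.
// $1 is the vector literal, $2 the project id, $3 the row limit.
const nearestInsightsSql = `
  SELECT id, title, 1 - (embedding <=> $1::vector) AS similarity
  FROM "WorkflowInsight"
  WHERE "projectId" = $2 AND embedding IS NOT NULL
  ORDER BY embedding <=> $1::vector
  LIMIT $3;
`;
```

The query string would be run through whatever raw-query facility the data layer offers, with the literal from toVectorLiteral bound as the first parameter.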
Hybrid Search Strategy
We chose a hybrid approach that combines vector similarity with traditional full-text search:
- 70% weight on vector similarity for semantic understanding
- 30% weight on full-text search for exact keyword matches
- Pain-solution pairing to surface both problems and their solutions
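To make the 70/30 weighting concrete, here is a rough in-memory sketch of how the two signals could be blended. In production the scoring would more likely happen in SQL; the Candidate type, the rankHybrid name, and the choice to normalize text ranks by the maximum rank are all assumptions made for illustration.

```typescript
// Hypothetical candidate shape: an embedding plus a raw full-text rank
// (for example, the value Postgres returns from ts_rank).
type Candidate = { id: string; embedding: number[]; textRank: number };

const VECTOR_WEIGHT = 0.7;
const TEXT_WEIGHT = 0.3;

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function rankHybrid(
  query: number[],
  candidates: Candidate[],
): { id: string; score: number }[] {
  // Scale text ranks into 0..1 so both signals share a range before weighting.
  const maxText = Math.max(0, ...candidates.map((c) => c.textRank));
  return candidates
    .map((c) => {
      // Negative cosine values are clamped to 0 so the blend stays in 0..1.
      const vectorScore = Math.max(0, cosineSimilarity(query, c.embedding));
      const textScore = maxText > 0 ? c.textRank / maxText : 0;
      return {
        id: c.id,
        score: VECTOR_WEIGHT * vectorScore + TEXT_WEIGHT * textScore,
      };
    })
    .sort((x, y) => y.score - x.score);
}
```

With this blend, an insight that matches both semantically and by keyword outranks one that matches on only a single signal, and a purely semantic match still outranks a purely keyword one.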
Template Variable Integration
The system integrates seamlessly with our workflow engine through a simple {{memory}} template variable that gets replaced with relevant insights during workflow creation.
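As an illustration of the substitution step, the sketch below replaces every {{memory}} occurrence with a numbered list of insights. The Insight shape, the numbered-list formatting, and the injectMemory name are assumptions, since the actual rendering format is not settled here.

```typescript
// Hypothetical minimal shape of an insight selected for injection.
type Insight = { title: string; content: string };

function renderMemoryBlock(insights: Insight[]): string {
  if (insights.length === 0) return "";
  return insights
    .map((insight, n) => `${n + 1}. ${insight.title}: ${insight.content}`)
    .join("\n");
}

function injectMemory(template: string, insights: Insight[]): string {
  const block = renderMemoryBlock(insights);
  // A function replacer is used so "$" sequences inside insight text
  // (for example "$1") are inserted literally instead of being
  // interpreted as replacement patterns.
  return template.replace(/\{\{\s*memory\s*\}\}/g, () => block);
}
```

A template such as "Context:\n{{memory}}\nTask: ..." then receives the selected insights in place of the variable, and an empty selection simply leaves that slot blank.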
User Experience Design
Saving Insights: The Review Flow
After a workflow review is approved, users see a SaveInsightsDialog with:
- Checkbox list of key points from the review
- Project selector for proper categorization
- Tag input for custom labeling
- Pain/solution pairing options
Selecting Context: The Memory Picker
When creating new workflows, the MemoryPicker component provides:
- Real-time search across saved insights
- Filter chips for projects, tags, and date ranges
- Selected insight chips with easy removal
- Collapsible preview of insight content
- One-click injection via the {{memory}} template variable
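The filter chips boil down to a predicate over saved insights. Below is a small client-side sketch of that predicate; the SavedInsight and PickerFilters shapes are assumptions, and the plain substring match is only a stand-in for the real semantic search.

```typescript
// Hypothetical shapes for saved insights and the picker's filter state.
type SavedInsight = {
  id: string;
  projectId: string;
  title: string;
  content: string;
  tags: string[];
  createdAt: Date;
};

type PickerFilters = {
  query?: string;
  projectIds?: string[];
  tags?: string[];
  from?: Date;
  to?: Date;
};

function filterInsights(
  insights: SavedInsight[],
  filters: PickerFilters,
): SavedInsight[] {
  const query = filters.query?.trim().toLowerCase() ?? "";
  return insights.filter((insight) => {
    // An empty or missing filter means "no restriction" for that dimension.
    if (filters.projectIds?.length && !filters.projectIds.includes(insight.projectId)) return false;
    // Tag filter matches when the insight carries at least one selected tag.
    if (filters.tags?.length && !filters.tags.some((t) => insight.tags.includes(t))) return false;
    if (filters.from && insight.createdAt < filters.from) return false;
    if (filters.to && insight.createdAt > filters.to) return false;
    // Substring match stands in for the hybrid search backend.
    if (query && !`${insight.title} ${insight.content}`.toLowerCase().includes(query)) return false;
    return true;
  });
}
```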
Implementation Challenges and Lessons Learned
Shell Escaping Gotchas
Working with dynamic code execution revealed some interesting shell behavior:
The Problem: Running inline TypeScript with special characters
# In zsh, an unescaped ! inside double quotes can trigger history expansion,
# so this may fail before tsx ever runs
npx tsx -e "console.log('Hello!')"
The Solution: Write temporary script files
# Much more reliable approach
echo 'console.log("Hello!")' > temp.ts
npx tsx temp.ts
rm temp.ts
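When the script is launched from code rather than typed by hand, the same temp-file idea can skip the shell entirely. This is a sketch under assumptions: runScript and its options are hypothetical names, and the default runner assumes npx and tsx are available on the PATH.

```typescript
import { execFileSync } from "node:child_process";
import { mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical options: which command runs the file, and its extension.
type RunOptions = { runner?: string[]; extension?: string };

function runScript(code: string, options: RunOptions = {}): string {
  const { runner = ["npx", "tsx"], extension = ".ts" } = options;
  const [command, ...args] = runner;
  if (!command) throw new Error("runner must name a command");

  // A fresh temp directory avoids clobbering an existing temp.ts.
  const dir = mkdtempSync(join(tmpdir(), "inline-script-"));
  const file = join(dir, `script${extension}`);
  try {
    writeFileSync(file, code, "utf8");
    // execFileSync passes arguments straight to the program with no shell
    // in between, so characters like ! or [ ] are never expanded.
    return execFileSync(command, [...args, file], { encoding: "utf8" });
  } finally {
    // Clean up even if the script exits with an error.
    rmSync(dir, { recursive: true, force: true });
  }
}
```

Calling runScript('console.log("Hello!")') would then write the snippet to a temporary file, run it, and return whatever the script printed.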
Git Path Handling
Another shell gotcha emerged with Next.js dynamic routes:
# Fails in zsh: the unquoted parentheses and brackets are treated as glob patterns
git add src/app/(dashboard)/[id]/page.tsx
# Works: quoting passes the path through literally
git add "src/app/(dashboard)/[id]/page.tsx"
These seemingly minor issues can derail late-night coding sessions, so documenting them saves future frustration.
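For tooling that builds shell commands as strings, one common safeguard is to single-quote every path before interpolating it. The helper below is a sketch of that technique for POSIX-style shells; shellQuote is a hypothetical name, not something from our codebase.

```typescript
// Wrap an argument in single quotes for POSIX-style shells (sh, bash, zsh).
// Inside single quotes every character is literal, so the only character
// that needs special handling is the single quote itself: close the quoted
// string, emit an escaped quote, then reopen it.
function shellQuote(arg: string): string {
  return `'${arg.replace(/'/g, `'\\''`)}'`;
}

// Example: build a git command for a Next.js dynamic route path.
const command = `git add ${shellQuote("src/app/(dashboard)/[id]/page.tsx")}`;
```

Where possible, passing arguments as an array to a process-spawning API avoids quoting altogether; the helper is mainly useful when a single command string is unavoidable.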
The Refinement Mode Breakthrough
While implementing the memory system, we also shipped a significant improvement to our workflow refinement process. Instead of regenerating outputs from scratch, the system now:
- Takes the previous output as a starting point
- Applies specific review feedback
- Uses a REFINE_SEPARATOR to cleanly replace refined sections
This approach is much more efficient and produces better results by building incrementally rather than starting over.
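To give a feel for the idea, here is a heavily simplified sketch of section-level replacement. Everything about the format is an assumption made for illustration: the separator value, the convention that sections start with a "## " heading line, and the rule that the refinement reply contains only the changed sections, each identified by its heading. The real protocol may differ.

```typescript
// Hypothetical separator value; the real constant may look different.
const REFINE_SEPARATOR = "\n---REFINE---\n";

type Section = { heading: string; body: string };

// Assumed output format: each section starts with a "## " heading line.
function parseSections(text: string): Section[] {
  return text.split(/\n(?=## )/).map((chunk) => {
    const [heading, ...rest] = chunk.split("\n");
    return { heading: heading.trim(), body: rest.join("\n").trim() };
  });
}

// Assumed reply format: only the changed sections, joined by the separator.
function applyRefinement(previous: string, refinedReply: string): string {
  const replacements = new Map<string, string>();
  for (const chunk of refinedReply.split(REFINE_SEPARATOR)) {
    const trimmed = chunk.trim();
    if (!trimmed) continue;
    const [heading, ...rest] = trimmed.split("\n");
    replacements.set(heading.trim(), rest.join("\n").trim());
  }
  // Sections without a replacement are carried over unchanged.
  return parseSections(previous)
    .map((s) => `${s.heading}\n${replacements.get(s.heading) ?? s.body}`)
    .join("\n\n");
}
```

The point of the sketch is the shape of the operation: untouched sections pass through verbatim, so review feedback only costs tokens for the parts that actually change.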
Current Status and Next Steps
The backend design is complete and implementation is underway:
- ✅ Database schema designed
- ✅ Service layer architecture planned
- ✅ tRPC procedures mapped out
- ✅ Template variable integration designed
- 🚧 Currently implementing the core services
Next up:
- Complete the backend implementation
- Build the UI components (SaveInsightsDialog and MemoryPicker)
- Integrate with the workflow creation and review flows
- Add pgvector extension for production-grade vector search
- Implement the embedding service with BYOK (Bring Your Own Key) support
Reflection: The Power of Systematic Approach
This project reinforced the value of breaking complex features into discrete, well-defined tasks and leveraging specialized expertise (even if that expertise comes from AI agents). The memory system will fundamentally change how our workflows learn and improve over time.
The late-night development session that started with a simple idea—"workflows should remember insights"—evolved into a comprehensive system spanning database design, vector search, user experience, and template processing. Sometimes the best solutions emerge when you give complex problems the time and systematic attention they deserve.
What memory systems are you building into your AI workflows? I'd love to hear about your approaches to making AI systems more contextually aware and intelligent over time.