nyxcore-systems

Never Forget a Lesson: Building a Context-Aware Workflow Memory with Vector Search

Dive into the journey of building a robust workflow memory system. We're leveraging vector search and AI to give our workflows a brain, allowing them to recall past insights and inject them as context into future tasks. Discover the technical hurdles, triumphs, and crucial security lessons learned along the way.

AI · Vector Search · PostgreSQL · pgvector · TypeScript · Next.js · Prisma · LLM · Workflow · tRPC · Docker

In the rapidly evolving world of AI-driven applications, large language models (LLMs) are incredibly powerful. Yet, one common challenge persists: their lack of long-term, project-specific memory. Imagine a complex development workflow where an LLM helps refine features, review code, or brainstorm solutions. Without a mechanism to remember past insights, decisions, and lessons learned, it's like starting from scratch with each new task.

That's precisely the problem we set out to solve: building a project-workflow memory system. Our goal is to capture critical insights generated during workflow steps—like "pain points" or "strengths" identified in a review—and make them searchable and selectable as context for future workflows. The vision? A simple {{memory}} template variable that, when used in a prompt, injects relevant, previously saved insights, empowering our LLMs with project-specific wisdom.

This post chronicles a recent, intense development sprint where we laid the foundational backend and critical UI pieces for this ambitious system.

The Anatomy of an Insight: Our WorkflowInsight Model

At the heart of our memory system is the WorkflowInsight model. We didn't just want to store raw text; we needed structured knowledge that could be easily traced, categorized, and searched.

Our Prisma schema for WorkflowInsight reflects this need for rich metadata:

  • Traceability: workflowId, stepId, stepLabel, projectId – essential for understanding where an insight came from.
  • Content: title, detail, suggestion (for actionable advice), category (e.g., "Design," "Code," "UX"), insightType (e.g., "Pain Point," "Strength"), severity.
  • Relationships: pairedInsightId for linking a "pain point" to a corresponding "solution" or "strength."
  • Searchability: tsvector for traditional full-text search (PostgreSQL's powerful feature) and, crucially, an embedding column (for pgvector) to enable semantic vector search.
```prisma
// Simplified WorkflowInsight model structure
model WorkflowInsight {
  id              String    @id @default(uuid())
  workflowId      String
  stepId          String
  stepLabel       String
  projectId       String
  title           String
  detail          String?
  suggestion      String?
  category        String
  insightType     InsightType // Enum: PainPoint, Strength
  severity        Severity?   // Enum: Low, Medium, High
  pairedInsightId String?     @db.Uuid
  createdAt       DateTime  @default(now())
  updatedAt       DateTime  @updatedAt

  // tsvector column for full-text search over title/detail/suggestion is
  // created and kept current via a raw-SQL trigger (Prisma's @@fulltext
  // attribute is not available on PostgreSQL)

  // Placeholder for pgvector embedding column, managed via raw SQL for now
  // embedding       Unsupported("vector(1536)")? // This caused issues, see 'Challenges'
}
```

The Engine Room: Persistence & Search

Building out the WorkflowInsight model was just the beginning. We needed services to manage these insights:

  • Embedding Service (src/server/services/embedding-service.ts): This is where the magic of AI meets our data. We leveraged OpenAI's text-embedding-3-small model (1536 dimensions) to convert our structured insights into dense vector representations. A neat trick here is buildEmbeddingText(), which prefixes the insight content (e.g., "Category: Design. Title: Too complex. Detail: Users struggled...") to provide better context for the embedding model.
  • Insight Persistence (src/server/services/insight-persistence.ts): The persistReviewInsights() function takes raw review feedback, maps it to our WorkflowInsight structure, and intelligently auto-pairs pain points with relevant strengths based on category and source step. This ensures our memory isn't just a dump, but a structured knowledge base.
  • Insight Search (src/server/services/insight-search.ts): To retrieve relevant memories, we implemented a hybrid search strategy. It combines 70% vector similarity search (finding semantically similar insights) with 30% traditional tsvector full-text search (for exact keyword matches). This blend balances semantic recall with keyword precision, powered by Prisma.sql tagged templates for secure, raw SQL execution.
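To make the embedding-service idea concrete, here's a minimal sketch of what buildEmbeddingText() might look like. The interface shape and field names are assumptions based on the model described above, not the actual implementation:

```typescript
// Hypothetical shape of the fields buildEmbeddingText() consumes
interface InsightForEmbedding {
  category: string;
  insightType: string;
  title: string;
  detail?: string;
  suggestion?: string;
}

// Prefix structured fields with labels so the embedding model sees
// context ("Category: Design. Title: ...") rather than bare free text
function buildEmbeddingText(insight: InsightForEmbedding): string {
  const parts = [
    `Category: ${insight.category}.`,
    `Type: ${insight.insightType}.`,
    `Title: ${insight.title}.`,
  ];
  if (insight.detail) parts.push(`Detail: ${insight.detail}`);
  if (insight.suggestion) parts.push(`Suggestion: ${insight.suggestion}`);
  return parts.join(" ");
}
```

The labeled-prefix trick costs a few tokens but tends to help small embedding models distinguish, say, a "Design" pain point from a "Code" one with similar wording.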

Integrating Memory into Workflows

The ultimate goal is to make these insights accessible within our workflow prompts.

  • Workflow Insights Loader (src/server/services/workflow-insights.ts): This service, loadMemoryContent(memoryIds), retrieves selected insights and formats them into a clean, grouped markdown string. This ensures the injected context is readable and useful for the LLM.
  • {{memory}} Template Resolution: Our workflow engine now recognizes the {{memory}} template variable. When a workflow is executed, this variable is parallel-loaded with other context and resolved into the formatted markdown of selected insights before being sent to the LLM.

Bringing it to Life: UI & API

A powerful backend is nothing without a user-friendly interface.

  • SaveInsightsDialog: After a review step, users are presented with a dialog (0259732) showing a checkbox list of potential insights derived from the review. They can keep, edit, or discard individual points, ensuring only valuable information gets saved.
  • tRPC Procedures: We exposed the necessary functionality through tRPC, our type-safe API layer. This includes saveInsights, searchInsights (with enum-validated filters for security), and projectInsights for listing all insights related to a project.
  • Type Consolidation: To maintain consistency and reduce duplication, ReviewKeyPoint types were moved to a central src/types/review.ts and re-exported for both server and client modules.
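The enum-validated filters deserve a closer look, since they are part of the SQL injection fix discussed below. Our actual procedures validate with schema definitions at the tRPC boundary; the gist, in plain TypeScript with assumed enum values and limits, is a closed vocabulary plus length caps:

```typescript
// Allowed values and limits are assumptions for illustration
const ALLOWED_CATEGORIES = new Set(["Design", "Code", "UX"]);
const ALLOWED_INSIGHT_TYPES = new Set(["PainPoint", "Strength"]);
const MAX_QUERY_LENGTH = 200;

interface SearchInsightsInput {
  query: string;
  categories?: string[];
  insightTypes?: string[];
}

// Reject anything outside the closed vocabularies before it can
// ever reach a SQL query
function validateSearchInput(input: SearchInsightsInput): string[] {
  const errors: string[] = [];
  if (input.query.length === 0 || input.query.length > MAX_QUERY_LENGTH) {
    errors.push(`query must be 1-${MAX_QUERY_LENGTH} characters`);
  }
  for (const c of input.categories ?? []) {
    if (!ALLOWED_CATEGORIES.has(c)) errors.push(`unknown category: ${c}`);
  }
  for (const t of input.insightTypes ?? []) {
    if (!ALLOWED_INSIGHT_TYPES.has(t)) errors.push(`unknown insightType: ${t}`);
  }
  return errors;
}
```

Enum validation is defense in depth: even with parameterized queries downstream, a filter value that isn't in the known set never gets near the database.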

Foundations: Infrastructure & Security

No system is complete without solid infrastructure and rigorous security.

  • Docker Upgrade for pgvector: A critical step was upgrading our Docker PostgreSQL image from postgres:16-alpine to pgvector/pgvector:pg16. This specialized image comes pre-bundled with the pgvector extension, essential for our vector embedding column.
  • Schema Push & RLS: The workflow_insights table was created, complete with Row-Level Security (RLS) policies to ensure data isolation, and a tsvector trigger for automatic full-text indexing.
  • Security Review & SQL Injection Fix: During a crucial security review, a critical SQL injection vulnerability was identified in insight-search.ts. Our initial use of $queryRawUnsafe() with string interpolation for user-supplied filter arrays was a dangerous oversight. This was immediately fixed by switching to parameterized Prisma.sql tagged templates and tightening tRPC input validation with enums and max lengths. A stark, but valuable, lesson learned.

Navigating the Rough Seas: Challenges & Lessons Learned

Development sprints are rarely without their hurdles. Here are some "pain points" we encountered and how we navigated them:

The pgvector Saga: Prisma and Postgres Extension Mismatch

  • Problem: We initially tried to define the embedding column directly in Prisma as Unsupported("vector(1536)"), which caused db:push to fail on the unsupported column type. Even after removing that field from the schema, attempting to create the column via raw SQL resulted in ERROR: type "vector" does not exist.
  • Root Cause: Our Docker Postgres container (postgres:16-alpine) did not have the pgvector extension installed or enabled by default. Prisma's Unsupported type is a placeholder, but the underlying database still needs the actual type.
  • Solution: We opted for a two-phase approach. First, we removed the embedding column from the Prisma schema entirely and managed it via raw SQL. Second, and critically, we upgraded our docker-compose.yml to use pgvector/pgvector:pg16. This ensures that on the next container recreation, the pgvector extension will be available, allowing us to install it and add the embedding column.
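The compose change itself is a one-line image swap (service name and surrounding keys here are assumed, not our actual file):

```yaml
# docker-compose.yml — use the pgvector build of Postgres 16
services:
  db:                                # service name assumed
    image: pgvector/pgvector:pg16    # was: postgres:16-alpine
    # existing ports, volumes, and environment stay unchanged
```

Note that shipping the extension in the image isn't enough on its own: after the container is recreated, pgvector still has to be enabled once per database with `CREATE EXTENSION IF NOT EXISTS vector;` before the `vector(1536)` column can be added.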

Shell Scripting Gotchas: Zsh Glob Expansion

  • Problem: While running an npx tsx -e '...' command for a quick inline test, we hit a bizarre zsh glob expansion error when using ! within the inline code string.
  • Solution: The simplest workaround was to write the test script to a temporary .ts file and then execute it via npx tsx /path/to/temp/file.ts. Sometimes, the quickest path isn't the cleverest one-liner.

SQL Injection: A Stark Reminder of Security Best Practices

  • Problem: As mentioned, our initial implementation of hybrid search in insight-search.ts used $queryRawUnsafe() with direct string interpolation for user-supplied filter arrays. This was a critical SQL injection vulnerability.
  • Lesson: Never trust user input directly in raw SQL queries. Always use parameterized queries. Prisma.sql tagged templates are designed precisely for this, safely sanitizing and parameterizing inputs. Coupled with strict input validation on the tRPC layer (enums, regex, max lengths), we now have a much more robust and secure API.
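What makes tagged templates safe is that interpolated values become bound parameters rather than query text. Here is a stripped-down illustration of that principle — a toy, not Prisma's actual implementation:

```typescript
// Toy tagged template: interpolations turn into $1, $2, ... placeholders
// while the raw values travel separately, out-of-band from the SQL text
function sql(strings: TemplateStringsArray, ...values: unknown[]) {
  const text = strings.reduce(
    (acc, part, i) => acc + part + (i < values.length ? `$${i + 1}` : ""),
    ""
  );
  return { text, values };
}

// A hostile filter value never becomes part of the query text
const category = "'; DROP TABLE workflow_insights; --";
const query = sql`SELECT * FROM workflow_insights WHERE category = ${category}`;
// query.text   → "SELECT * FROM workflow_insights WHERE category = $1"
// query.values → the raw string, handed to the driver as a parameter
```

With $queryRawUnsafe() and string interpolation, that same value would have been spliced directly into the statement — exactly the hole the security review caught.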

On the Horizon: What's Next for Workflow Memory

This sprint has laid a robust foundation, but the journey continues:

  1. UI Testing: Thoroughly test the SaveInsightsDialog in the browser to ensure a smooth user experience and correct data persistence.
  2. MemoryPicker Component: Develop a UI component for the workflow creation page, allowing users to search, filter, and select relevant insights to inject into their new workflows.
  3. MemoryContextPreview: Build a collapsible preview component to show exactly what {{memory}} will resolve to before a workflow runs.
  4. Full Integration: Connect the MemoryPicker into the workflow creation page (new/page.tsx).
  5. End-to-End Testing: Validate the entire flow: creating a workflow with selected memory insights and verifying that {{memory}} correctly resolves in prompts.
  6. Phase 2: Full pgvector Integration: Recreate the Docker Postgres container with the pgvector image, install the extension, add the embedding column, and fully enable vector search.
  7. API Consolidation: Review and consolidate the saveInsights procedure, which currently exists in two tRPC routers, to a single, authoritative location.

Conclusion

Building a context-aware workflow memory system is a significant step towards making our AI-powered workflows truly intelligent and efficient. By transforming raw feedback into structured, searchable WorkflowInsight records and leveraging the power of vector search, we're equipping our LLMs with the ability to "remember" and learn from every interaction. The challenges we faced, particularly with pgvector integration and a critical SQL injection fix, were invaluable learning experiences that ultimately led to a more robust and secure system.

We're excited about the future of this feature, envisioning workflows that evolve and improve with every lesson learned, truly giving our projects a long-term memory.