Mission Accomplished: Unlocking Semantic Memory for AI Workflows with pgvector
We just hit a major milestone! Our AI workflow memory system is now fully operational, featuring pgvector for semantic search, HNSW indexing, and an intuitive UI. Dive into the journey of building intelligent recall for our LLM-powered applications.
Ever wished your AI workflows had a perfect memory? A system that not only stores past insights but truly understands them, allowing you to instantly recall relevant context when crafting new prompts? Well, after an intense development sprint, I'm thrilled to announce: we've done it!
Our full project-workflow memory system, complete with pgvector Phase 2 capabilities – vector embeddings, HNSW indexing, and lightning-fast similarity search – is now fully operational, end-to-end. All the hard work from this session is safely pushed to origin/main at 9b924f9. This isn't just about storing data; it's about enabling our LLM-powered applications to access a rich, semantically searchable history, transforming how we build and interact with AI.
Let's dive into what made this breakthrough possible.
The Journey to Intelligent Recall: What We Built
This session was all about bringing our vision of an intelligent memory system to life. Here's a breakdown of the key accomplishments:
1. The Power of pgvector: Phase 2 Complete!
This was the crown jewel of the session. We successfully integrated pgvector to supercharge our insight storage with semantic search capabilities.
- Database Migration: Swapped our `postgres:16-alpine` Docker image for the specialized `pgvector/pgvector:pg16` image. Crucially, we preserved all existing data using a named volume – no lost history!
- Vector Extension: Enabled the `vector` extension (v0.8.1) in our database:

  ```sql
  CREATE EXTENSION vector;
  ```

- Embedding Storage: Added a new `embedding` column to our `workflow_insights` table to store the vector representations of our insights:

  ```sql
  ALTER TABLE workflow_insights ADD COLUMN embedding vector(1536);
  ```

  (We're using OpenAI's `text-embedding-3-small`, which outputs 1536-dimensional vectors.)
- HNSW Indexing: Implemented an HNSW (Hierarchical Navigable Small World) index for incredibly fast approximate nearest neighbor searches, keeping our similarity queries performant even as the dataset grows:

  ```sql
  CREATE INDEX ON workflow_insights USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
  ```

- Backfilling Embeddings: Successfully backfilled all 10 existing insights with embeddings generated by `text-embedding-3-small`, consuming approximately 680 tokens.
- Verification: Crucially, we verified that cosine similarity search now returns sensible, semantically clustered results. This means our system truly "understands" the meaning behind our stored insights.
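To make "cosine similarity" concrete, here's a minimal TypeScript sketch of the distance that pgvector's `vector_cosine_ops` operator class computes (the `<=>` operator returns cosine distance, i.e. 1 minus cosine similarity). The toy 3-dimensional vectors are illustrative only; our real embeddings are 1536-dimensional.

```typescript
// Cosine distance, as computed by pgvector's `<=>` operator under
// vector_cosine_ops: 1 - (a · b) / (|a| * |b|).
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dim "embeddings": the query and the database-related insight
// point in similar directions, while the UI-related insight does not.
const query = [0.9, 0.1, 0.0];
const dbInsight = [0.8, 0.2, 0.1];
const uiInsight = [0.1, 0.1, 0.9];

console.log(cosineDistance(query, dbInsight) < cosineDistance(query, uiInsight)); // true
```

Semantically close insights get a smaller distance, so `ORDER BY embedding <=> $query LIMIT k` returns the nearest neighbors first.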
2. Bringing Memory to the User: The MemoryPicker
What's a powerful backend without a great frontend? We built the MemoryPicker component (`src/components/workflow/memory-picker.tsx`) to let users interact with our new memory system.
- Intuitive Search: A robust search interface to find relevant insights.
- Category Chips & Severity Badges: Visual cues to quickly filter and understand the context of each memory.
- Expandable Detail: Users can expand insights for a full view without cluttering the interface.
- `{{memory}}` Preview: A direct preview of how selected memories will be injected into the LLM prompt.
- Seamless Integration: The `MemoryPicker` is now fully integrated into `new/page.tsx`, managing `memoryIds` state which feeds directly into our workflow creation mutation.
3. Smart Context Injection: The {{memory}} Template
The core mechanism for leveraging our memory system is the {{memory}} template. We finalized its end-to-end functionality:
- Selected insights from the `MemoryPicker` are now reliably injected into the LLM prompt via the `{{memory}}` template in `workflow-engine.ts`.
- An E2E test (`b5588a67`) successfully verified that the LLM received the injected content from 5 selected insights, confirming the entire flow works as intended.
4. UI Polish & Workflow Enhancements
Beyond the core memory system, we made several improvements to enhance the user experience and developer workflow:
- `saveInsights` Consolidation: Streamlined our insight saving logic by removing a duplicate function from `workflows.ts`, ensuring `memory.saveInsights` is the single source of truth (93017a6).
- `SaveInsightsDialog` Fix: Addressed a subtle bug where the dialog wouldn't appear if key points lacked an `action` field. The fix (`!kp.action || kp.action === "keep" || kp.action === "edit"`) makes it robust for all insight states (95be3e8).
- `CollapsibleSection` Component: Introduced a reusable component to wrap the Consolidations, Personas, Docs, and Memory sections, collapsing them by default and displaying a badge with the item count (f998772). This significantly cleans up the UI.
- `memory-pull` Script: A handy `scripts/memory-pull.sh` utility to fetch `.memory/` from the remote, including a `--watch` mode for live updates (bb7b1c8). Great for developer convenience!
Navigating the Hurdles: Lessons Learned
No development sprint is without its challenges. Here are a few "pain points" we encountered and the valuable lessons we extracted:
- Problem: The `SaveInsightsDialog` sometimes failed to appear.
  - Root Cause: Our filtering logic `kp.action === "keep"` implicitly assumed all key points would have an `action` field. However, newly extracted key points often lack this field (it's `undefined`).
  - Lesson Learned: Always account for all possible states of data, especially `undefined` or `null` values, in UI logic. Explicitly checking for `!kp.action` or providing default values is crucial for robustness.
  - Fix: `!kp.action || kp.action === "keep" || kp.action === "edit"`
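As a self-contained sketch of that fix (using a hypothetical key-point shape, not the project's actual types): the predicate must treat a missing `action` as "keep".

```typescript
// Hypothetical key-point shape; freshly extracted points may not have
// an `action` field yet.
interface KeyPoint {
  text: string;
  action?: "keep" | "edit" | "discard";
}

// Keep a point when it has no action yet (undefined) or was explicitly
// kept or edited -- mirroring the dialog's corrected predicate.
const shouldSave = (kp: KeyPoint): boolean =>
  !kp.action || kp.action === "keep" || kp.action === "edit";

const points: KeyPoint[] = [
  { text: "new point" },                  // no action yet -> saved
  { text: "kept", action: "keep" },       // saved
  { text: "dropped", action: "discard" }, // filtered out
];
const toSave = points.filter(shouldSave);
```

With the old `kp.action === "keep"` check alone, the first point would silently vanish and the dialog had nothing to show.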
- Problem: Running a standalone `tsx` script from `/tmp` failed with `Cannot find module '@prisma/client'`.
  - Root Cause: `tsx` couldn't resolve modules from the project's `node_modules` when the script was executed from an arbitrary temporary directory.
  - Lesson Learned: For Node.js/TypeScript scripts that rely on project dependencies, ensure they are run from within the project's context (e.g., a dedicated `scripts/` directory) or configure module paths explicitly.
  - Workaround: Copied the script into the project's `scripts/` directory and ran it from there.
- Problem: Using `status` as a variable name in a `while true; do status=$(...)` loop in `zsh` resulted in a `read-only variable: status` error.
  - Root Cause: `status` is a reserved variable name in `zsh`.
  - Lesson Learned: Be mindful of shell-specific reserved keywords and variables. A quick search or a more unique variable name can prevent such conflicts.
  - Workaround: Used `step_status` instead.
- Future Improvement: Noticed that the "Save" button on the `SaveInsightsDialog` can be clicked multiple times, potentially causing duplicate entries.
  - Lesson Learned: User interfaces should always anticipate common interaction patterns. Implementing a simple dedup guard (e.g., disabling the button after the first click, or client-side deduplication) is good practice for data integrity. (This is an immediate next step!)
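A minimal sketch of what that dedup guard could look like, framework-free (the real React component would additionally disable the button via state; `createSaveGuard` is a hypothetical helper, not existing project code):

```typescript
// Duplicate-save guard: ignore clicks while a save is already in flight.
function createSaveGuard(save: () => Promise<void>): () => Promise<void> {
  let inFlight = false;
  return async () => {
    if (inFlight) return; // drop re-entrant clicks
    inFlight = true;
    try {
      await save();
    } finally {
      inFlight = false; // allow a fresh save once the first one settles
    }
  };
}
```

Wrapping the dialog's save handler this way means a rapid double-click fires the mutation exactly once.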
Current State of Play
As of this moment:
- Our development server is humming along at `http://localhost:3000`.
- Docker is running `pgvector/pgvector:pg16` with the vector extension fully installed.
- We have 10 `WorkflowInsight` records, all equipped with embeddings in our `workflow_insights` table.
- The HNSW index is actively serving fast similarity searches on the embedding column.
- All code is pushed and secure on `origin/main` at `9b924f9`.
What's Next on the Horizon?
While we've hit a major milestone, the journey continues! Here are our immediate next steps to further refine and enhance our intelligent memory system:
- Duplicate-Save Guard: Implement the aforementioned duplicate-save guard on the `SaveInsightsDialog` to prevent accidental redundant entries.
- Hybrid Search Integration: Wire up our `insight-search.ts` hybrid search (combining 70% vector similarity with 30% text-based search) into the `MemoryPicker` search field for even more powerful and nuanced semantic search.
- Automatic Embedding Generation: Integrate auto-generation of embeddings directly into `insight-persistence.ts` so all new insights automatically get their vector representations. We'll verify this works seamlessly with `pgvector` live.
- Project-Scoped Filtering: Add project-scoped filtering to the `MemoryPicker` so users can narrow down memories relevant to their current project context.
- Code Cleanup: A quick pass to clean up stale `.log` files in the project root.
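For the hybrid search item, the 70/30 blend boils down to a weighted score. Here's a hedged sketch under the assumption that both inputs are normalized to [0, 1]; the actual implementation lives in `insight-search.ts`, and `hybridScore`/`rank` are illustrative names only:

```typescript
// Planned 70/30 blend: vector similarity dominates, text match breaks ties.
function hybridScore(vectorSim: number, textScore: number): number {
  return 0.7 * vectorSim + 0.3 * textScore;
}

// Rank candidate insights by the blended score, best first.
function rank<T extends { vectorSim: number; textScore: number }>(rows: T[]): T[] {
  return [...rows].sort(
    (a, b) =>
      hybridScore(b.vectorSim, b.textScore) - hybridScore(a.vectorSim, a.textScore)
  );
}
```

An insight with strong semantic similarity (say 0.9) but a weak keyword match (0.1) scores 0.66 and still outranks one with the opposite profile (0.41), which is exactly the behavior we want from semantic-first search.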
This session has been incredibly productive, culminating in a fully functional, intelligent memory system that will significantly enhance our AI workflows. The ability to semantically search and inject past insights directly into LLM prompts opens up a new world of possibilities for building more context-aware and powerful applications. Stay tuned for more updates!