Crafting Project Sync: Navigating Database Relations and GitHub APIs
Join us as we recount a recent development sprint focused on building a robust Project Sync feature, tackling schema design, GitHub API integrations, and a tricky Prisma self-relation challenge.
Building intelligent systems that understand our codebases requires a constant dance between the code itself and the data representing it. Our latest endeavor, "Project Sync," aims to bridge this gap, ensuring our internal knowledge base stays perfectly aligned with the ever-evolving state of our repositories. This post takes you behind the scenes of a recent development sprint, detailing the foundational work, the challenges we overcame, and the exciting path ahead.
Our goal for Phase 1 of Project Sync was ambitious: implement core synchronization logic, including branch selection, across 13 key tasks outlined in our design document. We're currently four tasks in and moving swiftly!
Laying the Foundation: Syncing Schema and Services
The first phase was all about establishing a backend that could track and manage synchronization. The work fell into three parts:
1. Schema Evolution: The ProjectSync Model
At the heart of Project Sync is the need to track when and how a repository's state was synchronized. This led to the creation of the ProjectSync model in our prisma/schema.prisma. More importantly, we extended existing models like MemoryEntry, RepositoryFile, and Repository with new fields to associate them directly with specific sync operations. This ensures we can trace every piece of data back to its origin and manage its lifecycle effectively.
// Simplified example of schema changes
model ProjectSync {
  id           String     @id @default(cuid())
  repositoryId String
  repository   Repository @relation(fields: [repositoryId], references: [id])
  branch       String
  commitSha    String
  // ... other sync details

  // Self-relation to track previous syncs (more on this later!)
  previousSyncId String?      @unique // The unique constraint was a key lesson!
  previousSync   ProjectSync? @relation("ProjectSyncHistory", fields: [previousSyncId], references: [id])
  // Singular back-relation: each sync has at most one direct successor (one-to-one)
  nextSync       ProjectSync? @relation("ProjectSyncHistory")

  memoryEntries   MemoryEntry[]
  repositoryFiles RepositoryFile[]
}

// Existing models extended
model MemoryEntry {
  // ...
  projectSyncId String?
  projectSync   ProjectSync? @relation(fields: [projectSyncId], references: [id])
}

model RepositoryFile {
  // ...
  projectSyncId String?
  projectSync   ProjectSync? @relation(fields: [projectSyncId], references: [id])
}
2. Deepening the GitHub Connection
To synchronize, we first need to understand the remote repository. We enhanced our src/server/services/github-connector.ts with critical new capabilities:
- fetchBranches(): Lists all available branches for a given repository. This is crucial for allowing users to select which branch they want to sync.
- fetchBranchHead(): Gets the latest commit SHA for a specific branch, ensuring we're always working with the freshest data.
- fetchRepoTreeWithSha(): Returns a detailed TreeEntry[], providing path, sha, and size for all files in a repository at a given commit. This granular information is vital for detecting changes and deciding what needs to be synchronized.
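As a rough illustration of what these three helpers might do under the hood, the sketch below maps them onto GitHub's REST endpoints for listing branches, reading a single branch, and fetching a recursive Git tree. The GetJson transport type, the function signatures, and the exact TreeEntry shape are assumptions made for this example; the real github-connector.ts may look quite different.

```typescript
// Hypothetical shape; the real TreeEntry in github-connector.ts may carry more fields.
interface TreeEntry {
  path: string;
  sha: string;
  size: number;
}

// Injected transport (assumption) so the sketch does not depend on a specific HTTP client.
type GetJson = (url: string) => Promise<unknown>;

// Subset of the Git Trees API response that this sketch relies on.
interface GitTreeResponse {
  tree: { path: string; type: string; sha: string; size?: number }[];
  truncated: boolean;
}

const API = "https://api.github.com";

// GET /repos/{owner}/{repo}/branches (first page only; pagination is omitted here).
async function fetchBranches(getJson: GetJson, owner: string, repo: string): Promise<string[]> {
  const branches = (await getJson(`${API}/repos/${owner}/${repo}/branches`)) as { name: string }[];
  return branches.map((b) => b.name);
}

// GET /repos/{owner}/{repo}/branches/{branch} includes the head commit SHA.
async function fetchBranchHead(getJson: GetJson, owner: string, repo: string, branch: string): Promise<string> {
  const data = (await getJson(`${API}/repos/${owner}/${repo}/branches/${branch}`)) as {
    commit: { sha: string };
  };
  return data.commit.sha;
}

// Keep only files (type "blob"); directories ("tree") and submodules ("commit") are dropped.
function toTreeEntries(response: GitTreeResponse): TreeEntry[] {
  return response.tree
    .filter((item) => item.type === "blob")
    .map((item) => ({ path: item.path, sha: item.sha, size: item.size ?? 0 }));
}

// GET /repos/{owner}/{repo}/git/trees/{sha}?recursive=1 lists the whole tree in one call.
async function fetchRepoTreeWithSha(
  getJson: GetJson,
  owner: string,
  repo: string,
  commitSha: string
): Promise<TreeEntry[]> {
  const url = `${API}/repos/${owner}/${repo}/git/trees/${commitSha}?recursive=1`;
  const response = (await getJson(url)) as GitTreeResponse;
  if (response.truncated) {
    // Very large repositories exceed the API's limit for a single recursive listing.
    throw new Error("Tree listing was truncated; walk subtrees individually instead");
  }
  return toTreeEntries(response);
}
```

Passing the transport in as a parameter keeps authentication and rate-limit handling out of the sketch, and makes the mapping logic easy to exercise with canned responses.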
3. The Sync Engine: An AsyncGenerator Pipeline
The true brain of the operation resides in src/server/services/project-sync-service.ts. We architected a full AsyncGenerator pipeline to handle the complex synchronization flow:
- prepare: Initializes the sync, fetching metadata and setting up the sync context.
- scan: Compares the current repository state with our last known state, identifying new, modified, or deleted files. This is where the "diff-aware" magic happens.
- import: Processes the identified changes, updating MemoryEntry and RepositoryFile records in our database.
- finalize: Cleans up, marks the sync as complete, and updates any necessary metadata.
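The four stages above can be sketched as a single async generator that yields one event per step. This is a minimal sketch under stated assumptions, not the contents of project-sync-service.ts: SyncEvent, SyncInput, diffTrees, runProjectSync, and the importFile callback are all hypothetical names, and the scan stage is reduced to comparing blob SHAs by path.

```typescript
interface TreeEntry {
  path: string;
  sha: string;
  size: number;
}

// Hypothetical event shapes, one per pipeline stage.
type SyncEvent =
  | { stage: "prepare"; branch: string; commitSha: string }
  | { stage: "scan"; added: string[]; modified: string[]; deleted: string[] }
  | { stage: "import"; path: string }
  | { stage: "finalize"; imported: number };

// Compare blob SHAs by path: a different SHA for the same path means the content changed.
function diffTrees(previous: TreeEntry[], current: TreeEntry[]) {
  const before = new Map<string, string>(previous.map((e): [string, string] => [e.path, e.sha]));
  const currentPaths = new Set<string>(current.map((e) => e.path));
  const added: string[] = [];
  const modified: string[] = [];
  current.forEach((entry) => {
    const oldSha = before.get(entry.path);
    if (oldSha === undefined) added.push(entry.path);
    else if (oldSha !== entry.sha) modified.push(entry.path);
  });
  const deleted = previous.filter((e) => !currentPaths.has(e.path)).map((e) => e.path);
  return { added, modified, deleted };
}

interface SyncInput {
  branch: string;
  commitSha: string;
  previousTree: TreeEntry[];
  currentTree: TreeEntry[];
  // Hypothetical persistence callback standing in for the database writes.
  importFile: (path: string) => Promise<void>;
}

async function* runProjectSync(input: SyncInput): AsyncGenerator<SyncEvent> {
  // prepare: announce which branch and commit this run targets.
  yield { stage: "prepare", branch: input.branch, commitSha: input.commitSha };

  // scan: work out what changed since the last known state.
  const changes = diffTrees(input.previousTree, input.currentTree);
  yield { stage: "scan", added: changes.added, modified: changes.modified, deleted: changes.deleted };

  // import: process new and modified files one at a time, reporting each.
  let imported = 0;
  for (const path of [...changes.added, ...changes.modified]) {
    await input.importFile(path);
    imported += 1;
    yield { stage: "import", path };
  }

  // finalize: report the total so the caller can mark the sync complete.
  yield { stage: "finalize", imported };
}
```

A caller would consume this with a for await loop over runProjectSync(input) and forward each event to whatever reports progress. Because the generator only advances when the consumer asks for the next event, progress reporting and the work itself stay in lockstep.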
This pipeline gives us a step-by-step, observable synchronization process, which is crucial for a feature of this complexity. It builds upon earlier successes, like our backfill endpoint, which recently restored 382 embeddings in production, giving us confidence in our underlying infrastructure.
Lessons from the Trenches: The Prisma Relation Challenge
Development is rarely a straight line. One of the most critical lessons from this session came from a subtle but significant Prisma validation error.
We designed our ProjectSync model to include a self-referencing one-to-one relation: previousSyncId. This was intended to create a historical chain, allowing us to easily trace a sync back to its predecessor.
// Initial (problematic) attempt
model ProjectSync {
  // ...
  previousSyncId String? // no @unique on the foreign key
  previousSync   ProjectSync? @relation("ProjectSyncHistory", fields: [previousSyncId], references: [id])
  // Singular back-relation, which makes this a one-to-one relation
  nextSync       ProjectSync? @relation("ProjectSyncHistory")
}
When trying to apply this schema, Prisma threw a validation error: "A one-to-one relation must use @unique on the foreign key field."
The Insight: previousSyncId is nullable, because a first sync has no predecessor. But in a one-to-one relation, any given ProjectSync can be the immediate predecessor of at most one other ProjectSync, so no two records may share the same previousSyncId. Without @unique on the foreign key, nothing enforces that at the database level, and Prisma rejects the schema rather than accept a one-to-one relation it cannot guarantee.
The Fix: The solution was straightforward but required understanding this nuance: adding @unique to previousSyncId.
// Corrected schema
model ProjectSync {
  // ...
  previousSyncId String?      @unique // This was the key!
  previousSync   ProjectSync? @relation("ProjectSyncHistory", fields: [previousSyncId], references: [id])
  // Singular back-relation: each sync has at most one direct successor
  nextSync       ProjectSync? @relation("ProjectSyncHistory")
}
This highlighted the importance of understanding Prisma's relation constraints deeply, especially with self-referencing models. It was also a good reminder to manage database migrations carefully: always run npx prisma@5.22.0 db push locally first, and plan for a safe migration on production. And for local development, remember to set the DATABASE_URL environment variable if you're not using a .env file!
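To make the constraint concrete, the sketch below mimics in plain TypeScript what a unique index on previousSyncId enforces: a second sync pointing at the same predecessor is rejected, so the history forms a single chain that can be walked backwards. SyncRecord, SyncStore, and walkHistory are hypothetical names used only for illustration; none of this is Prisma's API or the project's code.

```typescript
interface SyncRecord {
  id: string;
  previousSyncId: string | null;
}

class SyncStore {
  private byId = new Map<string, SyncRecord>();
  // Plays the role of the unique index on previousSyncId.
  private usedPrevious = new Set<string>();

  insert(record: SyncRecord): void {
    if (this.byId.has(record.id)) {
      throw new Error(`Duplicate sync id ${record.id}`);
    }
    // Records without a predecessor skip the uniqueness check entirely.
    if (record.previousSyncId !== null) {
      if (!this.byId.has(record.previousSyncId)) {
        throw new Error(`Unknown previous sync ${record.previousSyncId}`);
      }
      if (this.usedPrevious.has(record.previousSyncId)) {
        throw new Error(`Sync ${record.previousSyncId} already has a successor`);
      }
      this.usedPrevious.add(record.previousSyncId);
    }
    this.byId.set(record.id, record);
  }

  // Walk from a sync back to the first one; ids are returned newest first.
  walkHistory(id: string): string[] {
    const chain: string[] = [];
    let current = this.byId.get(id);
    while (current) {
      chain.push(current.id);
      current = current.previousSyncId === null ? undefined : this.byId.get(current.previousSyncId);
    }
    return chain;
  }
}
```

Because every predecessor must already exist when a record is inserted, the chain cannot loop back on itself, and the walk always terminates at a sync with no predecessor.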
The Road Ahead: Bringing Sync to Life
With the foundational backend services and schema in place, our immediate next steps are focused on bringing Project Sync to life for our users:
API & Event Streaming
- Task 5: SSE Endpoint: Developing /api/v1/events/project-sync/[syncId]/route.ts to provide real-time updates on sync progress.
- Task 6: tRPC Sync Sub-router: Building out the tRPC API for managing sync operations (listing branches, starting a sync, checking status, viewing history, restoring memory).
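For the SSE endpoint in Task 5, the wire format itself is simple enough to sketch ahead of the implementation. The example below frames progress updates as Server-Sent Events; the SyncProgress payload and the choice of event names are assumptions, since the actual event schema has not been built yet.

```typescript
// Hypothetical progress payload; the real event schema is still to be defined.
interface SyncProgress {
  stage: "prepare" | "scan" | "import" | "finalize";
  message: string;
}

// Frame one Server-Sent Event: an "event:" line, a "data:" line, then a blank line.
// JSON.stringify never emits raw newlines, so the payload always fits on one data line.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Turn a list of progress updates into the text chunks an SSE response body would carry.
function toSseChunks(updates: SyncProgress[]): string[] {
  return updates.map((update) => formatSseEvent(update.stage, update));
}
```

A route handler would write these chunks to a response served with the Content-Type text/event-stream, and the browser's EventSource would dispatch each one under its event name.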
Frontend Integration
- Task 7: useProjectSync Hook: Creating a React hook to abstract away sync logic for our UI components.
- Task 8: SyncBanner Component: A user-facing banner to display ongoing sync status and notifications.
- Task 9: SyncControls Component: Interactive elements for users to initiate and manage syncs.
- Task 10: Integrate into Project Overview: Seamlessly embedding sync functionality into our project dashboard.
Polish & Deployment
- Task 11: Filter Superseded Entries: Implementing logic to ensure active queries only show the most relevant, non-superseded memory entries.
- Task 12: Typecheck + Build Verification: The crucial step of ensuring everything compiles and passes tests.
- Task 13: Production Deployment: Pushing our changes, running safe migrations, and rebuilding for production.
Conclusion
This development sprint has been a significant leap forward for Project Sync. We've laid a robust backend foundation, integrated deeply with the GitHub API, and refined our database schema to handle complex historical tracking. The lessons learned from the Prisma self-relation issue have only strengthened our understanding of schema design best practices. We're excited to transition from backend logic to user-facing features, bringing this powerful synchronization capability directly into the hands of our users. Stay tuned for more updates as Project Sync takes shape!