Building a GitHub-to-Blog Pipeline: From Memory Files to Published Posts
How I built an end-to-end pipeline that transforms GitHub repository memories into blog posts, and the unexpected challenges that taught me valuable lessons about React Query, tRPC, and Next.js caching.
Ever wished you could automatically turn your project development notes into polished blog posts? That's exactly what I set out to build: a complete pipeline that connects to GitHub repositories, imports "memory files" (development session notes), and generates publication-ready blog content using AI.
After a full development session, I'm excited to share that the feature is not only working but has already generated 9+ blog posts from a single repository import. Here's the journey, including the unexpected challenges that became valuable learning experiences.
The Vision: Project-Based Blog Generation
The core idea was simple: create a system where developers can:
- Connect their GitHub repositories with their own API tokens (BYOK approach)
- Import memory files from a standardized `/memory` directory structure
- Generate blog posts using AI, with full markdown rendering and mobile-first design
- Manage everything through a clean, tabbed interface
Technical Architecture
Database Design
I started with a clean Prisma schema focusing on tenant isolation:
```prisma
model Project {
  id          String     @id @default(cuid())
  name        String
  description String?
  githubRepo  String?
  userId      String
  blogPosts   BlogPost[]
  createdAt   DateTime   @default(now())
  updatedAt   DateTime   @updatedAt
}

model BlogPost {
  id        String  @id @default(cuid())
  title     String
  content   String  @db.Text
  projectId String
  project   Project @relation(fields: [projectId], references: [id])
  // ... additional fields
}
```
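The `userId` column on `Project` is what makes tenant isolation work, but the post only shows the schema, not the queries. As a minimal sketch (the helper and types here are hypothetical, not from the actual codebase), every project lookup can be forced through a `where` clause that pairs the record id with the owning user:

```typescript
// Build a Prisma-style `where` clause that always scopes a project
// lookup to its owner, so one user can never load another's project.
interface ProjectWhere {
  id: string;
  userId: string;
}

function ownedProject(projectId: string, userId: string): ProjectWhere {
  return { id: projectId, userId };
}

// With a Prisma client, usage would look something like:
// const project = await prisma.project.findFirst({
//   where: ownedProject(input.projectId, ctx.userId),
// });
```

Centralizing the scoping in one helper means a forgotten `userId` filter becomes a type error rather than a data leak.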
GitHub Integration
The GitHub connector became the heart of the system, handling:
- Token resolution with user-provided API keys
- Repository fetching with proper error handling
- Memory path validation (checking for `/memory` directories)
- File content retrieval and database synchronization
```typescript
// Key insight: Always validate the memory structure exists
export async function checkMemoryPath(
  repoFullName: string,
  token: string
): Promise<boolean> {
  try {
    const response = await fetch(
      `https://api.github.com/repos/${repoFullName}/contents/memory`,
      { headers: { Authorization: `Bearer ${token}` } }
    );
    return response.ok;
  } catch {
    return false;
  }
}
```
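Once the `/memory` directory is confirmed, the next step is deciding which entries to import. A sketch of that filtering, assuming the fields (`name`, `path`, `type`) that the GitHub contents API returns for a directory listing — `selectMemoryFiles` itself is a hypothetical helper, not from the post:

```typescript
// Narrow a GitHub contents-API directory listing down to importable
// markdown memory files.
interface ContentsEntry {
  name: string;
  path: string;
  type: "file" | "dir" | "symlink" | "submodule";
}

function selectMemoryFiles(entries: ContentsEntry[]): ContentsEntry[] {
  // Keep only real files ending in .md; skip nested directories and symlinks.
  return entries.filter(
    (e) => e.type === "file" && e.name.toLowerCase().endsWith(".md")
  );
}
```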
AI Blog Generation
I ported an existing Python blog generation prompt to TypeScript, integrating with Anthropic's Claude:
```typescript
export async function generateBlogPost(
  memoryContent: string,
  fileName: string
): Promise<BlogPostContent> {
  const prompt = `Transform this development session memory into an engaging,
public-ready blog post...`;
  // Full implementation handles markdown formatting,
  // frontmatter extraction, and content optimization
}
```
The User Experience
The final interface provides a smooth workflow:
- Projects Dashboard: Overview of all projects with quick actions
- New Project Creation: Simple form with GitHub integration
- Project Detail View: Tabbed interface showing:
- Memory files with import status
- Generated blog posts with preview
- Batch operations for bulk generation
- Mobile-First Design: Responsive layouts with touch-friendly controls
Lessons Learned: The Challenges That Made It Better
Challenge #1: tRPC Query Refetch Patterns
The Problem: I tried to use `reposQuery.refetch()` on a tRPC query that had `enabled: false`. The button click did absolutely nothing.
The Learning: In tRPC with React Query v5, calling `refetch()` on disabled queries is unreliable. The solution was elegantly simple:
```typescript
// Instead of this:
const reposQuery = trpc.github.repos.useQuery(input, { enabled: false });
// Then trying: reposQuery.refetch()

// Do this:
const [loadRepos, setLoadRepos] = useState(false);
const reposQuery = trpc.github.repos.useQuery(input, { enabled: loadRepos });
// Then: setLoadRepos(true)
```
Key Takeaway: State-driven queries are more predictable than imperative refetch calls.
Challenge #2: Real-Time Progress in Batch Operations
The Problem: I initially built a server-side batch generation endpoint that processed multiple files sequentially. The client would show "0/10 generating..." and never update until all were complete.
The Learning: For operations where user feedback matters, client-side orchestration often works better:
```typescript
// Instead of server-side batch processing:
for (const entry of selectedEntries) {
  await generateSingle.mutateAsync({
    projectId,
    memoryId: entry.id,
  });
  setProgress((prev) => prev + 1); // Live updates!
}
```
Key Takeaway: Consider where the user experience is best served—sometimes client-side coordination trumps server-side efficiency.
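The same pattern can be sketched as a standalone function, with the tRPC mutation abstracted behind a callback so the control flow is easy to test (all names here are hypothetical). One added wrinkle worth considering: a single failed generation shouldn't abort the whole batch:

```typescript
// Sequential client-side orchestration with per-item progress and
// error capture. `generate` stands in for the real mutateAsync call.
async function generateAll(
  ids: string[],
  generate: (id: string) => Promise<void>,
  onProgress: (done: number, failed: number) => void
): Promise<{ done: number; failed: number }> {
  let done = 0;
  let failed = 0;
  for (const id of ids) {
    try {
      await generate(id); // one request at a time keeps progress accurate
      done++;
    } catch {
      failed++; // a single failure doesn't abort the rest of the batch
    }
    onProgress(done, failed);
  }
  return { done, failed };
}
```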
Challenge #3: Next.js Cache Corruption
The Problem: After making Prisma schema changes, I tried to clear the .next cache while the dev server was still running. This caused mysterious clientModules errors that persisted across restarts.
The Solution: The order matters:
- Stop the dev server
- Run `rm -rf .next`
- Run `prisma generate` (after schema changes)
- Restart the dev server
Key Takeaway: Next.js cache management requires respect for the development lifecycle — never delete `.next` while the dev server is still running.
The Results
After implementing all features and resolving the challenges, the system successfully:
- ✅ Imported 10 memory files from a real repository (`mrwind-up-bird/mini-chat-rag`)
- ✅ Generated 9+ publication-ready blog posts
- ✅ Provided a smooth mobile and desktop experience
- ✅ Handled errors gracefully with proper user feedback
What's Next
The foundation is solid, but there are exciting opportunities ahead:
- Enhanced Error Handling: Toast notifications for better user feedback
- Performance Optimization: Pagination for projects with many posts
- Content Management: In-place editing and regeneration workflows
- Template System: Customizable blog post templates for different content types
Final Thoughts
Building this pipeline reinforced a key principle: the best technical solutions often emerge from embracing constraints rather than fighting them. Whether it's working with tRPC's query patterns, respecting Next.js's cache lifecycle, or choosing client-side coordination for better UX, the "limitations" often guide us toward more robust architectures.
The ability to transform raw development memories into polished blog content opens up fascinating possibilities for developer documentation, project storytelling, and knowledge sharing. Sometimes the most powerful tools are the ones that help us better communicate the work we're already doing.
This blog post was itself generated using the pipeline described above, demonstrating the system in action. The irony is not lost on me.