Building a GitHub-to-Blog Pipeline: From Memory Files to Published Posts
How I built an end-to-end pipeline that transforms GitHub repository memories into blog posts, and the unexpected challenges that taught me valuable lessons about React Query, tRPC, and Next.js caching.
Ever wished you could automatically turn your project development notes into polished blog posts? That's exactly what I set out to build: a complete pipeline that connects to GitHub repositories, imports "memory files" (development session notes), and generates publication-ready blog content using AI.
After a full development session, I'm excited to share that the feature is not only working but has already generated 9+ blog posts from a single repository import. Here's the journey, including the unexpected challenges that became valuable learning experiences.
The Vision: Project-Based Blog Generation
The core idea was simple: create a system where developers can:
- Connect their GitHub repositories with their own API tokens (BYOK approach)
- Import memory files from a standardized `/memory` directory structure
- Generate blog posts using AI, with full markdown rendering and mobile-first design
- Manage everything through a clean, tabbed interface
Technical Architecture
Database Design
I started with a clean Prisma schema focusing on tenant isolation:
```prisma
model Project {
  id          String     @id @default(cuid())
  name        String
  description String?
  githubRepo  String?
  userId      String
  blogPosts   BlogPost[]
  createdAt   DateTime   @default(now())
  updatedAt   DateTime   @updatedAt
}

model BlogPost {
  id        String  @id @default(cuid())
  title     String
  content   String  @db.Text
  projectId String
  project   Project @relation(fields: [projectId], references: [id])
  // ... additional fields
}
```
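The `userId` column on `Project` is what makes tenant isolation work, but the post only shows the schema, not the queries. As a minimal sketch (the helper and types here are hypothetical, not from the actual codebase), every project lookup can be forced through a `where` clause that pairs the record id with the owning user:

```typescript
// Build a Prisma-style `where` clause that always scopes a project
// lookup to its owner, so one user can never load another's project.
interface ProjectWhere {
  id: string;
  userId: string;
}

function ownedProject(projectId: string, userId: string): ProjectWhere {
  return { id: projectId, userId };
}

// With a Prisma client, usage would look something like:
// const project = await prisma.project.findFirst({
//   where: ownedProject(input.projectId, ctx.userId),
// });
```

Centralizing the scoping in one helper means a forgotten `userId` filter becomes a type error rather than a data leak.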
GitHub Integration
The GitHub connector became the heart of the system, handling:
- Token resolution with user-provided API keys
- Repository fetching with proper error handling
- Memory path validation (checking for `/memory` directories)
- File content retrieval and database synchronization
```typescript
// Key insight: Always validate the memory structure exists
export async function checkMemoryPath(
  repoFullName: string,
  token: string
): Promise<boolean> {
  try {
    const response = await fetch(
      `https://api.github.com/repos/${repoFullName}/contents/memory`,
      { headers: { Authorization: `Bearer ${token}` } }
    );
    return response.ok;
  } catch {
    return false;
  }
}
```
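Once the `/memory` directory is confirmed, the next step is deciding which entries to import. A sketch of that filtering, assuming the fields (`name`, `path`, `type`) that the GitHub contents API returns for a directory listing — `selectMemoryFiles` itself is a hypothetical helper, not from the post:

```typescript
// Narrow a GitHub contents-API directory listing down to importable
// markdown memory files.
interface ContentsEntry {
  name: string;
  path: string;
  type: "file" | "dir" | "symlink" | "submodule";
}

function selectMemoryFiles(entries: ContentsEntry[]): ContentsEntry[] {
  // Keep only real files ending in .md; skip nested directories and symlinks.
  return entries.filter(
    (e) => e.type === "file" && e.name.toLowerCase().endsWith(".md")
  );
}
```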
AI Blog Generation
I ported an existing Python blog generation prompt to TypeScript, integrating with Anthropic's Claude:
```typescript
export async function generateBlogPost(
  memoryContent: string,
  fileName: string
): Promise<BlogPostContent> {
  const prompt = `Transform this development session memory into an engaging,
public-ready blog post...`;
  // Full implementation handles markdown formatting,
  // frontmatter extraction, and content optimization
}
```
The User Experience
The final interface provides a smooth workflow:
- Projects Dashboard: Overview of all projects with quick actions
- New Project Creation: Simple form with GitHub integration
- Project Detail View: Tabbed interface showing:
- Memory files with import status
- Generated blog posts with preview
- Batch operations for bulk generation
- Mobile-First Design: Responsive layouts with touch-friendly controls
Lessons Learned: The Challenges That Made It Better
Challenge #1: tRPC Query Refetch Patterns
The Problem: I tried to use `reposQuery.refetch()` on a tRPC query that had `enabled: false`. The button click did absolutely nothing.
The Learning: In tRPC with React Query v5, calling `refetch()` on disabled queries is unreliable. The solution was elegantly simple:
```typescript
// Instead of this:
const reposQuery = trpc.github.repos.useQuery(input, { enabled: false });
// Then trying: reposQuery.refetch()

// Do this:
const [loadRepos, setLoadRepos] = useState(false);
const reposQuery = trpc.github.repos.useQuery(input, { enabled: loadRepos });
// Then: setLoadRepos(true)
```
Key Takeaway: State-driven queries are more predictable than imperative refetch calls.
Challenge #2: Real-Time Progress in Batch Operations
The Problem: I initially built a server-side batch generation endpoint that processed multiple files sequentially. The client would show "0/10 generating..." and never update until all were complete.
The Learning: For operations where user feedback matters, client-side orchestration often works better:
```typescript
// Instead of server-side batch processing:
for (const entry of selectedEntries) {
  await generateSingle.mutateAsync({
    projectId,
    memoryId: entry.id,
  });
  setProgress((prev) => prev + 1); // Live updates!
}
```
Key Takeaway: Consider where the user experience is best served—sometimes client-side coordination trumps server-side efficiency.
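The same pattern can be sketched as a standalone function, with the tRPC mutation abstracted behind a callback so the control flow is easy to test (all names here are hypothetical). One added wrinkle worth considering: a single failed generation shouldn't abort the whole batch:

```typescript
// Sequential client-side orchestration with per-item progress and
// error capture. `generate` stands in for the real mutateAsync call.
async function generateAll(
  ids: string[],
  generate: (id: string) => Promise<void>,
  onProgress: (done: number, failed: number) => void
): Promise<{ done: number; failed: number }> {
  let done = 0;
  let failed = 0;
  for (const id of ids) {
    try {
      await generate(id); // one request at a time keeps progress accurate
      done++;
    } catch {
      failed++; // a single failure doesn't abort the rest of the batch
    }
    onProgress(done, failed);
  }
  return { done, failed };
}
```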
Challenge #3: Next.js Cache Corruption
The Problem: After making Prisma schema changes, I tried to clear the .next cache while the dev server was still running. This caused mysterious clientModules errors that persisted across restarts.
The Solution: The order matters:
- Stop the dev server
- Run `rm -rf .next`
- Run `prisma generate` (after schema changes)
- Restart the dev server
Key Takeaway: Next.js cache management requires respect for the development lifecycle — never delete `.next` while the dev server is still running.
The Results
After implementing all features and resolving the challenges, the system successfully:
- ✅ Imported 10 memory files from a real repository (`mrwind-up-bird/mini-chat-rag`)
- ✅ Generated 9+ publication-ready blog posts
- ✅ Provided a smooth mobile and desktop experience
- ✅ Handled errors gracefully with proper user feedback
What's Next
The foundation is solid, but there are exciting opportunities ahead:
- Enhanced Error Handling: Toast notifications for better user feedback
- Performance Optimization: Pagination for projects with many posts
- Content Management: In-place editing and regeneration workflows
- Template System: Customizable blog post templates for different content types
Final Thoughts
Building this pipeline reinforced a key principle: the best technical solutions often emerge from embracing constraints rather than fighting them. Whether it's working with tRPC's query patterns, respecting Next.js's cache lifecycle, or choosing client-side coordination for better UX, the "limitations" often guide us toward more robust architectures.
The ability to transform raw development memories into polished blog content opens up fascinating possibilities for developer documentation, project storytelling, and knowledge sharing. Sometimes the most powerful tools are the ones that help us better communicate the work we're already doing.
This blog post was itself generated using the pipeline described above, demonstrating the system in action. The irony is not lost on me.