The MCP Pattern: SQLite as the AI-Queryable Cache

March 20, 2026

I keep building the same thing.

Not the same product — the products are different. One indexes a Hugo blog. One indexes AI conversations. One consolidates medical records from three hospitals. One catalogs a hundred git repositories. But underneath, they all have the same skeleton. After the fifth time, I think the skeleton deserves a name.

The pattern

Domain files (ground truth)
     index
SQLite database (read-only cache, FTS5)
     expose
MCP server (tools + resources  AI assistant)

That’s it. Three layers. The domain files are always canonical — the database is a disposable cache you can rebuild from them at any time. SQLite gives you structured queries, full-text search, and JSON extraction over data that was previously trapped in flat files. MCP exposes it to an AI assistant that can write SQL, retrieve content, and (in some cases) create new content.

Here’s the inventory:

ProjectDomainGround TruthWhat the MCP Exposes
hugo-memexBlog contentMarkdown files with YAML front matter951 pages, FTS5 search, taxonomy queries, JSON front matter extraction
memexAI conversationsChatGPT/Claude/Gemini exportsConversation trees, FTS5 message search, tags, enrichments
chartfoldMedical recordsEpic, MEDITECH, athenahealth exportsLabs, meds, encounters, imaging, pathology, cross-source reconciliation
arkivPersonal archivesJSONL files from various sourcesUnified SQL over heterogeneous personal data
repoindexGit repositoriesLocal git repos + GitHub/PyPI/CRAN metadataRepository catalog with activity tracking, publication status

Five projects. Five completely different domains. One architecture.

Why SQLite

SQLite is the most deployed database in history. It’s on every phone, every browser, every Python installation. But that’s not why I use it.

I use it because it solves three problems at once:

Structured queries over unstructured data. Hugo front matter is YAML trapped inside markdown files. Medical records are scattered across three incompatible EHR export formats. AI conversations are JSON trees with branching paths. SQLite turns all of these into tables you can JOIN, GROUP BY, and aggregate. json_extract() handles the long tail of fields that don’t fit a fixed schema.

Full-text search. FTS5 with porter stemming and unicode61 tokenization gives you relevance-ranked search across any text corpus. No Elasticsearch, no external service, no running daemon. Just a virtual table that lives in the same database file.

Read More

Chartfold: Owning Your Medical Records

February 24, 2026

I have cancer. My oncologist is at one hospital system (Siteman/BJC), my primary care doctor at another, and my earlier treatment history lives at a third (Anderson, where my first oncologist practiced). Patient portals are fine for browsing, but they don’t answer questions. They show you your data one lab result at a time, one note at a time, one visit at a time.

I wanted to run queries against my medical records. Correlate lab trends with treatment changes. Generate structured question lists before oncology visits. Ask “what changed since my last appointment” and get a real answer. That means getting the data out of the portal and into something programmable.

Chartfold loads EHR exports into SQLite and exposes them to Claude via MCP.

Read More

Code Without Purpose

February 24, 2026

Time is finite in ways I can’t ignore. That changes which questions about code feel important.

I read a post arguing that the most valuable programming skill in 2026 is deleting code. The thesis: AI generates code faster than anyone can review it, so the real value is in curation and subtraction. Code is a liability, not an asset.

I agree with the observation. I disagree with the prescription.

Read More