arkiv

Universal personal data format. JSONL in, SQL out, SQL back to JSONL. One format, one database, one query interface.

Started 2026 Python

Universal personal data format. JSONL in, SQL out, MCP to LLMs.

The Format

Every record is a JSON object on its own line. All fields are optional.

{"mimetype": "text/plain", "content": "I think the key insight is...", "uri": "https://chatgpt.com/c/abc", "timestamp": "2023-05-14T10:30:00Z", "metadata": {"role": "user", "conversation_id": "abc"}}
{"mimetype": "audio/wav", "uri": "file://media/podcast.wav", "timestamp": "2024-01-15", "metadata": {"transcript": "Welcome to...", "duration": 45.2}}
{"mimetype": "image/jpeg", "uri": "file://media/photo.jpg", "metadata": {"caption": "My talk at MIT"}}

The Stack

JSONL files (canonical, portable, human-readable)
    ↓ arkiv import
SQLite database (queryable, efficient, standard SQL)
    ↓ arkiv mcp
MCP server (3 tools → any LLM)

Quick Start

pip install arkiv

# Import JSONL to SQLite
arkiv import conversations.jsonl --db archive.db

# Query
arkiv query archive.db "SELECT content FROM records WHERE metadata->>'role' = 'user' LIMIT 5"

# Serve to LLMs via MCP
arkiv mcp archive.db
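
To make the import and query steps more concrete, here is a rough stdlib-only Python sketch of loading JSONL into SQLite and running the same kind of filter. The records table layout and the import_jsonl helper are assumptions for illustration; arkiv's actual schema is defined in SPEC.md and may differ. The sketch uses json_extract instead of ->> because the ->> operator requires SQLite 3.38 or newer.

```python
import json
import sqlite3

def import_jsonl(conn, lines):
    """Load JSONL lines into a hypothetical `records` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "mimetype TEXT, content TEXT, uri TEXT, timestamp TEXT, metadata TEXT)"
    )
    for line in lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        meta = rec.get("metadata")
        conn.execute(
            "INSERT INTO records VALUES (?, ?, ?, ?, ?)",
            (
                rec.get("mimetype"),
                rec.get("content"),
                rec.get("uri"),
                rec.get("timestamp"),
                # Store metadata as JSON text so SQL JSON functions can read it.
                json.dumps(meta) if meta is not None else None,
            ),
        )
    conn.commit()

conn = sqlite3.connect(":memory:")
import_jsonl(conn, [
    '{"content": "I think the key insight is...", "metadata": {"role": "user"}}',
    '{"content": "Sure, here is why.", "metadata": {"role": "assistant"}}',
])
rows = conn.execute(
    "SELECT content FROM records "
    "WHERE json_extract(metadata, '$.role') = 'user' LIMIT 5"
).fetchall()
print(rows)
```

Keeping metadata as a single JSON column is one plausible way to support records whose keys vary by source, which is why the filter goes through a JSON function rather than a dedicated column.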

MCP Tools

Tool                     Description
get_manifest()           What collections exist, their descriptions and schemas
get_schema(collection?)  What metadata keys can be queried
sql_query(query)         Run read-only SQL
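
For intuition about what schema discovery and read-only SQL can look like against a SQLite file, here is a hedged Python sketch using only the standard library. It is not arkiv's MCP implementation: the open_read_only and metadata_keys helpers and the two-column records table are hypothetical. It relies on SQLite's URI mode=ro option to reject writes and on json_each to enumerate metadata keys.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

def open_read_only(path):
    """Open an existing SQLite file so that any write attempt fails."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)

def metadata_keys(conn):
    """List distinct top-level metadata keys (table layout is an assumption)."""
    rows = conn.execute(
        "SELECT DISTINCT j.key FROM records, json_each(records.metadata) AS j "
        "ORDER BY j.key"
    )
    return [key for (key,) in rows]

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "archive.db"

    # Build a tiny database to query.
    rw = sqlite3.connect(db_path)
    rw.execute("CREATE TABLE records (content TEXT, metadata TEXT)")
    rw.execute(
        "INSERT INTO records VALUES (?, ?)",
        ("Welcome to...", json.dumps({"duration": 45.2, "role": "user"})),
    )
    rw.commit()
    rw.close()

    ro = open_read_only(db_path)
    keys = metadata_keys(ro)
    try:
        ro.execute("DELETE FROM records")
        write_blocked = False
    except sqlite3.OperationalError:
        # SQLite refuses writes on a connection opened with mode=ro.
        write_blocked = True
    finally:
        ro.close()

print(keys, write_blocked)
```

Enforcing read-only access at the connection level, rather than by inspecting the SQL text, means a model-supplied query cannot modify the archive even if it contains a write statement.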

Why

  • Your data lives in silos (ChatGPT, email, bookmarks, photos, voice memos)
  • Source toolkits (memex, mtk, btk, ptk, ebk) export it as JSONL
  • arkiv gives you one format, one database, one query interface
  • Any LLM can query it via MCP
  • JSONL is human-readable and durable. SQLite is the most deployed database in history.

Spec

See SPEC.md for the full technical specification.
