Ebk | metafunctor


Long Echo Comes Alive: From Philosophy to Orchestration

January 20, 2026

A year ago, I wrote about Long Echo as a philosophy for preserving AI conversations across decades. The key insight was graceful degradation: design archives that keep working, at reduced fidelity, even as the technology around them disappears.

That philosophy has become a tool.

From Philosophy to Tool

The original Long Echo was intentionally not code. It was a set of principles documented in CTK’s repository. The hard problems of conversation parsing, storage, and search were already solved by toolkits like CTK, BTK, and EBK.

What was missing was the unification layer. Each toolkit exports its own ECHO-compliant archive, but combining them into a single browsable experience required manual work. That’s what longecho now handles.

What longecho Does Now

longecho is a CLI tool with five capabilities:

longecho check ~/my-data/       # Validate ECHO compliance
longecho discover ~/            # Find ECHO sources
longecho search ~/ "query"      # Search README descriptions
longecho build ~/my-archive/    # Generate static site
longecho serve ~/my-archive/    # Preview locally via HTTP

The check, discover, and search commands existed in the original specification. What’s new are build and serve, which together form the orchestration layer.

Building a Unified Site

The build command takes a hierarchical archive and generates a static site:

longecho build ~/my-archive/

This produces a site/ directory with:

An index page linking to all sub-archives
Navigation between sources
Automatic linking to existing sub-site builds

If a sub-archive already has its own site/ directory (like CTK’s exports), longecho links to it. Use --bundle to copy everything into a portable, self-contained site.
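The linking behavior described above can be sketched roughly as follows. This is an illustration only, not longecho's implementation: the function name `build_index`, the bare HTML list, and the rule that a sub-archive is any directory containing a README are all assumptions.

```python
from html import escape
from pathlib import Path


def build_index(archive: Path) -> Path:
    """Write site/index.html linking to each sub-archive (hypothetical layout)."""
    site = archive / "site"
    site.mkdir(exist_ok=True)
    items = []
    subdirs = sorted(p for p in archive.iterdir() if p.is_dir() and p.name != "site")
    for sub in subdirs:
        readmes = sorted(sub.glob("README*"))
        # Assumption: a sub-archive is a directory that contains a README.
        if not readmes:
            continue
        # Link to the sub-archive's own site/ build when one exists,
        # otherwise fall back to its README.
        if (sub / "site" / "index.html").exists():
            target = f"../{sub.name}/site/index.html"
        else:
            target = f"../{sub.name}/{readmes[0].name}"
        items.append(f'<li><a href="{escape(target)}">{escape(sub.name)}</a></li>')
    index = site / "index.html"
    index.write_text("<ul>\n" + "\n".join(items) + "\n</ul>\n", encoding="utf-8")
    return index
```

A bundling mode would copy each linked target into site/ instead of pointing at it with a relative link; that step is left out of the sketch.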

Live Preview

The serve command provides local HTTP preview:

longecho serve ~/my-archive/ --port 8000

It builds the site if needed, then serves it for browser viewing.

The Manifest

ECHO compliance requires only a README. But for machine-readable metadata, longecho supports an optional manifest:

version: "1.0"
name: "Alex's Data Archive"
description: "Personal data archive"
sources:
  - path: "conversations/"
    order: 1
  - path: "bookmarks/"
    order: 2
  - path: "ebooks/"
    order: 3

The manifest enables:

Explicit ordering of sources in generated sites
Selective inclusion via the browsable flag
Override names for cleaner presentation
Icon hints for UI presentation

Without a manifest, longecho auto-discovers sub-archives by looking for directories with README files. The manifest provides explicit control when you need it.
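A minimal sketch of that fallback logic, assuming the manifest has already been parsed into a dictionary shaped like the YAML above. The function name `discover_sources` is hypothetical, and the real tool may apply different rules (for example, recursing into nested directories).

```python
from pathlib import Path
from typing import List, Optional


def discover_sources(root: Path, manifest: Optional[dict] = None) -> List[Path]:
    """Return sub-archive directories, ordered by the manifest when one is given."""
    if manifest and manifest.get("sources"):
        # Explicit control: follow the manifest's `order` field.
        entries = sorted(manifest["sources"], key=lambda s: s.get("order", 0))
        return [root / entry["path"] for entry in entries]
    # Fallback: any immediate subdirectory holding a README counts as a source.
    return sorted(
        d for d in root.iterdir() if d.is_dir() and any(d.glob("README*"))
    )
```

Without a manifest the result is simply alphabetical, which is why the `order` field matters once presentation does.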

Long Echo: Photos and Mail

January 19, 2026

The Long Echo toolkit now covers conversations, bookmarks, and ebooks. But two of the most emotionally significant categories of personal data remain uncovered: photos and mail.

Both share a troubling pattern: they are scattered across devices and cloud services, organized by date rather than meaning, and vulnerable to platform disappearance. They deserve better.

The Expanding Ecosystem

Tool	Domain	Status
ctk	AI Conversations	stable
btk	Bookmarks & Media	stable
ebk	eBooks	stable
repoindex	Git Repositories	stable
ptk	Photos	incubating
mtk	Mail	incubating

The orchestration layer, longecho, ties these together into a unified personal archive.

PTK: Photo Toolkit

Photos are the most emotionally valuable digital artifacts most people have. They’re also among the worst-managed.

The Problem

Your photo library is probably:

Scattered: Phone, old phones, cloud services, camera imports, messaging app saves
Organized by date: Not by who’s in them, where they were taken, or what they mean
Cloud-dependent: Google Photos, iCloud, Amazon Photos. What happens when you switch?
Unsearchable by content: “Find photos of mom at the beach” isn’t possible
Missing context: Only you know why that blurry photo matters

The Vision

ptk provides:

Unified import from any source:

ptk import ~/Pictures/
ptk import ~/phone-backup/DCIM/
ptk import google-takeout.zip --source google-photos
ptk import icloud-export/ --source icloud

Intelligent organization by multiple dimensions:

ptk shell
ptk:/$ cd /people/mom
ptk:/people/mom$ ls
2019/  2020/  2021/  2022/  2023/  2024/

ptk:/$ cd /locations/beach
ptk:/$ cd /events/christmas-2023
ptk:/$ cd /years/2020/months/march

AI-powered features:

# Face detection and clustering
ptk faces detect --all
ptk faces cluster
ptk faces label cluster-7 "Mom"
ptk faces find "Mom"

# Scene captioning
ptk caption --all --model ollama/llava
ptk search "sunset over water"

# Semantic search
ptk ask "photos from our trip to Colorado"

Preservation guarantees:

# Verify nothing is corrupted
ptk verify --checksums

# Export to durable formats
ptk export ~/archive/photos/ --format longecho
ptk export photos.html --format html-gallery

# Original files always preserved
ptk originals list
ptk originals verify

Why SQLite?

Like the other Long Echo tools, ptk uses SQLite for metadata:

# Works even if ptk disappears
sqlite3 photos.db "
  SELECT path, caption, taken_at
  FROM photos
  WHERE caption LIKE '%birthday%'
  ORDER BY taken_at
"

The database stores metadata, face embeddings, captions, and organization. The actual photo files either stay in place or are copied to a managed library; the choice is yours.
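The same query also works from Python's standard library, with no ptk code involved. The sketch below assumes a `photos` table with only the three columns used in the query above and fills an in-memory database with made-up rows; the real schema is likely richer.

```python
import sqlite3

# Assumption: a `photos` table with the columns from the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (path TEXT, caption TEXT, taken_at TEXT)")
conn.executemany(
    "INSERT INTO photos VALUES (?, ?, ?)",
    [
        # Illustrative sample rows, not real data.
        ("2021/cake.jpg", "birthday cake with candles", "2021-06-02"),
        ("2019/party.jpg", "kids at a birthday party", "2019-06-02"),
        ("2020/beach.jpg", "sunset over water", "2020-08-15"),
    ],
)


def find_by_caption(db: sqlite3.Connection, word: str) -> list:
    """Return (path, caption, taken_at) rows whose caption contains `word`."""
    return db.execute(
        "SELECT path, caption, taken_at FROM photos "
        "WHERE caption LIKE ? ORDER BY taken_at",
        (f"%{word}%",),
    ).fetchall()


rows = find_by_caption(conn, "birthday")
for path, caption, taken_at in rows:
    print(taken_at, path, caption)
```

Pointing `sqlite3.connect` at a real photos.db instead of `:memory:` is the only change needed to run it against an actual library with that schema.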

The Long Echo Toolkit

December 16, 2025

Earlier this year I wrote about Long Echo, a philosophy for preserving AI conversations in ways that stay accessible across decades. The core idea was graceful degradation: systems that fail progressively, not catastrophically.

Since then I’ve built out three tools that apply this thinking to personal digital content more broadly, not just conversations: bookmarks, books, and AI chats. Together they form a system for managing the stuff you actually think with.

The Toolkit

Tool	Domain	Install
CTK	AI Conversations	`pip install conversation-tk`
BTK	Bookmarks & Media	`pip install bookmark-tk`
EBK	eBooks & Documents	`pip install ebk`

All three share a common architecture, but each is specialized for its domain.

Shared Architecture

SQLite-First Storage

Every tool uses local SQLite databases you own. No cloud dependency. The data stays queryable with standard tools even if the CLI disappears tomorrow:

# Works even if the tools are gone
sqlite3 conversations.db "SELECT title FROM conversations WHERE title LIKE '%python%'"
sqlite3 bookmarks.db "SELECT url, title FROM bookmarks WHERE stars = 1"
sqlite3 library.db "SELECT title, author FROM books WHERE favorite = 1"

This is the whole point. The database is the artifact, not the tool.

Interactive Shells with Virtual Filesystems

Navigate your data like a Unix filesystem:

$ btk shell
btk:/$ cd tags/programming/python
btk:/tags/programming/python$ ls
3298  4095  5124  (bookmark IDs)
btk:/tags/programming/python$ cat 4095/title
Advanced Python Techniques

$ ebk shell
ebk:/$ cd authors/Knuth
ebk:/authors/Knuth$ ls
The Art of Computer Programming Vol 1
The Art of Computer Programming Vol 2

Reading Queues

Track what you’re reading, watching, or working through:

# Bookmarks
btk queue add 42 --priority high
btk queue next
btk queue progress 42 --percent 75
btk queue estimate-times  # Auto-estimate from content length

# Books
ebk queue add "Gödel, Escher, Bach"
ebk queue next
ebk queue list

LLM Integration

All three integrate with LLMs for tagging, summarization, and search:

# Auto-tag using content analysis
btk content auto-tag --all
ctk auto-tag --model ollama/llama3
ebk enrich 42  # Enhance metadata with LLM

# Natural language queries
ctk say "summarize my conversations about Rust"
btk ask "find articles about distributed systems"
ebk similar "Gödel, Escher, Bach"  # Semantic similarity

Network Analysis

Find relationships in your data:

# CTK: Conversation networks
ctk net embeddings --all
ctk net similar 42
ctk net clusters
ctk net central  # Most connected conversations
ctk net outliers  # Isolated conversations

# BTK: Bookmark graphs
btk graph build
btk graph analyze

Web Servers

Browse your archives in a web UI:

Everything is a File: Virtual Filesystems for CLI Data Tools

October 20, 2025

I had a bookmark manager. Then an ebook library manager. Then a chat history manager. Each started with the standard CRUD CLI:

btk add https://example.com --tags python,tutorial
btk list --tag python
btk search "async"
btk delete 1234

ebk import book.pdf --author "Knuth"
ebk list --author Knuth
ebk search "algorithms"

This works fine until you have 10,000+ bookmarks organized with hierarchical tags like programming/python/async, research/ml/transformers, work/clients/acme. Your ebook library has similar structure. Your exported chat conversations from Claude, ChatGPT, and Copilot are piling up.

Traditional CRUD commands become unwieldy:

btk list --tag programming/python/async/io --format json | jq '.[].title'
ebk list --category "Computer Science/Algorithms/Graph Theory" --limit 50
ctk search "machine learning" --source ChatGPT --date-from 2024-01-01

Each command requires precise arguments. Each tool has different flag conventions. You can’t navigate your data. You can only query it. And queries require knowing exactly what you’re looking for.

The insight: everything is a file

When I have thousands of source files organized in directories, I don’t run:

list-files --path /src/components/auth --extension .tsx

I run:

cd src/components/auth
ls *.tsx

The difference matters. With a filesystem, I can navigate incrementally (cd from general to specific), explore (ls to see what’s there), compose (cat file | grep pattern | wc -l), and use familiar tools (find, grep, xargs, pipes, redirection).

What if my bookmarks, ebooks, and chat histories were filesystems?

The pattern

Over the past year, I built six Python tools that all follow the same architecture:

Tool	Domain	VFS Root Structure
btk	Bookmarks	`/bookmarks/`, `/tags/`, `/recent/`, `/domains/`, `/unread/`, `/popular/`
ebk	Ebook library	`/books/`, `/authors/`, `/series/`, `/subjects/`, `/recent/`, `/unread/`
ctk	Chat conversations	`/conversations/`, `/sources/`, `/topics/`, `/starred/`, `/recent/`
ghops	Git repositories	`/repos/`, `/languages/`, `/topics/`, `/stars/`, `/recent/`
infinigram	N-gram models	`/datasets/`, `/models/`, `/corpora/`
AlgoTree	Tree structures	`/nodes/`, `/paths/`, `/subtrees/`

Each tool provides:

A stateless CLI for scripting: btk bookmark add URL, ebk import book.pdf
An interactive shell with a virtual filesystem: btk shell, ebk shell, ctk chat
POSIX-like commands: cd, ls, pwd, cat, mv, cp, rm, find, grep
Unix pipeline support: most commands output JSONL by default for piping

The interesting part is the shell.
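To make the idea concrete, here is a toy sketch of how hierarchical tags can be presented as directories. It is not btk's implementation: the class name `TagVFS` is hypothetical, the root of this toy filesystem is the tag tree itself (there is no separate /tags/ level), and bookmarks are reduced to an ID with a list of tags.

```python
from typing import Dict, List


class TagVFS:
    """Toy virtual filesystem whose directories are hierarchical tags."""

    def __init__(self, bookmarks: Dict[int, List[str]]) -> None:
        # Maps a bookmark ID to its tags, e.g. {4095: ["programming/python"]}.
        self.bookmarks = bookmarks
        self.cwd: List[str] = []  # current directory as path components

    def pwd(self) -> str:
        return "/" + "/".join(self.cwd)

    def _exists(self, parts: List[str]) -> bool:
        # A directory exists if some tag equals it or lies beneath it.
        target = "/".join(parts)
        return not parts or any(
            tag == target or tag.startswith(target + "/")
            for tags in self.bookmarks.values()
            for tag in tags
        )

    def cd(self, path: str) -> None:
        parts = [] if path.startswith("/") else list(self.cwd)
        for part in path.split("/"):
            if part == "..":
                if parts:
                    parts.pop()
            elif part not in ("", "."):
                parts.append(part)
        if not self._exists(parts):
            raise FileNotFoundError(path)
        self.cwd = parts

    def ls(self) -> List[str]:
        prefix = "/".join(self.cwd)
        subdirs, items = set(), set()
        for bookmark_id, tags in self.bookmarks.items():
            for tag in tags:
                if tag == prefix:
                    items.add(str(bookmark_id))  # tagged exactly here
                elif not prefix:
                    subdirs.add(tag.split("/")[0] + "/")
                elif tag.startswith(prefix + "/"):
                    subdirs.add(tag[len(prefix) + 1:].split("/")[0] + "/")
        return sorted(subdirs) + sorted(items)


vfs = TagVFS({
    3298: ["programming/python"],
    4095: ["programming/python", "research/ml"],
    5124: ["programming"],
})
vfs.cd("programming/python")
print(vfs.pwd(), vfs.ls())
```

Nothing is stored as a directory; `ls` computes the listing from the tags on each call. That is why a bookmark with two tags can appear in two places at once.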

Navigating 10,000 bookmarks

[Terminal recording captured with asciinema. The entire recording is 78KB of text.]

EBK: Ebook Toolkit

October 13, 2025

Your books represent decades of accumulated knowledge. Technical references, formative texts, research that shaped your thinking. They deserve better than scattered files on a hard drive with inconsistent metadata and no way to search across them.

EBK treats your ebook library as a queryable, searchable knowledge base. It’s part of the Long Echo toolkit: tools for preserving your digital intellectual life in formats you control.

The Core Abstraction

At its heart, EBK is a SQLite database with a normalized schema, accessed through SQLAlchemy. Everything else (CLI, AI features, exports) is layered on top. This means your library metadata is always queryable with standard tools, even if EBK itself disappears.

# Works even without EBK installed
sqlite3 library.db "SELECT title, author FROM books WHERE favorite = 1"

What It Does

# Initialize and import
ebk db-init ~/my-library
ebk db-import ~/Documents/book.pdf ~/my-library
ebk db-import-calibre ~/Calibre/Library ~/my-library

# Search with FTS5 full-text search
ebk db-search "quantum computing" ~/my-library

# Field-specific queries
ebk db-search "title:Python author:Knuth tag:programming" ~/my-library

Behind a simple import, EBK automatically extracts text from PDFs (PyMuPDF with pypdf fallback) and EPUBs, generates text chunks for semantic search, computes SHA256 hashes for deduplication, extracts covers, and indexes everything in FTS5.
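The chunking step can be pictured with a short sketch. The word-based splitting, the default sizes, and the function name `chunk_text` are assumptions for illustration; EBK's actual chunking strategy is not described here and may differ.

```python
from typing import List


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a chunk has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of some duplicated text in the index.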

Deduplication

Same file (same hash) gets skipped. Same book in a different format gets added as an additional format. Different book gets imported as new. Books are stored in hash-prefixed directories for scalability.
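A sketch of the first rule (identical files are skipped) together with a hash-prefixed layout. The two-character prefix, the files/ directory, and the function names are hypothetical choices rather than EBK's actual layout, and the "same book, different format" case is omitted because it needs metadata matching rather than hashing.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB blocks so large ebooks need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def storage_path(library: Path, file_hash: str, suffix: str) -> Path:
    """Hypothetical layout: library/files/<first two hex chars>/<hash><suffix>."""
    return library / "files" / file_hash[:2] / f"{file_hash}{suffix}"


def import_file(library: Path, src: Path, seen: set) -> str:
    """Return 'skipped' for an already-known hash, else copy the file and return 'added'."""
    file_hash = sha256_of(src)
    if file_hash in seen:
        return "skipped"
    dest = storage_path(library, file_hash, src.suffix)
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(src.read_bytes())
    seen.add(file_hash)
    return "added"
```

Prefixing by hash spreads files across up to 256 subdirectories, so no single directory grows to hold the whole library.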

AI Enrichment

EBK can use LLMs to auto-generate tags, categories, and descriptions for books with sparse metadata:

ebk enrich 42  # Enhance metadata with LLM

Semantic search finds books by meaning, not just keywords:

results = lib.semantic_search(
    "explaining complex mathematical concepts simply",
    threshold=0.7
)

It uses vector embeddings when available and falls back to TF-IDF for offline use.
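For a sense of what a TF-IDF fallback involves, here is a self-contained sketch using only the standard library. It is not EBK's code: the function names, the smoothed IDF formula, and the default threshold are assumptions, and unlike embeddings it matches weighted keywords rather than meaning.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, docs: Dict[str, str], threshold: float = 0.1) -> List[Tuple[str, float]]:
    """Rank documents by TF-IDF cosine similarity to the query."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n = len(docs)
    # Smoothed inverse document frequency (an assumed, common variant).
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    q_vec = {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}
    scored = []
    for name, toks in tokens.items():
        d_vec = {t: c * idf[t] for t, c in Counter(toks).items()}
        score = cosine(q_vec, d_vec)
        if score >= threshold:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because it needs no model download or network access, a scorer like this is a reasonable floor for search quality when embeddings are unavailable.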

Knowledge Graphs

Using NetworkX, EBK can extract concept relationships across your library:

graph = lib.build_knowledge_graph(extract_entities=True)
graph.visualize(output="library_knowledge.html")

This reveals connections you didn’t know existed. “These books about functional programming also discuss category theory.”
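The kind of connection described above can come from something as simple as counting which concepts appear in the same book. The sketch below uses plain dictionaries as a stand-in for NetworkX so that it runs without dependencies; the function name, the input shape, and the placeholder book names are assumptions, and real entity extraction is out of scope.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List


def cooccurrence_edges(books: Dict[str, List[str]]) -> Counter:
    """Count how often two concepts appear in the same book."""
    edges: Counter = Counter()
    for concepts in books.values():
        # Sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(concepts)), 2):
            edges[(a, b)] += 1
    return edges


# Placeholder library: book names and concepts are illustrative only.
library = {
    "book-a": ["functional programming", "category theory"],
    "book-b": ["category theory", "functional programming", "types"],
    "book-c": ["types"],
}
for (a, b), weight in cooccurrence_edges(library).most_common():
    print(f"{a} <-> {b}: {weight} book(s)")
```

Each counted pair corresponds to a weighted edge, so the result could be loaded into a graph library for clustering or visualization.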

Fluent Python API

from ebk import Library

lib = Library.open("~/ebooks")
results = (lib.query()
    .where("language", "en")
    .where("date", "2020", ">=")
    .where("subjects", "Python", "contains")
    .order_by("title")
    .take(10)
    .execute())
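A chainable interface like this is usually built by having each method record its argument and return `self`, deferring all work to `execute()`. The sketch below shows that pattern over a list of dictionaries. It is a guess at the shape, not EBK's implementation: the `Query` class and its operator table are hypothetical, and the real library presumably translates the chain into SQL.

```python
import operator
from typing import Any, Callable, Dict, List, Optional

# Operators accepted as the third argument of where(); an assumed subset.
_OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "==": operator.eq,
    ">=": operator.ge,
    "contains": lambda field_value, wanted: wanted in field_value,
}


class Query:
    """Hypothetical chainable query over in-memory rows."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows
        self._filters: List[tuple] = []
        self._order: Optional[str] = None
        self._limit: Optional[int] = None

    def where(self, field: str, value: Any, op: str = "==") -> "Query":
        self._filters.append((field, value, _OPS[op]))
        return self  # returning self is what makes the calls chainable

    def order_by(self, field: str) -> "Query":
        self._order = field
        return self

    def take(self, n: int) -> "Query":
        self._limit = n
        return self

    def execute(self) -> List[dict]:
        rows = [
            row for row in self._rows
            if all(test(row[field], value) for field, value, test in self._filters)
        ]
        if self._order is not None:
            rows.sort(key=lambda row: row[self._order])
        return rows if self._limit is None else rows[: self._limit]
```

Since nothing runs until `execute()`, the order in which filters, sorting, and the limit are chained does not change the result.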

Export

Multiple formats for different needs:

ebk export hugo ~/library ~/hugo-site --organize-by subject --include-covers
ebk export-dag ~/library ~/output  # Navigable symlink directory structure

The Hugo export creates a browsable website. The DAG export creates a tag-based directory structure where books appear via symlinks under multiple categories. Both work without EBK installed.
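The symlink approach can be sketched in a few lines. The function name and the mapping from book paths to tags are assumptions; the real export presumably reads both from the library database and handles name collisions.

```python
import os
from pathlib import Path
from typing import Dict, List


def export_symlink_tree(books: Dict[Path, List[str]], out: Path) -> None:
    """Create out/<tag>/<filename> symlinks, one per (book, tag) pair."""
    for book, tags in books.items():
        for tag in tags:
            # A hierarchical tag such as "cs/algorithms" becomes nested directories.
            tag_dir = out / tag
            tag_dir.mkdir(parents=True, exist_ok=True)
            link = tag_dir / book.name
            # Skip links that already exist so the export can be re-run safely.
            if not link.is_symlink() and not link.exists():
                os.symlink(book.resolve(), link)
```

Because each entry is only a link, a book filed under five tags still occupies disk space once, and any file manager can browse the result. On Windows, creating symlinks may require elevated privileges.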