I had a bookmark manager. Then an ebook library manager. Then a chat history manager. Each started with the standard CRUD CLI:
btk add https://example.com --tags python,tutorial
btk list --tag python
btk search "async"
btk delete 1234
ebk import book.pdf --author "Knuth"
ebk list --author Knuth
ebk search "algorithms"
This works fine until you have 10,000+ bookmarks organized with hierarchical tags like programming/python/async, research/ml/transformers, work/clients/acme. Your ebook library has similar structure. Your exported chat conversations from Claude, ChatGPT, and Copilot are piling up.
Traditional CRUD commands become unwieldy:
btk list --tag programming/python/async/io --format json | jq '.[].title'
ebk list --category "Computer Science/Algorithms/Graph Theory" --limit 50
ctk search "machine learning" --source ChatGPT --date-from 2024-01-01
Each command requires precise arguments. Each tool has different flag conventions. You can’t navigate your data. You can only query it. And queries require knowing exactly what you’re looking for.
The insight: everything is a file
When I have thousands of source files organized in directories, I don’t run:
list-files --path /src/components/auth --extension .tsx
I run:
cd src/components/auth
ls *.tsx
The difference matters. With a filesystem, I can navigate incrementally (cd from general to specific), explore (ls to see what’s there), compose (cat file | grep pattern | wc -l), and use familiar tools (find, grep, xargs, pipes, redirection).
What if my bookmarks, ebooks, and chat histories were filesystems?
The pattern
Over the past year, I built six Python tools that all follow the same architecture:
| Tool | Domain | VFS Root Structure |
|---|---|---|
| btk | Bookmarks | /bookmarks/, /tags/, /recent/, /domains/, /unread/, /popular/ |
| ebk | Ebook library | /books/, /authors/, /series/, /subjects/, /recent/, /unread/ |
| ctk | Chat conversations | /conversations/, /sources/, /topics/, /starred/, /recent/ |
| ghops | Git repositories | /repos/, /languages/, /topics/, /stars/, /recent/ |
| infinigram | N-gram models | /datasets/, /models/, /corpora/ |
| AlgoTree | Tree structures | /nodes/, /paths/, /subtrees/ |
Each tool provides:
- A stateless CLI for scripting: btk bookmark add URL, ebk import book.pdf
- An interactive shell with a virtual filesystem: btk shell, ebk shell, ctk chat
- POSIX-like commands: cd, ls, pwd, cat, mv, cp, rm, find, grep
- Unix pipeline support: most commands output JSONL by default for piping
The interesting part is the shell.
Navigating 10,000 bookmarks
Live recording captured with asciinema. You can pause, copy text, and replay. The entire recording is 78KB of text.
The old way
$ btk search "python async" --tag programming --limit 10
# Returns JSON blob... now what?
$ btk list --tag "programming/python/async"
# Hope I remembered the exact tag path
$ btk bookmark get 4095 --format json | jq '.tags'
# One bookmark at a time, verbose
The VFS way
$ btk shell
__ __ __
/ /_ / /_/ /__
/ __ \/ __/ //_/
/ /_/ / /_/ ,<
/_.___/\__/_/|_| v0.7.1
Bookmark Toolkit - Virtual Filesystem Shell
btk:/$ ls
bookmarks/ (10,247) All bookmarks
tags/ Tag hierarchy
recent/ Time-based navigation
domains/ Browse by domain
unread/ (2,431) Never visited
popular/ (100) Most visited
broken/ (14) Dead links
starred/ (156) Starred bookmarks
btk:/$ cd tags/programming/python
btk:/tags/programming/python$ ls
async/ (87)
web/ (124)
data/ (156)
ml/ (89)
testing/ (45)
btk:/tags/programming/python$ cd async
btk:/tags/programming/python/async$ ls | head -5
4095 5234 6012 6891 7234
btk:/tags/programming/python/async$ cat 4095/title
Real Python - Async IO in Python: A Complete Walkthrough
btk:/tags/programming/python/async$ cat 4095/url
https://realpython.com/async-io-python/
btk:/tags/programming/python/async$ star 4095
★ Starred bookmark #4095
btk:/tags/programming/python/async$ cd /recent/today/added
btk:/recent/today/added$ ls
8901 8902 8903 8904
btk:/recent/today/added$ tag 8901 8902 8903 todo
✓ Tagged 3 bookmarks
No flag memorization. Incremental exploration. Context-aware commands. It’s just directories and files.
Same pattern, different data
Ebooks (ebk)
ebk:/$ cd subjects/Computer\ Science/Algorithms
ebk:/subjects/Computer Science/Algorithms$ ls
Introduction to Algorithms.pdf
The Algorithm Design Manual.pdf
Algorithms (Sedgewick).pdf
ebk:/subjects/Computer Science/Algorithms$ cat "Introduction to Algorithms.pdf"/metadata
Title: Introduction to Algorithms
Authors: Cormen, Leiserson, Rivest, Stein
ISBN: 978-0262033848
Pages: 1312
Rating: 5/5
ebk:/subjects/Computer Science/Algorithms$ rate * 5
✓ Rated 3 books
Chat history (ctk)
ctk:/$ cd sources/ChatGPT
ctk:/sources/ChatGPT$ ls | wc -l
423
ctk:/sources/ChatGPT$ cd /topics/machine-learning
ctk:/topics/machine-learning$ ls
conv_a1b2c3 conv_d4e5f6 conv_g7h8i9
ctk:/topics/machine-learning$ show conv_a1b2c3
[Shows conversation tree with messages]
ctk:/topics/machine-learning$ star conv_a1b2c3
★ Starred conversation
ctk:/topics/machine-learning$ cd /starred
ctk:/starred$ export --format markdown > starred_ml_convos.md
✓ Exported 5 starred conversations to starred_ml_convos.md
Git repositories (ghops)
ghops:/$ cd languages/Python
ghops:/languages/Python$ ls
btk/ ebk/ ctk/ ghops/ infinigram/ AlgoTree/
ghops:/languages/Python$ cd btk
ghops:/languages/Python/btk$ status
Branch: master
Commits ahead: 0
Uncommitted changes: 0
Last commit: Release v0.7.1 (2 days ago)
What makes it work
After building six of these, the patterns are clear.
Stateless CLI + stateful shell
Every tool has both interfaces. The CLI is for automation and scripting. The shell is for humans.
# CLI (stateless, scriptable)
btk bookmark add https://example.com --tags python,tutorial
ebk import book.pdf --author "Knuth"
ctk export --format jsonl > training.jsonl
# Shell (stateful, exploratory)
btk shell
cd tags/python
star *
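One way to keep the two interfaces in sync is to have both call the same core function. The sketch below is a minimal illustration, not the actual btk code: `add_bookmark`, `run_cli`, `DemoShell`, and the in-memory `BOOKMARKS` store are all hypothetical names. The CLI gets everything from its arguments, while the shell fills in the tag from the current path.

```python
import argparse
import cmd

# Hypothetical in-memory store standing in for the real database layer.
BOOKMARKS = {}


def add_bookmark(url, tags):
    """Core operation shared by both front ends."""
    bookmark_id = len(BOOKMARKS) + 1
    BOOKMARKS[bookmark_id] = {"url": url, "tags": list(tags)}
    return bookmark_id


def run_cli(argv):
    """Stateless entry point: everything it needs arrives in argv."""
    parser = argparse.ArgumentParser(prog="demo")
    sub = parser.add_subparsers(dest="command", required=True)
    add = sub.add_parser("add")
    add.add_argument("url")
    add.add_argument("--tags", default="")
    args = parser.parse_args(argv)
    tags = [t for t in args.tags.split(",") if t]
    return add_bookmark(args.url, tags)


class DemoShell(cmd.Cmd):
    """Stateful entry point: remembers the current path between commands."""

    prompt = "demo:/$ "

    def __init__(self):
        super().__init__()
        self.current_path = "/"

    def do_cd(self, arg):
        # Simplified for the sketch: absolute paths only, no validation.
        self.current_path = arg or "/"
        self.prompt = f"demo:{self.current_path}$ "

    def do_add(self, arg):
        # Inside /tags/<tag>, the tag comes from the location, not a flag.
        parts = self.current_path.strip("/").split("/")
        in_tag_dir = parts[0] == "tags" and len(parts) > 1
        tags = ["/".join(parts[1:])] if in_tag_dir else []
        add_bookmark(arg, tags)


# The same operation through both interfaces.
first = run_cli(["add", "https://example.com", "--tags", "python,tutorial"])
shell = DemoShell()
shell.onecmd("cd /tags/python")
shell.onecmd("add https://docs.python.org")
```

The shell version never mentions the tag. It is read off the current path, which is the difference between the two interfaces in miniature.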
Dynamic virtual directories
A traditional directory only lists what was put into it. The VFS also exposes computed views:
btk:/$ ls
unread/ (2,431) # SELECT * WHERE visit_count = 0
popular/ (100) # SELECT * ORDER BY visit_count DESC LIMIT 100
broken/ (14) # SELECT * WHERE reachable = false
recent/today/added/ # SELECT * WHERE added >= TODAY
These “directories” don’t exist on disk. They’re queries. But they feel like directories, which is the point.
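A rough sketch of the idea, using the standard-library sqlite3 module: each virtual directory name maps to the query that produces its contents. The table layout and the `VIRTUAL_DIRS` and `list_virtual_dir` names are assumptions made for illustration; the real btk schema is not shown in this post.

```python
import sqlite3

# Hypothetical schema; the real tables may look quite different.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookmarks ("
    "id INTEGER PRIMARY KEY, title TEXT, visit_count INTEGER, reachable INTEGER)"
)
conn.executemany(
    "INSERT INTO bookmarks VALUES (?, ?, ?, ?)",
    [
        (1, "Async IO walkthrough", 12, 1),
        (2, "Unread article", 0, 1),
        (3, "Dead link", 3, 0),
    ],
)

# Each virtual directory name maps to the query that produces its contents.
VIRTUAL_DIRS = {
    "unread": "SELECT id FROM bookmarks WHERE visit_count = 0 ORDER BY id",
    "popular": "SELECT id FROM bookmarks ORDER BY visit_count DESC LIMIT 100",
    "broken": "SELECT id FROM bookmarks WHERE reachable = 0 ORDER BY id",
}


def list_virtual_dir(name):
    """Run the directory's query and return bookmark IDs as 'file names'."""
    rows = conn.execute(VIRTUAL_DIRS[name]).fetchall()
    return [str(row[0]) for row in rows]


unread = list_virtual_dir("unread")
popular = list_virtual_dir("popular")
```

Because the listing is recomputed on every `ls`, visiting bookmark 2 would make it drop out of `unread` without any explicit move.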
Context-aware commands
Commands understand where you are:
btk:/bookmarks/4095$ cat title
# Shows title of bookmark 4095
btk:/tags/python$ star *
# Stars all Python-tagged bookmarks
btk:/recent/today/added$ tag * review
# Tags today's additions
btk:/broken$ rm *
# Removes all broken bookmarks
The current path becomes implicit context. No need to repeat IDs or filters.
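One plausible way to implement this: expand the pattern against the IDs visible in the current directory rather than against the whole database. The helper names `expand_targets` and `star` below are hypothetical, and the real tools may resolve patterns differently.

```python
from fnmatch import fnmatchcase


def expand_targets(pattern, context_ids):
    """Expand a shell-style pattern against the IDs in the current directory."""
    names = [str(bookmark_id) for bookmark_id in context_ids]
    return [int(name) for name in names if fnmatchcase(name, pattern)]


def star(pattern, context_ids, starred):
    """Star every bookmark in the current context that matches the pattern."""
    matched = expand_targets(pattern, context_ids)
    starred.update(matched)
    return matched


# IDs that the current directory (say, /tags/python) resolved to.
context_ids = [4095, 5234, 6012]
starred = set()

only_40xx = star("40*", context_ids, starred)   # matches a single ID
everything = star("*", context_ids, starred)    # matches the whole context
```

The command never sees bookmarks outside the current directory, so `star *` is bounded by wherever you happen to be.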
Hierarchical tags as directories
This is the killer feature. Hierarchical tags map directly to filesystem paths.
# Tag with hierarchy
btk tag 4095 programming/python/async/io
# Navigate the hierarchy
btk:/$ cd tags/programming
btk:/tags/programming$ ls
python/ javascript/ rust/ go/
btk:/tags/programming$ cd python/async
btk:/tags/programming/python/async$ ls
io/ frameworks/ patterns/
# Bulk operations on a hierarchy
btk:/tags/programming/python$ star */advanced/*
Compare with flat tags: python, python-async, python-async-io, python-web, python-web-django. With a hierarchy, navigation and organization come for free.
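The mapping from tag paths to directories can be sketched in a few lines. This assumes tags are stored as slash-separated strings; `list_children` is a hypothetical helper that returns what `ls` would show as subdirectories under a given prefix.

```python
def list_children(tag_paths, prefix):
    """Return the immediate sub-tags under prefix, as ls would list them."""
    prefix_parts = [p for p in prefix.split("/") if p]
    depth = len(prefix_parts)
    children = set()
    for path in tag_paths:
        parts = path.split("/")
        # Compare whole path segments so 'prog' does not match 'programming'.
        if len(parts) > depth and parts[:depth] == prefix_parts:
            children.add(parts[depth])
    return sorted(children)


TAGS = [
    "programming/python/async/io",
    "programming/python/web",
    "programming/rust",
    "research/ml/transformers",
]

top_level = list_children(TAGS, "")
under_python = list_children(TAGS, "programming/python")
```

Intermediate directories such as `programming/python/async` never need to be created explicitly. They exist because some tag passes through them.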
JSONL by default
Most commands output newline-delimited JSON by default:
# Pipe to jq, grep, awk, anything
btk list | jq 'select(.stars == true)'
ebk status | grep "rating: 5"
ctk search "python" | jq '.id' | xargs ctk export --ids
# Pretty-print for humans
btk list --pretty
This makes the tools composable with the Unix ecosystem. JSONL is streamable, appendable, grepable, and robust (one malformed record doesn’t break the file).
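The robustness claim is easy to demonstrate. In this sketch (the function names are illustrative, not taken from the tools), a reader skips a malformed line and carries on with the rest of the stream:

```python
import io
import json


def write_jsonl(records, stream):
    """Write one JSON object per line."""
    for record in records:
        stream.write(json.dumps(record) + "\n")


def read_jsonl(stream):
    """Yield parsed records, skipping blank and malformed lines."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # One bad line is skipped; the rest of the stream stays usable.
            continue


buffer = io.StringIO()
write_jsonl([{"id": 1, "stars": True}, {"id": 2, "stars": False}], buffer)
buffer.write("{not valid json\n")  # simulate a corrupted record
write_jsonl([{"id": 3, "stars": True}], buffer)  # appending needs no rewrite

buffer.seek(0)
starred_ids = [record["id"] for record in read_jsonl(buffer) if record["stars"]]
```

With a single JSON array, the corrupted record in the middle would have made the whole document unparseable.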
Implementation
The architecture is straightforward. Each tool shares the same stack:
- Database layer (SQLAlchemy + SQLite): normalized schema, FTS5 search, efficient indexing
- VFS layer (Python cmd.Cmd): path parsing, context detection, command routing
- Command layer: context-aware implementations of do_ls(), do_cd(), do_cat(), etc.
- CLI layer (Typer/argparse): stateless commands for scripting, JSONL output
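The path-parsing part of the VFS layer can lean on the standard library. Here is a minimal sketch of how a `cd` target might be resolved, assuming virtual paths follow POSIX rules; `resolve_path` and `split_path` are hypothetical helpers, not functions from the tools.

```python
import posixpath


def resolve_path(current_path, target):
    """Resolve a cd target against the current virtual path."""
    if not target:
        return "/"
    # posixpath.join discards current_path when target is absolute.
    joined = posixpath.join(current_path, target)
    # normpath collapses '.', '..', trailing and repeated slashes.
    normalized = posixpath.normpath(joined)
    # normpath keeps a leading '//' (POSIX permits it), so collapse it by hand.
    if normalized.startswith("//"):
        normalized = "/" + normalized.lstrip("/")
    return normalized


def split_path(path):
    """Split '/tags/programming/python' into ['tags', 'programming', 'python']."""
    return [part for part in path.split("/") if part]


relative = resolve_path("/tags/programming", "python/async")
parent = resolve_path("/tags/programming/python", "..")
absolute = resolve_path("/tags", "/recent/today/added")
above_root = resolve_path("/", "../..")
```

Using posixpath rather than os.path keeps the behavior identical on Windows, where os.path would switch to backslash rules.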
The core of it is context detection:
def _get_context(self):
    """Determine what 'directory' we're in."""
    if self.current_path == "/":
        return {'type': 'root'}

    parts = self.current_path.strip('/').split('/')

    if parts[0] == 'tags':
        # /tags/programming/python
        tag_path = '/'.join(parts[1:])
        bookmarks = self.db.filter_by_tag_prefix(tag_path)
        return {'type': 'tag', 'tag': tag_path, 'bookmarks': bookmarks}

    elif parts[0] == 'recent' and len(parts) > 1:
        # /recent/today/added
        period = parts[1]  # 'today'
        activity = parts[2] if len(parts) > 2 else 'visited'
        bookmarks = filter_by_time_and_activity(period, activity)
        return {'type': 'recent_activity', 'period': period,
                'activity': activity, 'bookmarks': bookmarks}

    elif parts[0] == 'unread':
        # /unread - smart collection
        bookmarks = self.db.filter(visit_count=0)
        return {'type': 'smart_collection', 'name': 'unread',
                'bookmarks': bookmarks}

    elif parts[0] == 'bookmarks' and len(parts) == 2 and parts[1].isdigit():
        # /bookmarks/4095
        bookmark_id = int(parts[1])
        bookmark = self.db.get(bookmark_id)
        return {'type': 'bookmark', 'bookmark_id': bookmark_id,
                'bookmark': bookmark}

    # Anything else (a bare /recent, a mistyped path): return an explicit
    # marker instead of None so callers can index context['type'] safely.
    return {'type': 'unknown', 'path': self.current_path}
Once you know the context, commands adapt:
def do_ls(self, args):
    """List items in current directory."""
    context = self._get_context()

    if context['type'] == 'root':
        self._ls_root()
    elif context['type'] == 'tag':
        self._ls_tag(context['tag'], context['bookmarks'])
    elif context['type'] == 'recent_activity':
        self._ls_bookmarks(context['bookmarks'])
    elif context['type'] == 'smart_collection':
        self._ls_collection(context['name'], context['bookmarks'])
    elif context['type'] == 'bookmark':
        self._ls_bookmark(context['bookmark'])
Context detection + polymorphic commands. That’s the whole trick.
Why it matters
Traditional CLIs force you to remember exact syntax, construct precise queries, and process JSON blobs. VFS interfaces let you explore incrementally, discover what exists, and operate on context.
It matches how humans think: spatial navigation over query construction. We already know filesystems. We already know cd, ls, grep, find. The VFS pattern lets you reuse that knowledge for any hierarchical data.
The tools
All six are open source and on PyPI:
- btk (Bookmark Toolkit): github.com/queelius/btk
- ebk (eBook Manager): github.com/queelius/ebk
- ctk (Conversation Toolkit): github.com/queelius/ctk
- ghops (Git Repository Manager): github.com/queelius/ghops
- infinigram (N-gram Models): github.com/queelius/infinigram
- AlgoTree (Tree Structures): github.com/queelius/AlgoTree
If you build CLI tools for hierarchical data, consider the VFS pattern. Your users already know cd and ls. Why make them learn 47 flags?
Technical notes
Recording shell sessions
The interactive demo uses asciinema:
asciinema rec btk-demo.cast
btk shell
# ... do your demo ...
exit
The .cast file is plain text (JSON), so it stays small: this demo is 78KB for several minutes of terminal time, where a GIF or video would run to megabytes. You also get interactive playback via asciinema-player and copy-pasteable text. For terminal demos, this beats a screen recording.
Test coverage
All six tools have test suites. Numbers for four of them:
- btk: 515 tests (53% shell coverage, 23% CLI coverage)
- ghops: 138 tests, 86% coverage
- infinigram: 36 tests with benchmarks
- AlgoTree: 197 tests, 86% coverage
The VFS pattern is highly testable. Each component (path parsing, context detection, command handlers) is isolated and pure. Mock the database, test the rest.
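As a sketch of what "mock the database, test the rest" can look like, the tests below exercise a cut-down version of context detection with unittest.mock. `ContextResolver` is a hypothetical stand-in written for this example; only the `filter_by_tag_prefix` method name comes from the code shown earlier.

```python
from unittest.mock import Mock


class ContextResolver:
    """Minimal stand-in for the shell's context detection (hypothetical)."""

    def __init__(self, db):
        self.db = db
        self.current_path = "/"

    def get_context(self):
        if self.current_path == "/":
            return {"type": "root"}
        parts = self.current_path.strip("/").split("/")
        if parts[0] == "tags":
            tag_path = "/".join(parts[1:])
            return {"type": "tag", "tag": tag_path,
                    "bookmarks": self.db.filter_by_tag_prefix(tag_path)}
        return {"type": "unknown"}


def test_tag_context_queries_by_prefix():
    db = Mock()
    db.filter_by_tag_prefix.return_value = [4095, 5234]
    resolver = ContextResolver(db)
    resolver.current_path = "/tags/programming/python"

    context = resolver.get_context()

    assert context["type"] == "tag"
    assert context["bookmarks"] == [4095, 5234]
    db.filter_by_tag_prefix.assert_called_once_with("programming/python")


def test_root_context_never_touches_db():
    db = Mock()
    resolver = ContextResolver(db)

    assert resolver.get_context() == {"type": "root"}
    db.filter_by_tag_prefix.assert_not_called()
```

Both tests run under pytest or as plain function calls, and neither needs a SQLite file on disk.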