SLUUG Talk: Demystifying Large Language Models on Linux
Talk for the St. Louis Unix Users Group about running and understanding Large Language Models on Linux.
A tiny LLM in the browser, mixed at sample time with a token-level n-gram trained on every word I have published. Result is mediocre. Architecture is interesting. Notes on what worked, what didn't, and what would make it work.
I tried closing the loop on retrieval-augmented generation with TD learning. A trivial baseline matched the full method. Here's the lesson.
The most dramatic possibility in AI might arrive through the most mundane mechanism. Not a beam of sacred light. A sufficiently good build system.
A guided tour through my open-source ecosystem: encrypted search theory, statistical reliability, Unix-philosophy CLI tools, AI research, and speculative fiction. How the projects connect and where to start.
What if reasoning traces could learn their own usefulness? A simple RL framing for trace memory, and why one reward signal is enough.
The classical AI curriculum teaches rational agents as utility maximizers. The progression from search to RL to LLMs is really about one thing: finding representations that make decision-making tractable.
A response to the 'boring stack' discourse. Why CLI-first, standards-based development is even more boring (and more future-proof) than you think.
A tool that converts source code repositories into structured, context-window-optimized Markdown for LLMs, with intelligent summarization and importance scoring.
On research strategy, what complex networks reveal about how we think through AI conversations, and building infrastructure for the next generation of knowledge tools.
A plugin-based toolkit for managing AI conversations from multiple providers. Import, store, search, and export conversations in a unified tree format. Built for the Long Echo project.
A logic programming system that alternates between wake and sleep phases, using LLMs for knowledge generation during wake and compression-based learning during sleep.
A mathematical framework that treats language models as algebraic objects with compositional structure.
In 2023 I drafted a paper on routing between a large and small LLM via KL-divergence thresholds. Speculative decoding had already solved the problem more rigorously. Here is the post-mortem.
Applying Monte Carlo Tree Search to large language model reasoning, with a formal specification of the algorithm.
Using GMM clustering to improve retrieval in topically diverse knowledge bases.
What if LLMs could remember their own successful reasoning? A simple experiment in trace retrieval, and why 'latent' is the right word.
The evolution of neural sequence prediction, and how it connects to classical methods.
I built a home lab from spare parts and water-damaged hardware for local LLM experimentation. CPU-only inference is slow, but you learn things cloud APIs hide.
Encountering ChatGPT during cancer treatment and recognizing the Solomonoff connection. Language models as compression, prediction as intelligence. A personal inflection point reconnecting with AI research after years in survival mode.