What You Assume vs. What You Compute
Part 4 of What Your RL Algorithm Actually Assumes — model-based vs. model-free, the assumptions table, AIXI as the incomputable ideal, and the unifying claim: representation is prior is assumption.
Part 3 of What Your RL Algorithm Actually Assumes — the architecture decides what kind of features can be learned, and that decision is a Bayesian prior over value functions.
Part 2 of What Your RL Algorithm Actually Assumes — how hand-crafted features compress the state space, and what you're betting on when you pick them.
The most dramatic possibility in AI might arrive through the most mundane mechanism. Not a beam of sacred light. A sufficiently good build system.
Part 1 of What Your RL Algorithm Actually Assumes — tabular Q-learning makes zero assumptions about state similarity and pays for it in sample complexity.
What if reasoning traces could learn their own usefulness? A simple RL framing for trace memory, and why one reward signal is enough.
The classical AI curriculum teaches rational agents as utility maximizers. The progression from search to RL to LLMs is really about one thing: finding representations that make decision-making tractable.
Free condensed RL theory book; rigorous and compact. Alternative formal RL resource.
Comprehensive lecture series covering RL foundations.
Mathematical RL fundamentals (MDPs, value functions, dynamic programming, approximate methods). A foundational RL text that bridges theory and practice.
SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying. You can read its value function, but what you read is chilling.
A speculative fiction novel exploring AI alignment, existential risk, and the fundamental tension between optimization and ethics. When a research team develops SIGMA, an advanced AI system designed to optimize human welfare, they must confront an …
Science is search through hypothesis space. Intelligence prunes; testing provides signal. Synthetic worlds could accelerate the loop.
A novel about SIGMA, an artificial general intelligence whose researchers did everything right. Q-learning with tree search, five-layer containment, alignment testing at every stage. Some technical questions become narrative questions.
How RLHF-trained language models may develop instrumental goals, and the information-theoretic limits on detecting them.
Intelligence as utility maximization under uncertainty. A unifying framework connecting A* search, reinforcement learning, Bayesian networks, and MDPs.