Reinforcement Learning

Browse posts by tag

What You Assume vs. What You Compute

Part 4 of What Your RL Algorithm Actually Assumes — model-based vs. model-free, the assumptions table, AIXI as the incomputable ideal, and the unifying claim: representation is prior is assumption.

technical

The Architecture Is the Prior

Part 3 of What Your RL Algorithm Actually Assumes — the architecture decides what kind of features can be learned, and that decision is a Bayesian prior over value functions.

technical

The Infinite Table

Part 1 of What Your RL Algorithm Actually Assumes — tabular Q-learning makes zero assumptions about state similarity and pays for it in sample complexity.

technical

Reinforcement Learning: An Introduction

Notes

Mathematical RL fundamentals (MDPs, value functions, dynamic programming, approximate methods). RL foundational text that bridges theory and practice.

The Policy: Q-Learning vs Policy Learning

SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying. You can read its value function, but what you read is chilling.

AI Fiction

The Policy

The Policy

A speculative fiction novel exploring AI alignment, existential risk, and the fundamental tension between optimization and ethics. When a research team develops SIGMA, an advanced AI system designed to optimize human welfare, they must confront an …