Q-Learning

The Infinite Table

Part 1 of "What Your RL Algorithm Actually Assumes": tabular Q-learning makes zero assumptions about state similarity and pays for it in sample complexity.

technical

The Policy: Q-Learning vs Policy Learning

SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying. You can read its value function, but what you read is chilling.

AI Fiction