April 24, 2026
Reinforcement Learning: An Introduction Notes
The RL bible. Bandits to policy gradients to planning.