April 24, 2026
Reinforcement Learning: An Introduction
Notes
The RL bible. Bandits to policy gradients to planning.