Deep Reinforcement Learning from Human Preferences

Christiano, Leike, Brown, Martic, Legg, Amodei

paper completed ai-ml

Year 2017

External Link https://arxiv.org/pdf/1706.03741

RLHF reward model human preferences from:language-models

Notes

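Foundational RLHF paper. Trains a reward model from human comparisons between pairs of short trajectory segments, then optimizes a deep RL policy against that learned reward instead of a hand-written one.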
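As a rough illustration of what learning a reward model from comparisons means, here is a minimal sketch of a Bradley-Terry style preference loss of the kind the paper uses: the probability that a human prefers segment A over segment B is modeled as a logistic function of the difference in summed predicted rewards, and the reward model is fit by cross-entropy against the human label. The function names, the plain-list inputs, and the epsilon guard are assumptions made for this example. They are not taken from the paper's code, and a real implementation would compute the rewards with a neural network and differentiate through this loss.

```python
import math


def _sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def preference_probability(rewards_a, rewards_b):
    """Modeled probability that segment A is preferred over segment B.

    rewards_a / rewards_b are the per-step predicted rewards for each
    segment (hypothetical inputs; in practice these come from the
    reward model). Equivalent to
    exp(sum_a) / (exp(sum_a) + exp(sum_b)).
    """
    return _sigmoid(sum(rewards_a) - sum(rewards_b))


def preference_loss(rewards_a, rewards_b, label_a):
    """Cross-entropy between the modeled preference and the human label.

    label_a is 1.0 if the human preferred A, 0.0 if they preferred B,
    and 0.5 if they judged the two segments equally good.
    """
    p_a = preference_probability(rewards_a, rewards_b)
    eps = 1e-12  # assumption: small guard against log(0)
    return -(label_a * math.log(p_a + eps)
             + (1.0 - label_a) * math.log(1.0 - p_a + eps))


if __name__ == "__main__":
    seg_a = [0.5, 1.0, 0.2]   # made-up predicted rewards
    seg_b = [0.1, 0.3, 0.0]
    print(preference_probability(seg_a, seg_b))
    print(preference_loss(seg_a, seg_b, label_a=1.0))
```

Minimizing this loss over many labeled pairs pushes the summed predicted reward of preferred segments above that of rejected ones, which is what lets the learned reward stand in for a hand-written reward function during policy training.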
