April 24, 2026
Training language models to follow instructions with human feedback (InstructGPT)
Notes
RLHF applied to GPT-3. The bridge from raw LM to useful assistant.