
Training language models to follow instructions with human feedback (InstructGPT)

Ouyang, Wu, Jiang, Almeida, Wainwright, Mishkin, Zhang, et al.
paper completed ai-ml

Notes

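RLHF (reinforcement learning from human feedback) applied to GPT-3 in three steps: supervised fine-tuning on human demonstrations, a reward model trained on human rankings of outputs, and PPO against that reward model. The bridge from a raw LM to a useful assistant.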
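
One ingredient of RLHF as used in the paper is a reward model trained on human comparisons between outputs. A minimal sketch of the pairwise preference loss, -log(sigmoid(r_preferred - r_rejected)), is below. It assumes the rewards are already plain scalar floats; the function names and example numbers are hypothetical, and this is an illustration, not the paper's implementation.

```python
import math


def pairwise_preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_preferred - reward_rejected)).

    The loss is small when the preferred output scores higher than the
    rejected one, and grows roughly linearly when the order is reversed.
    """
    diff = reward_preferred - reward_rejected
    # -log(sigmoid(d)) equals log(1 + exp(-d)); branch on the sign of d
    # so that exp() never receives a large positive argument.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def mean_preference_loss(pairs: list[tuple[float, float]]) -> float:
    """Average the pairwise loss over (preferred, rejected) reward pairs."""
    return sum(pairwise_preference_loss(w, l) for w, l in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical reward scores, for illustration only.
    example_pairs = [(2.0, 0.5), (0.1, 0.3), (1.0, 1.0)]
    print(mean_preference_loss(example_pairs))
```

Minimizing this loss pushes the reward model to score the human-preferred output above the rejected one; the policy is then optimized against the learned reward.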