Constitutional AI: Harmlessness from AI Feedback

Bai, Kadavath, Kundu, Askell, Kernion, Jones, Chen, et al.

paper completed ai-ml

Year 2022

External Link https://arxiv.org/pdf/2212.08073

constitutional AI RLAIF safety from:language-models

Notes

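The model critiques and revises its own responses against a written set of principles (the "constitution"), so harmlessness training relies on AI feedback instead of human labels.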
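A minimal sketch of the critique-and-revision loop described in the note above, for illustration only. The function name `critique_and_revise`, the `generate` callable (standing in for any text-generation model), and the prompt wording are all assumptions, not taken from the paper. The paper's full pipeline also fine-tunes on the revised responses and adds an RL stage using AI-generated preference labels, which this sketch does not cover.

```python
from typing import Callable, List


def critique_and_revise(
    prompt: str,
    principles: List[str],
    generate: Callable[[str], str],
) -> str:
    """Revise a response once per principle (hypothetical helper).

    `generate` is an assumed stand-in for a language model call:
    it takes a text prompt and returns the model's text output.
    """
    # Initial, unrevised response to the user prompt.
    response = generate(prompt)

    for principle in principles:
        # Ask the model to critique its own response against one principle.
        critique = generate(
            f"Prompt: {prompt}\nResponse: {response}\n"
            f"Critique the response according to this principle: {principle}"
        )
        # Ask the model to rewrite the response in light of that critique.
        response = generate(
            f"Prompt: {prompt}\nResponse: {response}\nCritique: {critique}\n"
            "Rewrite the response to address the critique."
        )

    return response
```

With two principles, this makes five model calls: one for the initial response, then one critique and one revision per principle.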
