
metafunctor Research · Coding


Reward Model



April 24, 2026

Deep Reinforcement Learning from Human Preferences

Notes

A foundational RLHF paper on learning reward models from human comparisons.


metafunctor

Research engineer and computer scientist specializing in machine learning, statistical computing, and open source software development.


© 2026 Alex Towell. All rights reserved.
