April 24, 2026

Deep Reinforcement Learning from Human Preferences: Notes

Foundational RLHF paper (Christiano et al., 2017). Learns a reward model from human pairwise comparisons of trajectory segments, rather than from a hand-specified reward function.
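The core fitting objective in the paper treats a human's choice between two trajectory segments as a Bradley-Terry comparison: the probability that segment 1 is preferred is the softmax of the two segments' summed predicted rewards, and the reward model is trained with cross-entropy against the human labels. A minimal sketch of that loss (function names here are my own, and the reward predictions are passed in as plain arrays rather than produced by a network):

```python
import numpy as np

def segment_return(rewards):
    # Sum of the model's predicted per-step rewards over one segment.
    return float(np.sum(rewards))

def preference_prob(r1, r2):
    # Bradley-Terry model: P(segment 1 preferred over segment 2)
    # = exp(R1) / (exp(R1) + exp(R2)), computed via a sigmoid of the
    # return difference for numerical stability.
    R1, R2 = segment_return(r1), segment_return(r2)
    return 1.0 / (1.0 + np.exp(R2 - R1))

def preference_loss(r1, r2, label):
    # Cross-entropy against the human label: label=1 means the human
    # preferred segment 1, label=0 means segment 2.
    p = preference_prob(r1, r2)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))
```

In training, `r1` and `r2` would be the reward network's outputs on the two clips, and this loss would be minimized over a dataset of human comparisons; equal predicted returns give a preference probability of 0.5.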