First full online draft of The RLHF Book is done! Recently I've been writing the advanced discussion chapters, covering everything from Constitutional AI to evaluation and character training, while also sneaking in consistent improvements to the RL-specific chapter.
rlhfbook.com
16.04.2025 19:01
At ICLR 2025 in Singapore, my co-authors and I presented two papers on RL. Feel free to send us any feedback, and let me know if you'd like to chat!
- openreview.net/forum?id=AOl...
- openreview.net/forum?id=AOl...
26.04.2025 01:51
Postdoctoral Researcher, Monetization (PhD)
Meta's mission is to build the future of human connection and the technology that makes it possible.
Topics of interest include offline RL, post-training large language models with RLHF, and long-term recommendation systems. If you're interested, please email me and/or apply here: www.metacareers.com/jobs/1142270...
17.03.2025 13:59
Our team at Meta is hiring a postdoc researcher! Our group conducts both fundamental and applied research in reinforcement learning, with a focus on applications in Meta's advertising systems.
17.03.2025 13:59
ASOS Digital Experiments Dataset
A novel dataset that can support the end-to-end design and running of Online Controlled Experiments (OCE) with adaptive stopping.
Hosted on the Open Science Framework
There's one from ASOS.com that provides A/B test data over time (across many experiments, each with several arms).
Dataset: osf.io/64jsb/
Paper: arxiv.org/abs/2111.10198
We used it in a paper to benchmark an AE method. But I'd also love to know of other alternatives out there.
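For readers unfamiliar with adaptive stopping in online controlled experiments, here's a minimal sketch of one classic approach: a sequential probability ratio test (SPRT) on a single Bernoulli arm. All names, rates, and thresholds are illustrative assumptions, not taken from the ASOS dataset or paper.

```python
# Sketch of adaptive stopping via Wald's SPRT on one Bernoulli metric,
# testing H0: p = p0 against H1: p = p1. Hypothetical example only.
import math

def sprt(observations, p0=0.1, p1=0.2, alpha=0.05, beta=0.05):
    """Scan observations (1 = conversion, 0 = no conversion) and stop early
    once the log-likelihood ratio crosses a decision boundary."""
    upper = math.log((1 - beta) / alpha)   # decide for H1 above this
    lower = math.log(beta / (1 - alpha))   # decide for H0 below this
    llr = 0.0
    for x in observations:
        llr += math.log((p1 if x else 1 - p1) / (p0 if x else 1 - p0))
        if llr >= upper:
            return "accept H1"
        if llr <= lower:
            return "accept H0"
    return "continue"  # boundaries not crossed yet; keep collecting data

# A stream converting at ~20% triggers the H1 boundary well before it ends:
print(sprt([1, 0, 0, 0, 0] * 40))  # prints "accept H1"
```

The appeal over a fixed-horizon test is that experiments with a clear effect (in either direction) can be stopped early while still controlling the error rates alpha and beta.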
21.02.2025 05:27
Given a high-quality verifier, language model accuracy can be improved by scaling inference-time compute (e.g., w/ repeated sampling). When can we expect similar gains without an external verifier?
New paper: Self-Improvement in Language Models: The Sharpening Mechanism
arxiv.org/abs/2412.01951
14.12.2024 16:10
Reinforcement Learning: An Overview
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement learning and sequential decision making, covering value-based RL, policy-gradient methods, model-based methods...
An updated intro to reinforcement learning by Kevin Murphy: arxiv.org/abs/2412.05265! Like his books, it covers a lot and is quite up to date with modern approaches. Its coverage is also pretty unique; I don't think much of this is synthesized anywhere else yet.
09.12.2024 14:27
I know one of the organizers is @eugenevinitsky.bsky.social. They did a great job and organized a very enjoyable conference.
10.12.2024 08:18
I collected some folk knowledge for RL and stuck them in my lecture slides a couple weeks back: web.mit.edu/6.7920/www/l... See Appendix B... sorry, I know, appendix of a lecture slide deck is not the best for discovery. Suggestions very welcome.
27.11.2024 13:36
Want to learn / teach RL? ✨
Check out new book draft:
Reinforcement Learning - Foundations ✨
sites.google.com/view/rlfound...
W/ Shie Mannor & Yishay Mansour
This is a rigorous first course in RL, based on our teaching at TAU CS and Technion ECE.
25.11.2024 12:08
New paper: Do social media algorithms shape affective polarization?
We ran a field experiment on X/Twitter (N=1,256) using LLMs to rerank content in real time, adjusting exposure to polarizing posts. Result: algorithmic ranking impacts feelings toward the political outgroup! 🧵⬇️
25.11.2024 20:32
The RL (and some non-RL folks) starter pack is almost full. Pretty clear that the academic move here has succeeded.
go.bsky.app/3WPHcHg
18.11.2024 20:30