Papers #2-3: arxiv.org/abs/2402.10210 and arxiv.org/abs/2405.00675 from the incredible
@quanquangu.bsky.social. I really like how they explore new techniques for RLHF
Pretraining will only end once we find the optimal scaling law.
To better interpret the plot, draw a horizontal line representing a specific target validation loss. Find the points where this line intersects the curves for AdamW and MARS, which will allow you to determine how much speedup, in terms of training tokens, MARS achieves compared to AdamW.
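The horizontal-line reading can be automated. Below is a minimal sketch, assuming hypothetical (tokens, validation loss) curves for AdamW and MARS; the real curves would come from the training runs. It interpolates each curve at a target loss and reports the token-count ratio as the speedup.

```python
import numpy as np

# Hypothetical validation-loss curves (illustrative numbers only, not from the paper).
adamw_tokens = np.array([1e9, 2e9, 5e9, 10e9, 20e9, 50e9])
adamw_loss   = np.array([3.50, 3.30, 3.05, 2.90, 2.78, 2.65])
mars_tokens  = np.array([1e9, 2e9, 5e9, 10e9, 20e9, 50e9])
mars_loss    = np.array([3.40, 3.18, 2.92, 2.78, 2.66, 2.55])

def tokens_to_reach(target_loss, tokens, loss):
    """Token count where the loss curve crosses target_loss.
    Loss decreases with tokens, so reverse both arrays because
    np.interp requires ascending x-coordinates."""
    return np.interp(target_loss, loss[::-1], tokens[::-1])

target = 2.90  # the horizontal line on the plot
t_adamw = tokens_to_reach(target, adamw_tokens, adamw_loss)
t_mars = tokens_to_reach(target, mars_tokens, mars_loss)
print(f"Speedup at loss {target}: {t_adamw / t_mars:.2f}x fewer tokens")
```

With these made-up curves, MARS reaches the target loss with 1.75x fewer tokens; the actual speedup depends on the measured curves.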
Just added you.
With MARS delivered, the focus now shifts to new architectures.
Just added you! Welcome!
Just added you.
Just added you.
Just added you!
Just added you!
Just added you.
This Thanksgiving, I want to express my heartfelt gratitude to all the students, colleagues, and collaborators who have contributed to the success of SPIN, SPPO, DPLM, GPM, MARS, and many other projects. Your hard work and dedication continue to be truly inspiring.
Just added you!
Just added you!
Just added you.
Anyone using their real name and interested is welcome!
Just added you. Welcome!
MARS is a unified framework that can be integrated with various precondition techniques. So it can be applied to PSGD. I believe @hessianfree.bsky.social has implemented MARS-PSGD.
Just added you!
Just added you.
Done!
Just added you.
Just added you!
Just added you!
Please reply to this message or DM me if you'd like to be added!
Just added you!
Have added both of you. Feel free to recommend other people.
Tulu 3 SFT mix is trending on HuggingFace :D. Next step: make preference and RL datasets more accessible.
OLMo 2 is out 🥳 7B and 13B trained on 5T tokens, and meticulously instruction tuned using the Tulu 3 recipe.
Simply the best fully open models yet.
Really proud of the work & the amazing team at
@ai2.bsky.social
Just added you there.