Our latest blog entry discusses the impact of #PreferenceLearning in #LLMs, wraps up #MPREF2025, and gives an outlook on this year’s #PreferenceHandling events #DA2PL2026 and #ADT2026. ADT will be held in Paris on Nov. 16-18. Abstracts are due by May 4, papers by May 11.
mpref.org/the-impact-o...
💻 Project page:
ukplab.github.io/arxiv2025-ex...
🔁 Both papers build on HAI-Co2, described in this position paper:
direct.mit.edu/coli/article...
#NLP #NLProc #LLMs #Evaluation #PreferenceLearning #ScientificWriting #HumanAI #HaiCo2 @tuda.bsky.social @cs-tudarmstadt.bsky.social
7/ Materials:
Course website / syllabus / notes: web.stanford.edu/class/cs329h/
Living Textbook: mlhp.stanford.edu
#AIAlignment #MachineLearning #ResponsibleAI #Stanford #PreferenceLearning
Algorithm Finds Best Policies with Trajectory Preference Feedback
PSPL merges offline preference data with online exploration, using Thompson sampling to obtain Bayesian simple-regret bounds for RL. Read more: getnews.me/algorithm-finds-best-pol... #preferencelearning #reinforcementlearning
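To make the idea concrete, here is a toy sketch of Thompson sampling driven purely by pairwise preference feedback. This is not the PSPL algorithm from the paper; it is a minimal illustration under simplifying assumptions: each "arm" stands in for a candidate policy, a simulated rater compares two trajectories via a Bradley-Terry model over hidden utilities, and each arm keeps a Beta posterior over its duel win-rate (an offline preference dataset could seed those counts before the online loop starts).

```python
import math
import random

# Toy preference-based Thompson sampling (illustrative only, NOT PSPL).
# Hidden per-arm utilities; the rater prefers trajectories from
# higher-utility arms with Bradley-Terry probability.
TRUE_UTILITY = [0.2, 0.5, 0.9]  # hypothetical; arm 2 is best

def prefer(i, j, rng):
    """Simulated rater: True if arm i's trajectory beats arm j's."""
    p = 1.0 / (1.0 + math.exp(-(TRUE_UTILITY[i] - TRUE_UTILITY[j])))
    return rng.random() < p

def run(rounds=2000, seed=0):
    rng = random.Random(seed)
    k = len(TRUE_UTILITY)
    # Beta(1,1) priors over each arm's duel win-rate; offline
    # preference data could initialize these counts instead.
    wins, losses = [1] * k, [1] * k
    for _ in range(rounds):
        # Thompson sampling: draw a plausible win-rate per arm,
        # then duel the two arms whose samples rank highest.
        samples = [rng.betavariate(wins[a], losses[a]) for a in range(k)]
        i, j = sorted(range(k), key=lambda a: samples[a], reverse=True)[:2]
        if prefer(i, j, rng):
            wins[i] += 1; losses[j] += 1
        else:
            wins[j] += 1; losses[i] += 1
    # Recommend the arm with the highest posterior mean win-rate.
    return max(range(k), key=lambda a: wins[a] / (wins[a] + losses[a]))

print(run())
```

Over enough duels the posterior concentrates on the highest-utility arm, which is the same intuition behind bounding simple regret from preference feedback.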