Bernhard Jaeger (@bernhard-jaeger)

Regularized self-play RL in grounded simulation effectively adapts driving policies to completely new cities. 🗽 -> 🗼

Really enjoyed collaborating on this work, led by Zilin and Saeed! Check out Zilin's post below for a great summary

🧵: x.com/nirhso/statu...
📄: arxiv.org/abs/2602.15891

20.02.2026 20:09 👍 21 🔁 3 💬 0 📌 2

🚀 Excited to share REPPO, a new on-policy RL agent!

TL;DR: Replace PPO with REPPO for fewer hyperparameter headaches and more robust training.

REPPO, led by @cvoelcker.bsky.social, will be presented at ICLR 2026. How does it work? 🧵👇

13.02.2026 19:28 👍 25 🔁 10 💬 1 📌 0

Who’s Behind All Those Robotaxi Teleoperations? With teleoperators, humans are back in the loop. Somewhat alarming, however, is the cascading impact that teleoperators could trigger, including corporations pushing liability issues downstream.

It had been previously reported. Waymo got it taken down very quickly, but not before the Internet Archive got a copy of the text summary:

web.archive.org/web/20250703...

06.02.2026 19:42 👍 6 🔁 3 💬 0 📌 2

Congratulations Andreas!

04.02.2026 13:20 👍 1 🔁 0 💬 0 📌 0

haha, thought it was weird that the integers were defined as float and thought it was about cache line optimizations.

13.01.2026 23:08 👍 0 🔁 0 💬 0 📌 0

Tübingen AI Research Building, where the Cluster of Excellence "Machine Learning" is based.

📢 We’re hiring: W3-Professorship in Machine Learning in Physics @unituebingen.bsky.social! What we’re looking for: Established research profile in a core area of #physics (condensed matter, quantum or theoretical particle physics), strong track record in research questions related to #ML and/or #AI.

15.12.2025 09:23 👍 6 🔁 10 💬 1 📌 1

is faster?

13.01.2026 07:01 👍 0 🔁 0 💬 1 📌 0

PufferDrive 2.0 release YouTube video by Daphne Cornelisse

What if you could train agents on a 𝗱𝗲𝗰𝗮𝗱𝗲 of driving experience in 𝘂𝗻𝗱𝗲𝗿 𝗮𝗻 𝗵𝗼𝘂𝗿, on a single GPU?

Excited to share 𝙋𝙪𝙛𝙛𝙚𝙧𝘿𝙧𝙞𝙫𝙚 2.0: A fast, friendly driving simulator with RL training via PufferLib at 𝟯𝟬𝟬𝗞 𝘀𝘁𝗲𝗽𝘀/𝘀𝗲𝗰 🐡 + 🚗

youtu.be/LfQ324R-cbE?...

30.12.2025 16:12 👍 53 🔁 10 💬 3 📌 1

Our new E2E driving method, TransFuser v6, is out on ArXiv.
It outperforms all other methods on CARLA by a wide margin, 95 DS on Bench2Drive!
We show that minimizing the asymmetry between data annotator and policy is key for strong IL results.

Code, models, and paper:
ln2697.github.io/lead/

27.12.2025 01:42 👍 30 🔁 6 💬 0 📌 1

Suggestions for Individual Donors from Coefficient Giving Staff – 2025 | Coefficient Giving The 2025 edition of our annual tradition: a list of giving opportunities suggested by Coefficient Giving program staff.

Our staff's 2025 recommendations for individual donors, fresh off the press: coefficientgiving.org/research/su...

19.12.2025 17:01 👍 2 🔁 2 💬 0 📌 1

The Future of Focused Research Organizations: Working with Convergent on the NSF Tech Labs Initiative

This article is from people who have thought about FROs for years and have experience with what works and what doesn't.

I have always appreciated the restraint in defining the niche of FROs in the broader ecosystem; it comes out clearly in this piece.

www.essentialtechnology.blog/p/the-future...

17.12.2025 13:48 👍 3 🔁 3 💬 0 📌 0

Unfortunately it appears much of the academic community has reconstituted itself on LinkedIn

15.12.2025 01:45 👍 58 🔁 7 💬 11 📌 3

I am so happy and excited that this project got funded!

11.12.2025 18:52 👍 29 🔁 3 💬 5 📌 0

AI-powered assistants for scientific discovery: Andreas Geiger receives ERC Consolidator Grant

More details at tuebingen.ai/news/ai-powe...

11.12.2025 15:56 👍 8 🔁 2 💬 0 📌 0

true, you could try to collect a dataset with good coverage by running online RL first and then do offline RL in future iterations to save sim compute

09.12.2025 20:00 👍 1 🔁 0 💬 0 📌 0

This is not bringing back offline RL (but online RL). The purpose of closed-loop training here is to gather data in OOD states with the model.
Stitching doesn't work if your base dataset doesn't cover the state space well, which is the case in autonomous driving.

09.12.2025 19:41 👍 0 🔁 0 💬 1 📌 0

Beyond Behavior Cloning in Autonomous Driving: a Survey of Closed-Loop Training Techniques | Research Behavior cloning, the dominant approach for training autonomous vehicle (AV) policies, suffers from a fundamental gap: policies trained open-loop on temporally independent samples must operate in clos...

Speaking of RL, Nvidia also just published a survey on the importance of closed-loop training (RL, etc.) in E2E driving.

research.nvidia.com/publication/...

09.12.2025 19:29 👍 9 🔁 2 💬 1 📌 0

Demonstrably Safe AI For Autonomous Driving Autonomous driving is the ultimate challenge for AI in the physical world. At Waymo, we’re solving it by prioritizing demonstrably safe AI, where safety is central to how we engineer our models and AI...

waymo.com/blog/2025/12...

Waymo is training End-to-End driving models with RL in simulation.

09.12.2025 17:17 👍 24 🔁 6 💬 2 📌 2

😂

08.12.2025 15:05 👍 1 🔁 0 💬 0 📌 0

Tired Europe: Let's do tons of AI regulations
Wired Europe: Let's do tons of AI open source
#aiPULSE2025

05.12.2025 20:00 👍 10 🔁 1 💬 0 📌 0

This essay, roughly on dual use, has been haunting me for a while now:
dl.acm.org/doi/pdf/10.1...

03.12.2025 08:06 👍 28 🔁 3 💬 3 📌 0

Excited to be at #Neurips2025 this week to present our paper "Monoculture or Multiplicity: Which is it?", joint work with Moritz Hardt.

📄 Paper #1000: openreview.net/pdf?id=DO5Lt...
📍 Wed, Dec 3, 2025 • 4:30 PM – 7:30 PM

Feel free to come by and reach out!

A short 🧵.

02.12.2025 15:55 👍 16 🔁 4 💬 1 📌 0

Attending #Neurips2025? Get your personalized Scholar Inbox conference program now to easily navigate the poster sessions and find what you are looking for:
www.scholar-inbox.com/conference/n...

02.12.2025 06:37 👍 34 🔁 12 💬 0 📌 0

Scholar Inbox for NeurIPS is live now.

01.12.2025 19:44 👍 14 🔁 5 💬 0 📌 2

Preprint site arXiv is banning computer-science reviews: here’s why The repository is taking steps to tackle a surge in low quality, AI-generated content.

www.nature.com/articles/d41...

ArXiv banned surveys due to AI slop spam.
Now we need to wait for them to be peer-reviewed.
Bad development, we need to find better solutions to AI slop than banning unreviewed papers.
Getting a survey reviewed at a good journal can take over a year. :(

01.12.2025 14:36 👍 0 🔁 0 💬 0 📌 0

Quick reminder about the EPFL PhD program deadline (EDIC) on Dec 15.

27.11.2025 10:14 👍 4 🔁 2 💬 0 📌 0

no, this work focuses on IL.
I would personally be interested in whether RL models have similar failures, but it is much harder to do this type of analysis when the model predicts actions, not waypoints. (Can't do it offline anymore)

26.11.2025 11:18 👍 0 🔁 0 💬 0 📌 0

🚀 Introducing TMLR Beyond PDF!

🎬 This is a new, HTML-based submission format for TMLR, that supports interactive figures and videos, along with the usual LaTeX and images.

🎉 Thanks to TMLR Editors in Chief: Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!

25.11.2025 16:11 👍 75 🔁 22 💬 1 📌 3

Congratulations to @cworthy.org on their announcement today!

Learn more about this wonderful FRO here: www.youtube.com/watch?v=DA-e...

24.11.2025 19:23 👍 4 🔁 1 💬 0 📌 0

Apply - Interfolio

Come be our colleague in the robotics and embodied intelligence center at NYU!
🔷 Professor in Robotics / Embodied AI (Open Rank)
apply.interfolio.com/176977
🔷 Faculty Fellow in Robotics / Embodied AI
apply.interfolio.com/177077

18.11.2025 20:01 👍 20 🔁 6 💬 1 📌 0
