If you are in Europe, applications for the 2026 Reinforcement Learning Summer School are now open!
Location: Milan
Date: June 3–12, 2026
Website: rlsummerschool.com
Folks, please don't submit LLM-generated PRs to open source projects. It makes no sense.
If the maintainers want to use an LLM to fix an issue, they can use Claude or whatnot directly. They don't need you as intermediary, that's just silly.
If they don't want to use LLMs, they have reasons.
Thanks to the MyST parser and rst-to-myst, you can easily convert your documentation to Markdown while still keeping all the features of Sphinx =)
github.com/DLR-RM/stabl...
bsky.app/profile/kahn...
The talk I gave about "Recent Advances in RL for Continuous Control" at CERN last year is now online =)
www.youtube.com/watch?v=Sb0d...
github.com/Stable-Basel...
Contributions are welcome =) (the issue is from 2020...)
Mainly lack of time, plus the need for a clean and readable implementation (and a benchmark/comparison to other algos)
reppo is in the updated version (in the references), PQL might be what you are looking for? (should be in the references too)
For MPO, I need to re-read the paper and try to implement and benchmark it.
Nice blog post about a humanoid robotics startup failure: ruixu.us/posts/six-th...
I also updated the slides recently for the RL Mannheim Workshop to include new SOTA algorithms from early 2026
araffin.github.io/slides/advan...
To understand how our radio buttons work I need to understand two separate component libraries and hundreds of lines of React.
If you missed this post last week, it explains pretty well how modern frontend works these days. :/
https://paulmakeswebsite...
Q-value overestimation animation for my upcoming talk about "Recent Advances in RL for Continuous Control" at the Mannheim RL Workshop
This is something I talk about in my paper, where I suggest being explicit about γ_train (some methods use multiple gammas during training) and γ_eval.
One of my students is empirically investigating this and, as one would expect, it can have a huge impact.
arxiv.org/abs/2510.16175
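A minimal sketch (not from the paper) of why the γ_train vs γ_eval distinction matters: the same reward sequence gets a very different value depending on the discount factor used.

```python
# Illustrative only: how the choice of discount factor changes the value
# assigned to the exact same reward sequence.
# gamma_train: discount used by the learning algorithm.
# gamma_eval: discount used when reporting performance (often 1.0).

def discounted_return(rewards, gamma):
    """Sum of gamma^t * r_t over an episode."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

rewards = [1.0] * 100  # a 100-step episode with reward 1 at every step

print(discounted_return(rewards, 0.99))  # ~63.4
print(discounted_return(rewards, 1.0))   # 100.0 (undiscounted evaluation)
```

With γ_train = 0.99 the agent effectively ignores rewards beyond a few hundred steps, so an algorithm can look optimal under its own training objective while being far from optimal under the undiscounted return used at evaluation time.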
Servo 0.0.4 showing new support for multiple windows
December in Servo…
🎤🧑‍🏫 FOSDEM talks next week!
🤹🪟 multiple windows
🪆🌐 HTTP proxy support
🔐🕵️ more SubtleCrypto algorithms
💽🗃️ new site data & network API
servo.org/blog/2026/01...
The export and preview menu, with the "PDF" section unfolded.
HTML preview & export now available in the web app! With HTML export, you can create a website from the same Typst file as your PDFs. This makes it easy to create documents that feel just as at home on the web as they do in print.
This network analyzer is very efficient and allows you to find interesting accounts, e.g., people followed by lots of the people you follow (but not by you).
bsky-follow-finder.theo.io
(Reposting this for folks who have joined Bsky more recently)
People wanted our Open Source Organizations starter pack to include many projects, so we decided to give them their own starter pack.
go.bsky.app/HvKFRKa
"uv is fast because of what it doesn’t do, not because of what language it’s written in"
Using AI coding for data analysis without personal programming skill fills me with dread.
Small errors in the code poison results in ways that may not be visually obvious.
LLMs are great when people verify outputs; the path to hell is when they don't.
Hi RL Enthusiasts!
RLC is coming to Montreal, Quebec, in the summer: Aug 16–19, 2026!
Call for Papers is up now:
Abstract: Mar 1 (AOE)
Submission: Mar 5 (AOE)
Excited to see what you’ve been up to - Submit your best work!
rl-conference.cc/callforpaper...
Please share widely!
Almost 5 years in the making... "Hyperparameter Optimization in Machine Learning" is finally out! 📘
We designed this monograph to be self-contained, covering: Grid, Random & Quasi-random search, Bayesian & Multi-fidelity optimization, Gradient-based methods, Meta-learning.
arxiv.org/abs/2410.22854
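As a taste of the simplest method the monograph covers, here is a hedged random-search sketch (illustrative only, not code from the monograph; the objective function and search ranges are made up):

```python
# Random search: sample hyperparameters independently at random and keep
# the best configuration found. A strong, embarrassingly parallel baseline.
import random

def objective(lr, width):
    # Hypothetical validation loss; in practice this would train a model.
    return (lr - 0.01) ** 2 + (width - 64) ** 2 / 1e4

random.seed(0)
best = None
for _ in range(50):
    cfg = {
        "lr": 10 ** random.uniform(-4, -1),  # log-uniform learning rate
        "width": random.randint(8, 256),     # integer layer width
    }
    loss = objective(cfg["lr"], cfg["width"])
    if best is None or loss < best[0]:
        best = (loss, cfg)

print(best)
```

Note the log-uniform sampling for the learning rate: sampling scale-type hyperparameters in log space is one of the practical points such search-based methods rely on.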
A practical introduction to (deep) RL, providing intuitions to understand the more recent algorithms (continued).
In this second post, I continue from DQN on to the Soft Actor-Critic (SAC) algorithm and its extensions.
araffin.github.io/post/rl103/
What a phenomenal talk by @jenson.org. He works in a very different slice of tech than I do, but his ethos toward developing tech deeply matches my own, and he articulates it so well.
I highly recommend watching it, regardless of whether you're interested in UX.
antonin has been cooking olala
bsky.app/profile/araf...
Make sure to read part I =) (aka RL102: from tabular RL to DQN)
araffin.github.io/post/rl102/
🚀 We just shipped v0.216.0!
Word-level diffing just landed. 🎉
It's been a night-and-day difference for us: seeing exactly what changed within each line.
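For the curious, the core idea of word-level diffing can be sketched in a few lines with Python's standard-library difflib (this is an illustration, not the tool's actual implementation):

```python
# Word-level diff of one changed line: tokenize into words, then align the
# two token sequences and report only the replaced spans.
import difflib

old = "the quick brown fox jumps".split()
new = "the quick red fox leaps".split()

sm = difflib.SequenceMatcher(a=old, b=new)
for tag, i1, i2, j1, j2 in sm.get_opcodes():
    if tag == "replace":
        print(f"- {' '.join(old[i1:i2])}  ->  + {' '.join(new[j1:j2])}")
# prints the two changed words ("brown" -> "red", "jumps" -> "leaps")
# instead of flagging the whole line as different
```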
Are there any plans to release the code, and if so, in what timeframe? (Same question for XQC: code coming soon™?)