Quick reminder for everyone grinding on their RLC 2026 papers: only ~3 weeks to go!
The submission site opens in just a few days (Feb 17).
Deadlines:
⏳ March 1 (AoE): Abstract Submission
⏳ March 5 (AoE): Full Paper Submission
Good luck with the final changes!
12.02.2026 17:45
We're thrilled to share that the Call for Workshops for this year's @rl-conference.bsky.social is now live!
As Workshop co-chair (alongside the wonderful Raksha Kumaraswamy and @claireve.bsky.social), I am looking forward to seeing the workshop proposals we receive.
LINK IN NEXT POST
13.02.2026 21:50
🧵 Accepted at @iclr-conf.bsky.social!
Target networks stabilize bootstrapping in RL 🛡️
But induce slow-moving targets 🐢
Online networks adapt fast ⚡
But can diverge with function approximation 💥
Our method uses the online network *only if it can*, yielding faster *and* more stable RL.
Here's how 👇
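The actual rule is in the paper (arxiv.org/abs/2510.02590, listed later in this feed); as a purely hypothetical illustration of the flavor, with a made-up closeness test standing in for the real criterion:

```python
import torch

def td_targets(online_q, target_q, batch, gamma=0.99, tol=1.0):
    # Hypothetical sketch, NOT the paper's actual rule: bootstrap from
    # the fast online network only when its estimate stays close to the
    # slower, more stable target estimate; otherwise fall back.
    with torch.no_grad():
        q_online = online_q(batch.next_obs).max(dim=-1).values
        q_target = target_q(batch.next_obs).max(dim=-1).values
        safe = (q_online - q_target).abs() <= tol  # made-up criterion
        bootstrap = torch.where(safe, q_online, q_target)
        return batch.reward + gamma * (1.0 - batch.done) * bootstrap
```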
11.02.2026 17:02
The Reinforcement Learning workshop at U Mannheim was a lot of fun and highly recommended if you are looking for an engaging exchange of ideas. Thanks to the organizers: Leif Döring, @theo-vincent.bsky.social, @claireve.bsky.social, and Simon Weißmann! www.wim.uni-mannheim.de/doering/conf...
08.02.2026 19:13
9/9
Many thanks to my co-authors: @yogesh1q2w.bsky.social, Tim Faust, Abdullah Akgül, Yaniv Oren, Melih Kandemir, @jan-peters.bsky.social, and Carlo D'Eramo
and to the funding agencies: @ias-tudarmstadt.bsky.social, @tuda.bsky.social, @dfki.bsky.social, and @hessianai.bsky.social
05.02.2026 16:37
8/9
Does it work in other settings?
YES, we also report results:
- with the IMPALA architecture
- on offline experiments
- on continuous control experiments with the Simba architecture (only on the poster) 🤖
📄 arxiv.org/pdf/2506.04398
05.02.2026 16:37
7/9
By forcing the network to learn multiple Bellman backups in parallel, iS-DQN (K>1) builds richer features 💪
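In symbols (my notation, reconstructed from this thread rather than copied from the paper), head k regresses onto the Bellman backup of head k-1, so a single network represents K consecutive applications of the Bellman operator:

```latex
\mathcal{L}(\theta) \;=\; \sum_{k=1}^{K}
  \mathbb{E}\left[ \Big( r + \gamma \max_{a'} \bar{Q}_{k-1}(s', a')
  \;-\; Q_k(s, a; \theta) \Big)^{2} \right]
```

Here \bar{Q}_{k-1} denotes a stop-gradient copy, with \bar{Q}_0 played by the shared-feature target head.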
05.02.2026 16:37
6/9
By adding extra heads that learn the subsequent Bellman backups (iS-DQN K>1), iS-QN improves performance without significantly increasing the memory footprint.
Note: we add layer normalization to further increase stability.
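A minimal PyTorch sketch of such a network, assuming a toy fully connected torso (the sizes, and the placement of the LayerNorm, are my guesses, not the paper's exact architecture):

```python
import torch.nn as nn

class ISQNNetwork(nn.Module):
    # Illustrative sketch: one shared torso with LayerNorm, plus K
    # cheap linear heads that learn consecutive Bellman backups.
    def __init__(self, obs_dim, n_actions, feat_dim=512, k=3):
        super().__init__()
        self.torso = nn.Sequential(
            nn.Linear(obs_dim, feat_dim),
            nn.ReLU(),
            nn.LayerNorm(feat_dim),  # the added normalization for stability
        )
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, n_actions) for _ in range(k)
        )

    def forward(self, obs):
        phi = self.torso(obs)
        return [head(phi) for head in self.heads]  # K Q-value estimates
```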
05.02.2026 16:37
5/9
Interestingly, the idea of sharing the last features (iS-DQN K=1) already reduces the performance gap between target-free DQN (TF-DQN) and target-based DQN (TB-DQN) on 15 Atari games by a large margin.
05.02.2026 16:37
4/9
Then, we can draw on the target-based literature to enhance training stability.
We enrich the classical TD loss with iterated Q-learning, learning consecutive Bellman backups to increase the feedback on the shared layers.
This leads to the iterated Shared Q-Network (iS-QN).
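A rough sketch of the resulting loss, reusing the toy ISQNNetwork above; the bootstrapping conventions (max over actions, detached backups) are standard DQN choices, not necessarily the paper's exact ones:

```python
import torch
import torch.nn.functional as F

def isqn_loss(net, target_head, batch, gamma=0.99):
    # Sketch (my notation): head k regresses onto the backup of head
    # k-1; the detached target head anchors the first backup.
    qs = net(batch.obs)  # list of K Q-value tensors, one per head
    with torch.no_grad():
        phi_next = net.torso(batch.next_obs)
        prev = [target_head(phi_next)] + [h(phi_next) for h in net.heads[:-1]]
    loss = 0.0
    for q, q_prev in zip(qs, prev):
        bootstrap = q_prev.max(dim=-1).values
        y = batch.reward + gamma * (1.0 - batch.done) * bootstrap
        pred = q.gather(1, batch.action.unsqueeze(1)).squeeze(1)
        loss = loss + F.mse_loss(pred, y)
    return loss
```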
05.02.2026 16:37
3/9
Our main idea is to use a copy of the online network's last linear layer as the target network, sharing all the remaining features with the online network.
This drastically reduces the memory footprint: only the last linear layer of the online network is stored as a copy.
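A back-of-the-envelope comparison with made-up layer sizes shows why copying only the head is so cheap:

```python
# Hypothetical sizes: a torso with one 512x512 hidden layer and an
# 18-action head (Atari-like).
torso = 512 * 512 + 512   # weights + biases of the shared layers
head = 512 * 18 + 18      # weights + biases of the last linear layer
print(torso + head)       # full target copy: 271,890 extra parameters
print(head)               # last-layer copy only: 9,234 extra parameters
```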
05.02.2026 16:37
2/9
Many recent works have shown that removing the target network leads to a performance decrease.
Even methods that were initially introduced without a target network benefit from its reintegration.
05.02.2026 16:37
1/9
With function approximation, bootstrapping without using a target network often leads to training instabilities.
However, using a target network slows down reward propagation and doubles the memory footprint dedicated to Q-networks.
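Concretely, the two classic choices differ only in which parameters the bootstrap queries:

```latex
y_{\text{TB}} = r + \gamma \max_{a'} Q(s', a'; \bar{\theta}), \qquad
y_{\text{TF}} = r + \gamma \max_{a'} Q(s', a'; \theta)
```

where \bar{\theta} is a periodically synchronized copy of \theta: storing it doubles the Q-network memory, and freezing it between syncs is what slows reward propagation.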
05.02.2026 16:37
TL;DR: Instead of using a full copy of the online network as the target network, we use a copy of only its last linear layer, sharing the other features with the online network 💡
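A minimal sketch of this TL;DR in PyTorch, assuming a toy fully connected torso (sizes and details are illustrative, not the paper's exact implementation):

```python
import copy
import torch.nn as nn

class SharedFeatureTarget(nn.Module):
    # Online and target Q-functions share one torso; only the last
    # linear layer is duplicated, so the "target network" costs one
    # extra linear layer instead of a full copy.
    def __init__(self, obs_dim, n_actions, feat_dim=512):
        super().__init__()
        self.torso = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())
        self.online_head = nn.Linear(feat_dim, n_actions)
        self.target_head = copy.deepcopy(self.online_head)
        for p in self.target_head.parameters():
            p.requires_grad_(False)  # targets never receive gradients

    def forward(self, obs):
        phi = self.torso(obs)
        # Detach so target values do not backprop through the torso.
        return self.online_head(phi), self.target_head(phi.detach())

    def sync(self):
        # Refresh the cheap last-layer copy, as in a classic target update.
        self.target_head.load_state_dict(self.online_head.state_dict())
```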
05.02.2026 16:37
Should we use a target network in deep value-based RL? 🤔
The answer has always been YES or NO, as there are pros and cons to both.
At @iclr-conf.bsky.social, I will present iS-QN, a method that lies between these two extremes, collecting the pros while reducing the cons.
05.02.2026 16:37
🥳 Our paper "Floating-Base Deep Lagrangian Networks (FeLaN)" has been accepted to #ICRA2026.
FeLaN: a grey-box approach for physically consistent SysID of floating-base robots (humanoids, quadrupeds).
📄 arxiv.org/abs/2510.17270
💻 Soon!
🌐 schulze18.github.io/felan_website/
03.02.2026 16:29
Ahmed Hendawy, Henrik Metternich, Théo Vincent, Mahdi Kallel, Jan Peters, Carlo D'Eramo: Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning https://arxiv.org/abs/2510.02590 https://arxiv.org/pdf/2510.02590 https://arxiv.org/html/2510.02590
06.10.2025 06:32
🤖 Announcing the 3rd workshop on Reinforcement Learning in Mannheim 🤖
We have an amazing lineup of speakers: @Mathieugeist, @gio_ramponi, Theresa Eimer, @SarahKeren_, @araffin2, @c_rothkopf, and @AdrienBolland
⏰ Friday 6th February
📍 University of Mannheim
02.12.2025 11:45
New #J2C Certification:
Iterated $Q$-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning
Thรฉo Vincent, Daniel Palenicek, Boris Belousov, Jan Peters, Carlo D'Eramo
https://openreview.net/forum?id=Lt2H8Bd8jF
#reinforcement #iterative #iterations
27.10.2025 08:23
Théo Vincent - Optimizing the Learning Trajectory of Reinforcement Learning Agents
YouTube video by Cohere
If you could not attend, here is a recorded version of my talk: youtube.com/watch?v=RCA2... 📽️
19.09.2025 16:08
As usual, @ewrl18.bsky.social was a wonderful experience.
I had the pleasure of presenting my research as a Contributed Talk.
Special thanks to the organizers for making it happen!
19.09.2025 16:08
Looking forward to @rl-conference.bsky.social!
I will be presenting 4 posters. Feel free to come and exchange with me during the conference, at the Finding the Frame workshop, or at the Inductive Biases workshop.
04.08.2025 14:58
Théo Vincent - Optimizing the Learning Trajectory of Reinforcement Learning Agents
YouTube video by Cohere
Had an amazing time presenting my research at @cohereforai.bsky.social yesterday 🤗
In case you could not attend, feel free to check it out 👇
youtu.be/RCA22JWiiY8?...
19.07.2025 07:41
🤗 Very excited to give a talk at @cohereforai.bsky.social next Friday 🤗
I will be presenting the research I have been working on for the last 2 years with Carlo D'Eramo, @jan-peters.bsky.social, and many more collaborators!
11.07.2025 16:17
IAS is at RLDM 2025! We have many exciting works to share (see 👇), so come to our posters and talk to us!
12.06.2025 14:55
Sparse network -> sparse poster
I will be presenting Eau De Q-Network today at @rldmdublin2025.bsky.social. Feel free to come and exchange at Poster #28 🤗
bsky.app/profile/theo...
12.06.2025 13:39
Excited to present our latest work at RLDM 2025! If youโre curious about tactile sensing, active perception, or RL in robotics, stop by my poster. Hereโs what weโve been up to:
🧵
#Robotics #TactileSensing #ReinforcementLearning #Transformers #ActivePerception @ias-tudarmstadt.bsky.social
12.06.2025 12:33
It was amazing to work on this project with Tim Faust, Yogesh Tripathi, @jan-peters.bsky.social, and Carlo D'Eramo!
Thanks to the funding agencies @ias-tudarmstadt.bsky.social, @cs-tudarmstadt.bsky.social, @dfki.bsky.social, @hessianai.bsky.social, and @uni-wuerzburg.de
09.06.2025 14:56
Very excited to present Eau De Q-Network on Thursday at @rldmdublin2025.bsky.social, Poster #28
Eau De Q-Network gradually prunes the network weights at the agent's learning pace, ultimately reaching a final sparsity level that is discovered by the algorithm!
📰 arxiv.org/pdf/2503.01437
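As a purely hypothetical illustration of such a schedule (the real pacing signal and stopping criterion are in the paper), magnitude pruning scaled by a learning-progress measure might look like:

```python
import torch

def prune_smallest(weights, progress, max_frac=0.05):
    # Hypothetical sketch of pruning "at the agent's learning pace":
    # zero out a fraction of the smallest-magnitude weights, scaled by
    # some learning-progress signal in [0, 1]. Not the actual
    # Eau De Q-Network criterion.
    frac = max_frac * float(progress)
    with torch.no_grad():
        for w in weights:
            k = int(frac * w.numel())
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            w.mul_((w.abs() > threshold).to(w.dtype))
```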
09.06.2025 14:54