Also not excited for the first occurrence of this on OpenReview
Prime example (found on the other platform) of why we should be careful with reward specification / alignment / guardrails / <enter your favorite AI safety topic here>.
How much of this is human guided, and how much is just optimizing the "get PR merged" reward?
github.com/matplotlib/m...
Take a look at our new paper!
We improve sample efficiency and performance in off-policy RL by prioritizing experience with the semantic knowledge of a pre-trained VLM, and not even a very large one!
Glad for the opportunity to work with @eladsharony.bsky.social and @tomjur.bsky.social !
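The thread above describes the idea only at a high level, so here is a minimal sketch of what "prioritizing experience with a VLM" could look like: a replay buffer whose sampling weights come from a semantic relevance score assigned at insertion time. Everything below is an assumption for illustration — `vlm_relevance` and `VLMPrioritizedReplay` are hypothetical stand-ins, not the paper's actual method or API.

```python
import random

def vlm_relevance(observation):
    # Hypothetical stand-in: a real implementation would score the
    # observation's task relevance with a (small) pre-trained VLM.
    return 1.0 + float(observation)

class VLMPrioritizedReplay:
    """Replay buffer that samples transitions proportionally to a
    semantic priority assigned once, when the transition is stored."""

    def __init__(self):
        self.transitions = []
        self.priorities = []

    def add(self, transition, observation):
        self.transitions.append(transition)
        self.priorities.append(vlm_relevance(observation))

    def sample(self, k):
        # Higher-priority (more semantically relevant) transitions are
        # drawn more often, in the spirit of prioritized experience replay.
        return random.choices(self.transitions, weights=self.priorities, k=k)

# Toy usage: three transitions with increasing "relevance".
buffer = VLMPrioritizedReplay()
for i, t in enumerate(["t0", "t1", "t2"]):
    buffer.add(t, observation=i)
batch = buffer.sample(4)
```

Scoring once at insertion (rather than re-querying the VLM at every sample) keeps the VLM cost amortized, which is one plausible reason a small VLM suffices.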
Are there any good robotics and/or RL podcasts still running in 2025?
I used to enjoy The Robot Brains by Pieter Abbeel and TalkRL by @robinchauhan.bsky.social, but I'm open to different styles too!
I find this idea really neat - VLMs are great at describing scenes, but LLMs are better reasoners, so let's use text as an interim representation.
Kind of reminiscent of the bitter lesson, only on a more "local" scale
arxiv.org/abs/2503.15108
Check out our new #ICLR2025 paper: EC-Diffuser leverages a novel Transformer-based diffusion denoiser to learn goal-conditioned multi-object manipulation policies from pixels!
Paper: www.arxiv.org/abs/2412.18907
Project page: sites.google.com/view/ec-diff...
Code: github.com/carl-qi/EC-D...
Also probably an issue of salience bias - you hear about virtually every plane crash and a lot of shootings, but road fatalities rarely make the news.
If you're interested in our take on addressing inverse RL in large state spaces, come meet @filippo_lazzati and @alberto_metelli at poster session 5 at #NeurIPS2024 today (paper: arxiv.org/abs/2406.03812).
That speaks to the lack of good, standardized benchmarks for RL, more than anything else.
(Disclaimer: haven't read the papers yet)
I agree completely. I just think the challenge will remain policy and public perception regarding public transit, same as it is today - just amplified by the effort that's been put into the technology by car manufacturers.
This is actually something that worries me - how can we ensure that all the progress in autonomous driving doesn't just put a lot more single-person cars on the road? And how do we convince people that this isn't an alternative to transit, and that we need to keep investing in it?
Want to learn / teach RL?
Check out new book draft:
Reinforcement Learning - Foundations
sites.google.com/view/rlfound...
W/ Shie Mannor & Yishay Mansour
This is a rigorous first course in RL, based on our teaching at TAU CS and Technion ECE.
Been thinking about building a replacement for the arXiv daily email for a while, this looks like it might save me the trouble :)
Just out of curiosity: what's the action space here?
Let's use the real data to improve the simulators and get better massive, procedurally generated data!
Some papers really feel like a glimpse into the future!
This one also serves as a powerful reminder that a lot of what we're focused on in the AI + robotics space is constrained by the hardware we have.
arxiv.org/abs/2411.11192