Looking ahead: We are working on hosting a competition in 2026. We want to see different policies and hardware setups compete head-to-head in the same arena. Let's put them to the real test!
We've also released some "Quality of Life" tools to streamline your workflow:
⚡ Ultra-fast norm-stats computation (significantly faster than the official LeRobot implementation!)
🛠️ Micro-tools for LeRobot dataset manipulation
🎮 DAgger support
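A plausible way to get "ultra-fast" normalization stats is a single vectorized pass that accumulates per-dimension sums and squared sums; a minimal sketch of the idea (function and variable names are illustrative, not the actual repo API):

```python
import numpy as np

def compute_norm_stats(batches):
    """Single-pass per-dimension mean/std over streamed batches.

    Accumulates vectorized sums and squared sums instead of looping
    per sample; float64 accumulators keep the naive formula stable
    enough for a sketch.
    """
    n = 0
    s = None   # running sum per dimension
    sq = None  # running sum of squares per dimension
    for batch in batches:  # each batch: (B, D) array, e.g. joint states
        batch = np.asarray(batch, dtype=np.float64)
        if s is None:
            s = batch.sum(axis=0)
            sq = np.square(batch).sum(axis=0)
        else:
            s += batch.sum(axis=0)
            sq += np.square(batch).sum(axis=0)
        n += batch.shape[0]
    mean = s / n
    var = np.maximum(sq / n - np.square(mean), 0.0)  # population variance
    return mean, np.sqrt(var)
```

For long datasets or extreme value ranges, a Welford-style update would be numerically safer than the sum-of-squares shortcut above.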
Our repo includes code, data, hardware manuals, and inference setups for AgileX (Songling) and Ark (rolling out gradually). We really hope to bring the reproducibility standards of the CV community into this space. 🔧
Huge shoutout to the entire team and everyone involved behind the scenes!
KAI0 is going fully open-source this week.
📄 Paper: arxiv.org/abs/2602.09021 💻 Code: github.com/OpenDriveLab...
[5/5] Bottom Line
• Not all robot data is equally valuable
• Fast iteration > brute-force scaling
• Weight-space merging can outperform joint training
• Stage-aware advantage estimation helps long-horizon tasks
Full report: Q1 2026
📦 Data + checkpoints + challenge: 2026
[4/5] Problem: Long-Horizon Credit Assignment
6-minute tasks. Which actions actually helped?
Solution → Stage Advantage:
• Decompose tasks into semantic stages
• Predict the advantage directly (not a value difference)
• Smoother supervision, less error compounding
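One way to read "predict advantage directly": regress a stage-level target computed against a per-stage baseline, rather than differencing a learned value function. A hypothetical sketch of such a target, not the paper's exact estimator:

```python
import numpy as np

def stage_advantage_targets(stage_returns):
    """Stage-level advantage targets.

    stage_returns: (E, S) array holding the return achieved in each of S
    semantic stages for E episodes. A stage's advantage is its return
    minus the mean return of that stage across episodes (a per-stage
    baseline); a policy head can then regress these targets directly,
    avoiding the compounding error of per-step value differences.
    """
    stage_returns = np.asarray(stage_returns, dtype=np.float64)
    baseline = stage_returns.mean(axis=0, keepdims=True)  # one baseline per stage
    return stage_returns - baseline
```

Because the target is constant within a stage, the supervision signal changes only at stage boundaries, which matches the "smoother supervision" bullet above.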
[3/5] Problem: Expensive Iteration
Collect new data → Retrain everything → Repeat
Slow and expensive.
How? Model Arithmetic:
• Train only on new data
• Merge via weight interpolation
• Merged model > full-dataset model
Models trained separately preserve distinct modes.
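Weight interpolation as described reduces to a per-tensor linear blend of two checkpoints; a minimal sketch, assuming both checkpoints share one architecture (names are illustrative):

```python
import numpy as np

def merge_policies(weights_a, weights_b, alpha=0.5):
    """Weight-space merging: per-tensor linear interpolation
    merged = (1 - alpha) * A + alpha * B.

    Assumes both checkpoints come from the same architecture
    (identical parameter names and shapes); weights here are
    dicts mapping parameter names to arrays.
    """
    assert weights_a.keys() == weights_b.keys(), "architectures must match"
    return {k: (1.0 - alpha) * weights_a[k] + alpha * weights_b[k]
            for k in weights_a}
```

In practice alpha would be tuned on a held-out task mix; merging after training separate models, instead of jointly retraining on the union of data, is what lets each model's distinct modes survive.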
[2/5] Problem: Distribution Mismatch
Training data ≠ Model behavior ≠ Real-world execution
This gap causes failures.
Solution → Mode Consistency:
• DAgger for failure recovery
• Augmentation for coverage
• Inference smoothing for clean execution
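The DAgger bullet boils down to relabeling learner-visited states with expert actions, so the training distribution matches the states the policy actually reaches. A minimal sketch of the aggregation step (function names are illustrative, not a specific LeRobot API):

```python
def aggregate_dagger_data(learner_states, expert_label, dataset):
    """DAgger-style data aggregation.

    learner_states: observations visited while the *learner* was driving
    the rollout. expert_label: callable mapping an observation to the
    *expert's* (e.g. teleop) action. Each visited state is relabeled with
    the expert action and appended, so failure and recovery states enter
    the training set instead of only clean demonstrations.
    """
    for obs in learner_states:
        dataset.append((obs, expert_label(obs)))
    return dataset
```

Retraining on the aggregated set then closes the loop; repeating the rollout-relabel-retrain cycle is what gives DAgger its failure-recovery behavior.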
🧥 Live-stream robotic teamwork that folds clothes. 6 garments in 3 minutes straight.
π₀ = 20 hrs of data + 8 A100s + 3 key insights:
- Mode Consistency: align your distributions
- Model Arithmetic: merge, don't retrain
- Stage Advantage: pivot wisely
mmlab.hk/research/kai0 (check out the 3-min demo)
@cvprconference.bsky.social
HERE WE GO! Join us at CVPR 2025 for a full-day tutorial: "Robotics 101: An Odyssey from a Vision Perspective"
🗓️ June 12 • 📍 Room 202B, Nashville
Meet our incredible lineup of speakers covering topics from agile robotics to safe physical AI at: opendrivelab.com/cvpr2025/tut...
#cvpr2025
Thanks for sharing! I will host the workshop for the whole day, and anyone wrestling with the current embodied AI trend is welcome to visit, chat, and exchange ideas! We want to hear opposing opinions from vision and robotics people on the topic of autonomy.
When at @cvprconference.bsky.social, a major challenge is how to split yourself across the super amazing workshops.
I'm afraid to announce that w/ our workshop on "Embodied Intelligence for Autonomous Systems on the Horizon" we will make this choice even harder: opendrivelab.com/cvpr2025/wor... #cvpr2025
Wonderful end-to-end driving benchmark! We are getting **closer and closer** to **closed-loop** evaluation in the real world!
@katrinrenz.bsky.social @kashyap7x.bsky.social @andreasgeiger.bsky.social @hongyang.bsky.social @opendrivelab.bsky.social
DriveLM got 1k stars on GitHub, my first project to reach such a milestone. Great thanks to all my collaborators, who contributed so much to this project; many thanks to the community for participating and offering better insights on this dataset; and I hope this is not the end!
Fun fact: the second character in my last name is ๐ as well.
Thanks for sharing! We long to know whether we can improve e2e planners with limited but online data and compute, as performance seems to plateau with more training data. However, online failure cases remain unexplored, since they couldn't directly contribute to model performance under previous training schemes.
Random thoughts today: in humanoid research, the methodology is basically decided by the final tasks/demos you would like to show off.
Previewing UniAD 2.0
A milestone upgrade on the codebase of the #CVPR2023 best paper UniAD.
Check out this branch: github.com/OpenDriveLab..., and we will get you more details soon.
This year, we're bringing you three thrilling tracks in Embodied AI and Autonomous Driving, with a total prize pool of $100,000! Now get ready and join the competition!
Visit the challenge website: opendrivelab.com/challenge2025
And more on #CVPR2025: opendrivelab.com/cvpr2025
Thanks to all the staff who worked hard to make it happen! We'd love to hear your feedback.
For 1, we may need a "greatest common divisor" among tasks/algorithms/embodiments.
For 2, retargeting seems to be the most critical issue.
For 3, should we follow sample-efficient RL or VLM-based e2e methods?
Random thoughts (again) on:
1. Benchmark & Evaluation & Metrics
2. Data collection (especially tele-op)
3. Policy network architecture & training recipe.
Random thoughts today: the situation in humanoids now resembles autonomous driving back in 2020 or so: different hardware setups, a preference for RL-based planning and sim2real deployment, etc. Will humanoids follow a development curve similar to driving?
We implemented undo in @rerun.io by storing the viewer state in the same type of in-memory database we use for the recorded data. Have a look (sound on!)