Chonghao Sima

@chonghaosima

Ph.D. student at HKU. Researcher in computer vision, autonomous driving, and robotics (starter). Hobbyist in hiking, J-pop, scenery photography, and anime.

64
Followers
189
Following
22
Posts
02.12.2024
Joined

Latest posts by Chonghao Sima @chonghaosima

🏆 Looking ahead: We are working on hosting a competition in 2026. We want to see different policies and hardware setups compete head-to-head in the same arena. Let's put them to the real test!

11.02.2026 16:53 👍 0 🔁 0 💬 0 📌 0

We’ve also released some "Quality of Life" tools to streamline your workflow: ⚡ Ultra-fast compute norm state (Significantly faster than the official LeRobot implementation!) 🛠️ Micro-tools for LeRobot dataset manipulation 🎮 DAgger support

11.02.2026 16:52 👍 0 🔁 0 💬 0 📌 0

Our repo includes code, data, hardware manuals, and inference setups for AgileX (Songling) and Ark (rolling out gradually). We really hope to bring the reproducibility standards of the CV community into this space. 🔧

11.02.2026 16:52 👍 0 🔁 0 💬 0 📌 0

Huge shoutout to the entire team and everyone involved behind the scenes! 🙌

KAI0 is going fully open-source this week. 🚀

📄 Paper: arxiv.org/abs/2602.09021 💻 Code: github.com/OpenDriveLab...

11.02.2026 16:51 👍 0 🔁 0 💬 3 📌 0

[5/5] Bottom Line

• Not all robot data is equally valuable
• Fast iteration > brute-force scaling
• Weight-space merging can outperform joint training
• Stage-aware advantage estimation helps long-horizon tasks

📄 Full report: Q1 2026
📦 Data + checkpoints + challenge: 2026

24.12.2025 09:25 👍 0 🔁 0 💬 0 📌 0

[4/5] Problem: Long-Horizon Credit Assignment

6-minute tasks. Which actions actually helped?

Solution → Stage Advantage:
• Decompose into semantic stages
• Predict advantage directly (not value-diff)
• Smoother supervision, less error compounding

24.12.2025 09:25 👍 0 🔁 0 💬 1 📌 0
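
For context on the "not value-diff" point above: the first line below is the standard one-step value-difference (TD) advantage estimate. The second line is only an assumed notation for predicting the advantage directly within a stage; the stage index $k$ and the network $A_\phi$ are hypothetical names, not taken from the KAI0 paper.

```latex
% Value-difference estimate: errors in \hat{V} at two states both enter the result.
\hat{A}^{\text{TD}}(s_t, a_t) = r_t + \gamma \hat{V}(s_{t+1}) - \hat{V}(s_t)

% Direct prediction (assumed notation): a single stage-conditioned regressor.
\hat{A}^{\text{direct}}(s_t, a_t) = A_\phi(s_t, a_t, k), \quad k = \operatorname{stage}(t)
```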

[3/5] Problem: Expensive Iteration

Collect new data → Retrain everything → Repeat

Slow and expensive.

How? Model Arithmetic:
• Train only on new data
• Merge via weight interpolation
• Merged model > full-dataset model

Models trained separately preserve distinct modes.

24.12.2025 09:25 👍 0 🔁 0 💬 1 📌 0
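
A minimal sketch of "merge via weight interpolation", assuming plain linear interpolation with a single coefficient `alpha`. The post does not say which merging scheme KAI0 uses, so the function name, the dict-of-lists checkpoint format, and `alpha` are all illustrative assumptions.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def interpolate_weights(base: Weights, new: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * base + alpha * new, parameter by parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if base.keys() != new.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged: Weights = {}
    for name, base_values in base.items():
        new_values = new[name]
        if len(base_values) != len(new_values):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            (1.0 - alpha) * b + alpha * n
            for b, n in zip(base_values, new_values)
        ]
    return merged


# Toy checkpoints: one trained on the old data, one trained only on new data.
base_ckpt = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
new_ckpt = {"layer.weight": [4.0, 2.0], "layer.bias": [3.0]}
merged = interpolate_weights(base_ckpt, new_ckpt, alpha=0.5)
```

With `alpha=0.5` the merged checkpoint is the element-wise midpoint of the two. In practice the coefficient would have to be tuned on held-out rollouts.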

[2/5] Problem: Distribution Mismatch

Training data ≠ Model behavior ≠ Real-world execution

This gap causes failures.

Solution → Mode Consistency:
• DAgger for failure recovery
• Augmentation for coverage
• Inference smoothing for clean execution

24.12.2025 09:25 👍 0 🔁 0 💬 1 📌 0
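
A toy sketch of the DAgger loop named above: let the learner drive, have the expert relabel the states it visits, aggregate, and retrain. The 1-D environment, the lookup-table learner, and every name here are illustrative assumptions, not the KAI0 implementation.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

State = int
Action = int  # -1, 0, or +1
Dataset = List[Tuple[State, Action]]
Policy = Callable[[State], Action]


def expert_policy(state: State) -> Action:
    """Toy expert: always step toward the goal at 0."""
    return (state < 0) - (state > 0)


def fit_policy(dataset: Dataset, default_action: Action = -1) -> Policy:
    """Toy learner: majority vote per state, fixed default on unseen states."""
    votes: Dict[State, Counter] = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    return lambda state: table.get(state, default_action)


def rollout(policy: Policy, start: State, horizon: int) -> List[State]:
    """States visited when the given policy controls the system."""
    states, state = [], start
    for _ in range(horizon):
        states.append(state)
        state += policy(state)
    return states


def dagger(demo_start: State, test_start: State, horizon: int, iterations: int) -> Policy:
    # Iteration 0: plain behavior cloning on expert demonstrations.
    dataset = [(s, expert_policy(s)) for s in rollout(expert_policy, demo_start, horizon)]
    policy = fit_policy(dataset)
    for _ in range(iterations):
        visited = rollout(policy, test_start, horizon)        # learner drives
        dataset += [(s, expert_policy(s)) for s in visited]   # expert relabels
        policy = fit_policy(dataset)                          # retrain on the aggregate
    return policy


# Demonstrations start at +3; deployment starts at -2, which the demos never cover.
bc_policy = dagger(demo_start=3, test_start=-2, horizon=5, iterations=0)
dagger_policy = dagger(demo_start=3, test_start=-2, horizon=5, iterations=3)
```

In this toy setup the behavior-cloned policy drifts away from the goal once it leaves the demonstrated states, while the DAgger policy has seen expert labels on its own mistakes and walks back to 0.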

🧥 Live-stream robotic teamwork that folds clothes. 6 clothes in 3 minutes straight.

χ₀ = 20 hrs data + 8 A100s + 3 key insights:
- Mode Consistency: align your distributions
- Model Arithmetic: merge, don't retrain
- Stage Advantage: pivot wisely

🔗 mmlab.hk/research/kai0 (check out the 3-min demo)

24.12.2025 09:25 👍 6 🔁 1 💬 1 📌 1

@cvprconference.bsky.social

10.06.2025 00:29 👍 1 🔁 0 💬 0 📌 0

🚀 HERE WE GO! Join us at CVPR 2025 for a full-day tutorial: “Robotics 101: An Odyssey from a Vision Perspective”
🗓️ June 12 • 📍 Room 202B, Nashville

Meet our incredible lineup of speakers covering topics from agile robotics to safe physical AI at: opendrivelab.com/cvpr2025/tut...

#cvpr2025

10.06.2025 00:29 👍 6 🔁 2 💬 1 📌 0

Thanks for sharing! I will host the workshop for the whole day and welcome anyone who is struggling with the current embodied AI trend to visit, chat, and exchange ideas! We want to hear opposing opinions from vision and robotics people on the topic of autonomy.

08.06.2025 03:08 👍 3 🔁 0 💬 0 📌 0

When at @cvprconference.bsky.social a major challenge is how to split yourself for super amazing workshops.
I'm afraid to announce that w/ our workshop on "Embodied Intelligence for Autonomous Systems on the Horizon" we will make this choice even harder: opendrivelab.com/cvpr2025/wor... #cvpr2025

07.06.2025 21:19 👍 14 🔁 4 💬 1 📌 1

Wonderful end-to-end driving benchmark! We are getting **closer and closer** to **closed-loop** evaluation in the real world!

13.04.2025 16:07 👍 1 🔁 0 💬 0 📌 0

@katrinrenz.bsky.social @kashyap7x.bsky.social @andreasgeiger.bsky.social @hongyang.bsky.social @opendrivelab.bsky.social

24.03.2025 16:23 👍 2 🔁 1 💬 0 📌 0

DriveLM got 1k stars on GitHub, my first project to reach such a milestone. Great thanks to all my collaborators who contributed so much to this project, and many thanks to the community members who participated and contributed better insights on this dataset. I hope this is not the end!

24.03.2025 16:23 👍 10 🔁 2 💬 1 📌 1

Fun fact: the second character in my last name is 🐎 as well.

17.03.2025 13:51 👍 3 🔁 0 💬 0 📌 0

Thanks for sharing! We want to know whether we can improve an e2e planner with limited but online data and compute, as performance seems to plateau with more training data. However, online failure cases remain unexplored, as they couldn’t directly contribute to model performance under previous training schemes.

17.03.2025 13:51 👍 1 🔁 0 💬 1 📌 0

Random thoughts today: in humanoid research the methodology is basically decided by the final tasks/demo you would like to show off.

06.03.2025 08:06 👍 1 🔁 0 💬 0 📌 0

🌟 Previewing the UniAD 2.0

🚀 A milestone upgrade on the codebase of the #CVPR2023 best paper UniAD.

👉 Check out this branch github.com/OpenDriveLab..., and we will get you more details soon

05.03.2025 11:54 👍 9 🔁 3 💬 0 📌 0

🚀 This year, we’re bringing you three thrilling tracks in Embodied AI and Autonomous Driving, with a total prize pool of $100,000! Now get ready and join the competition!

Visit the challenge website: opendrivelab.com/challenge2025
And more on #CVPR2025: opendrivelab.com/cvpr2025

03.03.2025 11:44 👍 5 🔁 4 💬 0 📌 0

Thanks to all the staff who worked hard to make it happen! Love to hear your feedback.

30.12.2024 11:40 👍 1 🔁 0 💬 0 📌 0

For 1, we may need a "greatest common divisor" among tasks/algorithms/embodiments.
For 2, retargeting seems to be the most critical issue.
For 3, should we follow sample-efficient RL or VLM-based e2e methods?

11.12.2024 09:32 👍 0 🔁 0 💬 0 📌 0

Random thoughts (again) on:

1. Benchmark & Evaluation & Metrics
2. Data collection (especially tele-op)
3. Policy network architecture & training recipe.

11.12.2024 09:32 👍 0 🔁 0 💬 1 📌 0

Random thoughts today: the situation in humanoids today is similar to autonomous driving back in 2020-ish. Different hardware setups, people more in favor of RL-based planning and sim2real deployment, etc. Will humanoids follow a development curve similar to driving?

04.12.2024 10:45 👍 3 🔁 0 💬 0 📌 0

We implemented undo in @rerun.io by storing the viewer state in the same type of in-memory database we use for the recorded data. Have a look (sound on!)

02.12.2024 15:51 👍 47 🔁 10 💬 3 📌 1