RaySt3R was accepted to NeurIPS! Check out the HuggingFace demo for image to 3D in cluttered scenes huggingface.co/spaces/bartd...
In "hearing the slide" (led by @yuemin-mao.bsky.social) we estimate *loss* of contact with a contact microphone, and use it to learn dynamic constraints. This allows efficiently moving multiple intricate objects at once, even objects that would otherwise be hard to grasp. fast-non-prehensile.github.io
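To make the idea concrete, here is a minimal sketch of one plausible way to flag loss of contact from a contact-microphone stream: a sliding object excites the microphone, so a drop in short-time RMS energy suggests contact was lost. The function name, frame size, and energy floor are all my assumptions, not the paper's actual method.

```python
import numpy as np

def contact_lost(signal, frame=256, energy_floor=1e-3):
    """Flag non-overlapping frames whose RMS energy drops below a floor.

    signal: 1-D array of microphone samples.
    Returns a boolean array, one entry per frame; True means the
    frame is quiet enough that contact was likely lost.
    """
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1))  # short-time energy per frame
    return rms < energy_floor
```

In practice one would band-pass filter the signal and debounce the flag before using it as a learning signal, but the thresholded-energy core is the same.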
The code is also available: github.com/naver/pow3r
Thanks Christian for the advertisement.
github link: github.com/naver/dune
Project Website: rayst3r.github.io
arXiv: arxiv.org/abs/2506.05285
Code: github.com/Duisterhof/...
HF Demo: Coming (very) soon!
@CMU_Robotics @SCSatCMU @nvidia @NVIDIAAI @NVIDIARobotics
Big thanks to the awesome contributors to this project! Jan Oberst, @bowenwen_me, @BirchfieldStan, @RamananDeva and @jeff_ichnowski. Also thanks to OctMAE author @s1wase, @nvidia for sponsoring compute, and the scientists at @naverlabseurope for the inspiration!
We also study the impact of the confidence threshold on reconstruction quality. Our ablations suggest a higher confidence threshold improves accuracy and reduces edge bleeding, at the cost of completeness. Users can tune the threshold for application-specific requirements.
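The accuracy/completeness trade-off amounts to dropping predicted points whose confidence falls below a threshold. A minimal sketch (function name and 0.5 default are my assumptions, not from the paper):

```python
import numpy as np

def filter_by_confidence(points, confidence, threshold=0.5):
    """Keep only points whose predicted confidence exceeds the threshold.

    points: (N, 3) array of predicted 3-D points.
    confidence: (N,) per-point confidence scores.
    A higher threshold keeps fewer, more reliable points: accuracy
    goes up while completeness goes down.
    """
    keep = confidence > threshold
    return points[keep]
```

Sweeping the threshold and plotting accuracy against completeness is one simple way to pick an application-specific operating point.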
We evaluate RaySt3R against the baselines on synthetic and real-world datasets. The results suggest RaySt3R achieves zero-shot generalization to the real world, and outperforms all baselines by up to 44% in 3D chamfer distance.
We train RaySt3R by curating a new dataset, for a total of 12 million views with Objaverse and GSO objects. The ablations suggest that more, and more diverse, data improves RaySt3R's performance. RaySt3R does not require GT meshes, paving the way for training on real-world data.
Our key insight is that 3D object shape completion can be recast as a novel-view synthesis problem. RaySt3R takes a masked RGB-D image as input and predicts depth maps and object masks for novel views. We query multiple views and merge the predictions into a consistent point cloud.
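The merge step above can be sketched as back-projecting each predicted depth map to world space and concatenating the results. This is a simplified illustration, assuming pinhole intrinsics `K` and known camera poses; the function names are mine, not RaySt3R's API.

```python
import numpy as np

def unproject(depth, mask, K, cam_to_world):
    """Back-project a predicted depth map to world-space 3-D points.

    Only pixels where the predicted object mask is True are kept.
    K: 3x3 pinhole intrinsics; cam_to_world: 4x4 camera pose.
    """
    v, u = np.nonzero(mask)
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous
    return (cam_to_world @ pts_cam.T).T[:, :3]

def merge_views(predictions):
    """Merge per-view (depth, mask, K, pose) predictions into one cloud."""
    clouds = [unproject(d, m, K, T) for d, m, K, T in predictions]
    return np.concatenate(clouds, axis=0)
```

A real pipeline would additionally weight or prune points by the predicted confidence before concatenating.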
We focus on multi-object 3D shape completion for robotics. Robots are commonly equipped with an RGB-D camera, but its measurements are noisy and incomplete.
Using only DINOv2 features as pretraining, we train a new model (RaySt3R) to produce accurate geometry.
Imagine if robots could fill in the blanks in cluttered scenes.
Enter RaySt3R: a single masked RGB-D image in, complete 3D out.
It infers depth, object masks, and confidence for novel views, and merges the predictions into a single point cloud. rayst3r.github.io
Do you think Europe will take the opportunity? The Netherlands is even cutting research funds under the new administration... It feels like there are still significantly more opportunities in the US.
Thanks Chris! This was a push with the entire dust3r team @naverlabseurope.bsky.social, congrats everyone!
The Best Student Paper Award goes to MASt3R-SfM! #3DV2025
Excited to share that our paper was a finalist for best paper at #HRI2025! We introduce MOE-Hair, a soft robot system for hair care that uses mechanical compliance and visual force sensing for safe, comfortable interaction. Check out our work: moehair.github.io @cmurobotics.bsky.social 1/7
MUSt3R: Multi-view Network for Stereo 3D Reconstruction
Yohann Cabon, Lucas Stoffl, Leonid Antsfeld, Gabriela Csurka, Boris Chidlovskii, Jerome Revaud, @vincentleroy.bsky.social
tl;dr: make DUSt3R symmetric and iterative, and add a multi-layer memory mechanism -> multi-view DUSt3R
arxiv.org/abs/2503.01661
Great news, CMU's Center for Machine Learning and Health (CMLH) decided to fund another year of our research! If you're a PhD student at CMU, consider applying for the next iterations of the fellowship - the funding is generous and relatively unconstrained :)
Is the book just as good as/better than the show for "The Three-Body Problem"?
Watch Professor Jeff Ichnowski's RI seminar talk: "Learning for Dynamic Robot Manipulation of Deformable and Transparent Objects"
@jeff-ichnowski.bsky.social closed out our Fall seminar series. Keep an eye out for the Spring schedule in the new year!
www.youtube.com/watch?v=DvvF...
Intro Post
Hello World!
I'm a 2nd year Robotics PhD student at CMU, working on distributed dexterous manipulation, accessible soft robots and sensors, sample efficient robot learning, and causal inference.
Here are my cute robots:
PS: Videos are old and sped up. They move slower in the real world :3
My growing list of #computervision researchers on Bsky.
Did I miss you? Let me know.
go.bsky.app/M7HGC3Y
My advisor @jeff-ichnowski.bsky.social! For example: github.com/BerkeleyAuto...
For international students: renewing your visa asap might be a good idea.
My lab mate @yuemin-mao.bsky.social :)
Welcome to all new arrivals here on Bluesky! :) Here's a starter pack of people working on computer vision.
go.bsky.app/PkAKJu5
Now that my general computer vision starter pack is full (150/150 entries reached), here is one specific to 3D vision: go.bsky.app/Cfm9XFe
Check out this work by my lab mates: learning dynamic tasks using a soft robotic hand!
Thank you for making the list! Could you add me as well? I work on vision for robot manipulation :)