We have open-sourced the code. Try out Point Policy on your robots!
Project page: point-policy.github.io
Arxiv: arxiv.org/abs/2502.20391
Code:
Further, reasoning about key points instead of raw pixels allows Point Policy to generalize to novel object instances and exhibit robustness to heavy scene variations, all while requiring at most 30 demonstrations per task.
Despite having no access to robot demonstrations, Point Policy exhibits an 88% success rate across 8 real-world tasks, a 75% improvement over baselines.
Point Policy uses sparse key points to represent both human demonstrators and robots, bridging the morphology gap. The scene is encoded through semantically meaningful key points from minimal human annotations.
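To make the keypoint-to-action idea concrete, here is a minimal sketch of a policy that consumes a short history of 2D key points and outputs an end-effector action. The transformer backbone, point counts, and dimensions are all assumptions for illustration, not Point Policy's actual architecture:

```python
import torch
import torch.nn as nn

class KeypointPolicy(nn.Module):
    """Hypothetical sketch: map a history of sparse 2D key points
    (hand + object) to a robot end-effector action."""

    def __init__(self, num_points=9, history=10, action_dim=7, hidden=256):
        super().__init__()
        self.embed = nn.Linear(num_points * 2, hidden)
        layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(hidden, action_dim)

    def forward(self, points):
        # points: (batch, history, num_points, 2) pixel coordinates
        B, T, N, _ = points.shape
        tokens = self.embed(points.view(B, T, N * 2))  # one token per timestep
        feats = self.encoder(tokens)
        return self.head(feats[:, -1])  # action from the latest timestep

policy = KeypointPolicy()
actions = policy(torch.randn(1, 10, 9, 2))
print(actions.shape)  # torch.Size([1, 7])
```

Because the input is a handful of coordinates rather than pixels, the same interface works whether the points come from a human hand in a video or a robot gripper, which is what lets a single policy bridge the morphology gap.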
The most frustrating part of imitation learning is collecting huge amounts of teleop data. But why teleop robots when robots can learn by watching us?
Introducing Point Policy, a novel framework that enables robots to learn from human videos without any teleop, sim2real, or RL.
We just released AnySense, an iPhone app for effortless data acquisition and streaming for robotics. We leverage Apple's development frameworks to record and stream:
1. RGBD + Pose data
2. Audio from the mic or custom contact microphones
3. Seamless Bluetooth integration for external sensors
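On the receiving side, a streamed frame could be consumed roughly like this. The wire format here (a length-prefixed JSON message) is purely an assumption for illustration; it is not AnySense's actual protocol:

```python
import json
import socket
import struct

def recv_exact(sock, n):
    """Read exactly n bytes from a socket, handling short reads."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("stream closed early")
        buf += chunk
    return buf

def read_frame(sock):
    """Read one frame: a 4-byte big-endian length, then a JSON payload.
    Hypothetical framing, not the app's real wire format."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length))
```

For example, pose and audio metadata sent as one frame would round-trip through `read_frame` as a plain dictionary on the receiving end.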
Can we extend the power of world models beyond just online model-based learning? Absolutely!
We believe the true potential of world models lies in enabling agents to reason at test time.
Introducing DINO-WM: World Models on Pre-trained Visual Features for Zero-shot Planning.
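The zero-shot planning idea can be sketched as search in the latent space of a learned dynamics model: sample candidate action sequences, roll them forward, and keep the one whose predicted latent lands closest to the goal. The random-shooting planner and the toy encoder/dynamics below are stand-ins for illustration, not DINO-WM's actual code:

```python
import torch

def plan_random_shooting(encode, dynamics, obs, goal_obs, horizon=5,
                         num_samples=256, action_dim=2):
    """Pick the sampled action sequence whose predicted latent rollout
    ends nearest the goal latent (a simple stand-in for CEM/MPC)."""
    z = encode(obs).expand(num_samples, -1)        # start state, tiled
    z_goal = encode(goal_obs)
    actions = torch.randn(num_samples, horizon, action_dim)
    for t in range(horizon):
        z = dynamics(z, actions[:, t])             # latent rollout
    cost = ((z - z_goal) ** 2).sum(dim=-1)         # distance to goal latent
    return actions[cost.argmin()]                  # best (horizon, action_dim)

# Toy stand-ins: identity encoder, frozen dynamics. A real system would
# plug in a pre-trained visual encoder and a learned transition model.
best = plan_random_shooting(lambda o: o.view(1, -1), lambda z, a: z,
                            torch.zeros(4), torch.ones(4))
print(best.shape)  # torch.Size([5, 2])
```

The point of using frozen pre-trained visual features is that only the dynamics model needs training; planning at test time is then pure optimization, with no task-specific reward learning.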
BAKU is fully open source and surprisingly effective. We found it easily adaptable to a host of visuotactile tasks: see visuoskin.github.io
I will be presenting BAKU at the #NeurIPS2024 poster session on Thursday, December 12, from 11 a.m. to 2 p.m. PST at East Exhibit Hall A-C #4206!
Do drop in to chat about efficient robot policy architectures as well as some of the more recent work using BAKU.
P3-PO is a great example of how simple human priors can facilitate significantly better generalizability for robot policies.
All our code and task rollouts have been made public at: point-priors.github.io
Arxiv: arxiv.org/abs/2412.06784
Do try it out on your robots!
Turns out that replacing images with keypoint-based representations enables much better generalization across spatial positions, orientations, and novel object instances! We just released P3-PO, a method for learning generalizable policies from minimal data.
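One way such a keypoint prior can carry over to new scenes is by matching each annotated point to its nearest neighbor in a dense semantic feature map (e.g., features from a pre-trained vision backbone). The function below is a hypothetical sketch of that correspondence step, not P3-PO's actual implementation:

```python
import torch

def transfer_points(ref_feats, ref_points, new_feats):
    """Hypothetical sketch: move annotated points to a new image by
    cosine-similarity matching in a dense feature map.
    ref_feats, new_feats: (C, H, W); ref_points: list of (row, col)."""
    C, H, W = new_feats.shape
    flat = new_feats.view(C, -1)
    flat = flat / flat.norm(dim=0, keepdim=True)     # normalize each location
    matched = []
    for r, c in ref_points:
        q = ref_feats[:, r, c]
        q = q / q.norm()
        sim = q @ flat                               # cosine similarity map
        idx = sim.argmax().item()
        matched.append((idx // W, idx % W))          # back to (row, col)
    return matched
```

Because matching happens in semantic feature space rather than pixel space, the same annotated points can land on a novel instance of the object, which is where the generalization comes from.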
Modern policy architectures are unnecessarily complex. In our #NeurIPS2024 project called BAKU, we focus on what really matters for good policy learning.
BAKU is modular, language-conditioned, compatible with multiple sensor streams & action multi-modality, and importantly fully open-source!