I'll be presenting this at the 4:30 poster session today (Thursday) at #205.
Come say hi!
This work was a lot of fun and was done in collaboration with my advisors Antonio Torralba and @vincentsitzmann.bsky.social. Please see our paper for more details and results.
We hope this work inspires future research in this area!
And please share your favorite images from our gallery!!
6/6
CLIP
DINO-v2
EVA-02
MoCo-v3
We distill images with different models, each yielding a distinct style that hints at how these models "see."
Please see our gallery to browse all our images from many datasets (incl. ImageNet, Stanford Dogs, CUB 200, Flowers-102, Food-101): linear-gradient-matching.github.io/gallery/
5/6
Sample of Stanford Dogs distilled with DINO-v2
Our method outperforms all real-image baselines on the standard ImageNet benchmarks and shines even brighter on datasets with fine-grained classes!
The learned images seem to contain more discriminative features than any single real image, leading to a better classifier.
4/6
Diagram describing Linear Gradient Matching
We directly learn our synthetic images such that they induce gradients similar to those of the real data when training a linear classifier.
Our meta loss is simply the distance between these gradients!
Critically, we also parameterize our images as pyramids as a form of implicit regularization.
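The gradient-matching objective above can be sketched in a few lines. This is a simplified illustration, not the paper's code: all names are hypothetical, the "backbone features" are random stand-ins, and the pyramid parameterization and backpropagation into the synthetic pixels are omitted.

```python
import numpy as np

def linear_head_grad(feats, labels, W):
    """Gradient of softmax cross-entropy w.r.t. a linear head W.

    feats: (n, d) frozen backbone features; labels: (n,) int classes; W: (d, c).
    """
    logits = feats @ W
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)            # softmax probabilities
    y = np.eye(W.shape[1])[labels]               # one-hot targets
    return feats.T @ (p - y) / len(feats)        # (d, c) gradient

# Toy stand-ins: random "features" of real images vs. one synthetic image per class.
rng = np.random.default_rng(0)
d, c = 8, 3
W = rng.normal(size=(d, c))                      # current linear-head weights
real_feats = rng.normal(size=(30, d))
real_labels = rng.integers(0, c, size=30)
syn_feats = rng.normal(size=(c, d))              # 1 image per class (distilled)
syn_labels = np.arange(c)

g_real = linear_head_grad(real_feats, real_labels, W)
g_syn = linear_head_grad(syn_feats, syn_labels, W)
meta_loss = np.sum((g_real - g_syn) ** 2)        # distance between gradients
```

In the actual method this meta-loss would be minimized over the synthetic images themselves (differentiating through the gradient computation); here it is only evaluated once to show the objective's shape.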
3/6
While prior work focuses on distilling images to train models from scratch, that task becomes infeasible at extremely small support sizes.
Instead, we focus on learning images to train *linear classifiers* on top of pre-trained models, a more relevant task in the era of foundation models.
2/6
Happy to finally share our latest work on Dataset Distillation!
"Dataset Distillation for Pre-Trained Self-Supervised Vision Models," set to appear at #NeurIPS 2025!
We learn 1 image per class to train linear heads for pre-trained models.
linear-gradient-matching.github.io
More in thread 🔽
Our afternoon session is starting soon at 1:30!
Be sure to stick around for our invited speakers, including Wei-Chiu Ma as a recent addition!
#ICCV2025 @iccv.bsky.social
We're accepting non-archival submissions to our #ICCV workshop until August 29th!
curateddata.github.io
Please send us all your data-centric ideas, including in-progress works, accepted papers from the main conference, or those still under review elsewhere!
See you there! @iccv.bsky.social
Just one week left until the deadline for the *archival* track at our workshop!
If you don't wish to have your paper published in the proceedings, you still have until late August to submit to the non-archival track!
We'd love to see all your interesting data-based work, so please submit!!
(tagging @csprofkgd.bsky.social @iccv.bsky.social for visibility)
We also have a fantastic lineup of invited speakers, including @sarameghanbeery.bsky.social (MIT), Alyosha Efros (Berkeley), Olga Russakovsky (Princeton), and Zhuang Liu (Princeton).
We're looking forward to a great workshop and hope to see all of you there! 🌴
3/
We're accepting both short (4-page) and long-form (8-page) papers on any topic related to data curation.
The deadline for long papers wishing to be published in the ICCV workshop proceedings is July 7; all other submissions have until August 29.
Please reply or DM me with any questions!
2/
We are happy to announce the #ICCV2025 Workshop on Curated Data for Efficient Learning!
Our workshop focuses on all things involving the curation of datasets, including synthetic data generation, sample filtering, and dataset distillation.
curateddata.github.io
1/
@csprofkgd.bsky.social Suggested penalty for #CVPR2025 figures that aren't colorblind-friendly?