🔍 Best discoveries of 2025:
🖊️ Comic: In — Will McPhail
📚 Book: Mémoires d’une jeune fille rangée — Simone de Beauvoir
🎞️ Movie: Lost in Translation
📕 Manga: Vagabond
✨ My 2025 best of:
📺 Series: Adolescence
🎬 Animated movie: KPop Demon Hunters
📖 Book: Onyx Storm — Rebecca Yarros
🎵 Music: ESTL
📄 Papers:
1️⃣ Flow Matching Policy Gradients
2️⃣ VGGT
3️⃣ TRELLIS
Check out our 2025 highlights in computer vision!
🚀 Five new *St3R models (MASt3R-SfM, MUSt3R, PanSt3R, HAMSt3R, HOSt3R)
🤩 Anny, a parametric 3D human model (Apache 2.0)
🤟 A universal encoder for an all-in-one vision FM
Watch the highlights 👇
More info ▶️ tinyurl.com/muvs5vnu
I will be in Hong Kong until Thursday for SIGGRAPH ASIA 25 \o/
DM if you want to meet!
Given N image generation jobs, can we do better than N calls to text-to-image? @daledecatur.bsky.social proposes to share compute across a batch of jobs, achieving higher efficiency at similar quality.
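For intuition only (this is not the paper's method, just the baseline it improves on): even naive batching already beats N sequential pipeline calls, since the denoiser runs once per step over the whole batch. A minimal sketch with Hugging Face diffusers:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a standard text-to-image pipeline once, reuse for all jobs.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompts = ["a red bicycle", "a blue bicycle", "a green bicycle"]

# One batched call: the denoising network processes all N jobs at each
# step, instead of N full sequential pipeline runs.
images = pipe(prompt=prompts, num_inference_steps=30).images
for i, img in enumerate(images):
    img.save(f"job_{i}.png")
```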
Check out our #ICCV2025 poster #153 today during Poster Session #4 from 2:45-4:45 HST!
Likewise, I am a big fan!
Man-made objects are often repeated in urban scenes. 🎳 Can we leverage these repetitions to improve 3D reconstruction 📷? Exploration led by the titan Nicolas Violante Grezzi 👇🧵
Great opportunity! This is a dream team, and they are located 20 minutes from Paris.
OpenAI Ghibli style + the new FramePack (ControlNet team). I am very impressed by this model, and it was super easy to run. Is this a commoditization moment for video GenAI?
Would anyone know the best current code for human keypoint estimation from a video of a single human?
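One off-the-shelf option (my suggestion, not an answer from the thread) is MediaPipe Pose, which tracks 33 body keypoints for a single person per frame:

```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

cap = cv2.VideoCapture("person.mp4")  # hypothetical input video
with mp_pose.Pose(static_image_mode=False, model_complexity=1) as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV decodes frames as BGR.
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            # 33 landmarks, each with normalized x, y, depth z, visibility.
            nose = results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE]
            print(nose.x, nose.y, nose.visibility)
cap.release()
```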
Proposal: reviewers who have not given any sign of life to the AC get an automatic flag on the rebuttal of the papers they submitted, to be considered at the discretion of the reviewers of those papers.
Best of 2024?
Movies: Perfect Days (runner-up: Anora)
Series: 3 Body Problem
Animated series: Arcane
Research paper: DUSt3R
Manga: Oshi no Ko
What about you?
From a few user clicks to 3D material segmentation - in seconds ⌛. It's exciting to see so many pieces in 3D generation and analysis starting to work reliably and fast! Super nice work from Michael and team (mfischer-ucl.github.io)
This work was led by Amir Barda, in collaboration with Matheus @gadelha.bsky.social, Noam Aigerman @noamiko.bsky.social, Vova Kim @vovakim.bsky.social, and Amit Bermano.
Check out our paper for more details 📜: arxiv.org/abs/2412.00518
7/end
In the end, with the editing tool becoming FAST 🐇, 3D editing becomes really FUN to play with! 6/
Do we teach inpainting to a multiview backbone 🤔? Or do we teach multiview to an inpainting backbone? We show that the latter is much better: multiview is easier to learn than inpainting. 5/
Now, all we need is a multiview inpainting model 😅. How do we train one? Data is always king. We know inpainting masks can't be random; they need to be realistic, close to what users would actually do. We propose 3 strategies, defined in 3D, to create masks on Objaverse assets that closely resemble real user edits. 4/
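The post doesn't spell out the 3 strategies, but the key property is that the mask lives in 3D, so its 2D projections are automatically consistent across views. A hypothetical sketch of that principle (splat a 3D ball region into every camera; a real implementation would rasterize geometry instead):

```python
import numpy as np

def project(P, X):
    """Project Nx3 world points with a 3x4 projection matrix -> Nx2 pixels."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]

def multiview_masks(cameras, center, radius, h, w, n=20000):
    """Splat a 3D ball into a binary mask in each view.

    Because the masked region is defined in 3D, the per-view masks
    are consistent with each other by construction."""
    d = np.random.randn(n, 3)
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    pts = center + radius * np.cbrt(np.random.rand(n, 1)) * d  # uniform in ball
    masks = []
    for P in cameras:
        uv = np.round(project(P, pts)).astype(int)
        ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        m = np.zeros((h, w), dtype=bool)
        m[uv[ok, 1], uv[ok, 0]] = True
        masks.append(m)
    return masks
```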
However, SDS remains slow and brittle 🐢💥. Instead, we propose to cast the problem of 3D inpainting as 2D *multiview* inpainting 📸-📸-📸-📸. This is possible thanks to off-the-shelf pre-trained transformer models (LRMs), which reconstruct multiview images back into meshes, Gsplats, and NeRFs. Great! 3/
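The overall recipe, as I read it (function names below are placeholders, not the paper's API):

```python
from typing import List
import numpy as np

def inpaint_views(views: List[np.ndarray], masks: List[np.ndarray],
                  prompt: str) -> List[np.ndarray]:
    """Placeholder: a multiview-aware 2D inpainting model."""
    raise NotImplementedError

def lrm_reconstruct(views: List[np.ndarray]):
    """Placeholder: an off-the-shelf LRM lifting images to a mesh/Gsplat/NeRF."""
    raise NotImplementedError

def edit_3d(views, masks, prompt):
    # 1. Inpaint the masked region coherently across all rendered views.
    edited = inpaint_views(views, masks, prompt)
    # 2. Feed-forward reconstruction lifts the edited views back to 3D,
    #    replacing slow per-asset SDS optimization.
    return lrm_reconstruct(edited)
```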
There have been previous attempts to tackle generative mesh editing. Check out Amir Barda's talk on MagicClay this Thursday at SIGGRAPH Asia, Japan 🇯🇵, using SDS. 2/
🌟 Text-to-3D is awesome! But how do we iterate on the generated 3D model to get just the right result? Do we tweak the prompt endlessly? Revert to traditional 3D modeling techniques?
We propose a solution: "3D inpainting" 🤩🎨
Project: amirbarda.github.io/Instant3dit....
A thread. 🧵 1/