We're very happy to share our S3OD (1. Scaling, 2. Synthetic, & 3. Salient Object Detection)! The paper has been accepted at #ICLR2026.
You can get the paper, demo, code, trained models, and dataset on the project page.
s3odproject.github.io
I've learned some important things from the FDSL project series:
1) Simple, procedural synthetic pre-training with automatic labels can rival sophisticated pre-training on image datasets like JFT-300M and ImageNet-21k. (In the paper, ours reaches 83.8 vs. JFT-300M's 84.1 on ImageNet-1k fine-tuning.)
2) Efficient training can be more important than an unlimited amount of visual pre-training data, and it is controllable in synthetic visual pre-training.
3) We should design training datasets that capture the essence of visual learning.
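To make point 1 concrete, here is a minimal sketch of the formula-driven idea behind FractalDB-style pre-training: sample random iterated-function-system (IFS) parameters, treat each parameter set as its own category, and render images with the chaos game, so the label comes from the generating formula rather than human annotation. All function names, parameter ranges, and sizes below are illustrative assumptions, not the project's actual code.

```python
import numpy as np

def sample_class(rng, n_maps=3):
    """Sample random affine maps; the sampled parameters ARE the class label."""
    maps = [(rng.uniform(-0.8, 0.8, (2, 2)), rng.uniform(-0.5, 0.5, 2))
            for _ in range(n_maps)]
    weights = rng.uniform(0.1, 1.0, n_maps)
    return maps, weights / weights.sum()

def render_ifs(params, rng, n_points=20_000, size=64):
    """Render one fractal image from an IFS via the chaos game.

    Because the category is fully defined by `params`, the label is
    obtained automatically -- no human annotation needed.
    """
    maps, weights = params
    img = np.zeros((size, size), dtype=np.float32)
    x = np.zeros(2)
    for _ in range(n_points):
        A, b = maps[rng.choice(len(maps), p=weights)]
        x = A @ x + b  # affine update
        if not np.isfinite(x).all():  # reset if the orbit diverges
            x = np.zeros(2)
            continue
        px, py = ((np.clip(x, -1, 1) + 1) / 2 * (size - 1)).astype(int)
        img[py, px] = 1.0
    return img

rng = np.random.default_rng(0)
classes = [sample_class(rng) for _ in range(10)]  # 10 synthetic categories
data = [(render_ifs(p, rng), label) for label, p in enumerate(classes)]
```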
I published this paper,
"Pre-training Vision Transformer with Formula-driven Supervised Learning,"
after journal paper rejections. This work was actually completed three years ago, but it's worth sharing publicly with the academic community.
arxiv.org/abs/2206.091...
[#CVPR2026 Workshop] Excited to announce that our workshop "Visual General Intelligence (VGI): Vision Research Toward the AGI Era" has been accepted at CVPR 2026!
Please also check out the website & blog!
Website: cvpr2026-vgi-workshop.limitlab.xyz
Blog: hirokatsukataoka.medium.com/vision-resea...
Slides from my #BMVC2025 talk are now available!
hirokatsukataoka.net/temp/presen/...
The slides cover the following papers:
- Industrial Synthetic Segment Pre-training arxiv.org/abs/2505.13099
- S3OD: Towards Generalizable Salient Object Detection with Synthetic Data arxiv.org/abs/2510.21605
Released HanDyVQA, an egocentric video QA benchmark for fine-grained hand-object interaction, with 11.1K QAs and 10.3K segmentation masks across 112 domains.
Even Gemini-2.5-Pro reaches only 73% against a 97% human score, revealing key issues in spatio-temporal tasks.
Project: masatate.github.io/HanDyVQA-pro...
We have publicly shared our "PowerCLIP," a method that aligns powersets of image sub-regions with textual structures for precise image-text recognition.
It outperforms several state-of-the-art methods in zero-shot classification, retrieval, robustness, and compositional tasks!
arxiv.org/abs/2511.23170
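To give a concrete feel for the powerset idea, here is a hedged sketch (not the paper's actual method): enumerate the non-empty subsets of sub-region embeddings, pool each subset, and score it against a text embedding. The function names, mean pooling, and dimensions are assumptions for illustration only.

```python
import itertools
import torch

def powerset_scores(region_feats: torch.Tensor, text_feat: torch.Tensor):
    """Score every non-empty subset of image sub-regions against a text
    embedding. Exponential in the number of regions -- fine for a sketch.
    """
    n = region_feats.shape[0]
    scores = {}
    for r in range(1, n + 1):
        for subset in itertools.combinations(range(n), r):
            pooled = region_feats[list(subset)].mean(dim=0)  # assumed: mean pooling
            pooled = pooled / pooled.norm()
            scores[subset] = torch.dot(pooled, text_feat).item()
    return scores

regions = torch.nn.functional.normalize(torch.randn(4, 512), dim=-1)  # 4 sub-regions
text = torch.nn.functional.normalize(torch.randn(512), dim=0)         # 1 caption
best_subset, best_score = max(powerset_scores(regions, text).items(),
                              key=lambda kv: kv[1])
```

A real system would prune or structure this enumeration rather than score all 2^n - 1 subsets; the sketch only illustrates what "aligning powersets of sub-regions with text" means.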
[ #NeurIPS2025 Spotlight ] We're very excited to share our "Domain Unlearning," a collaboration between Irie Lab, TUS, and AIST: selectively removing domain-specific knowledge from trained models.
- Project: kodaikawamura.github.io/Domain_Unlea...
- Paper: arxiv.org/abs/2510.08132
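As a hedged illustration of what "selectively removing domain-specific knowledge" can mean in practice, here is one common unlearning recipe, not necessarily the paper's objective: gradient ascent on the forget-domain loss balanced against retention on the remaining domains. All names, shapes, and the loss form are assumptions.

```python
import torch

def unlearning_step(model, forget_batch, retain_batch, opt, lam=1.0):
    """One sketch of a domain-unlearning update: ascend the loss on the
    forget domain while preserving accuracy on the retained domains.
    (Illustrative recipe only -- not necessarily the paper's objective.)
    """
    xf, yf = forget_batch
    xr, yr = retain_batch
    loss_forget = torch.nn.functional.cross_entropy(model(xf), yf)
    loss_retain = torch.nn.functional.cross_entropy(model(xr), yr)
    loss = -loss_forget + lam * loss_retain  # forget <-> retain trade-off
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss_forget.item(), loss_retain.item()

# Toy usage with a linear classifier (hypothetical shapes).
model = torch.nn.Linear(16, 4)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
forget = (torch.randn(8, 16), torch.randint(0, 4, (8,)))
retain = (torch.randn(8, 16), torch.randint(0, 4, (8,)))
unlearning_step(model, forget, retain, opt)
```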
We’ve released the ICCV 2025 Report!
hirokatsukataoka.net/temp/presen/...
Compiled during ICCV in collaboration with LIMIT.Lab, cvpaper.challenge, and Visual Geometry Group (VGG), this report offers meta insights into the trends and tendencies observed at this year’s conference.
#ICCV2025
[Workshop Paper; 5/5; 20 Oct 15:40 - 16:30] Masatoshi Tateno, Gido Kato, Kensho Hara, Hirokatsu Kataoka, Yoichi Sato, Takuma Yagi, HanDyVQA: A Video QA Benchmark for Fine-Grained Hand-Object Interaction Dynamics, ICCV 2025 HANDS Workshop. hands-workshop.org/workshop2025...
[Workshop Paper; 4/5; 20 Oct 15:10 - 16:00] Jumpei Nakao, Yuto Shibata, Rintaro Yanagi, Masaru Isonuma, Hirokatsu Kataoka, Junichiro Mori, Ichiro Sakata, Synthetic Text-to-Image Pre-training through Fractals with Pseudo-Captions, Trustworthy FMs Workshop. t2fm-ws.github.io/T2FM-ICCV25/...
[Workshop Paper; 3/5; 20 Oct 10:45 - 12:15] Non-archival paper, ICCV 2025 Workshop on MMRAgI agent-intelligence.github.io/agent-intell...
[Workshop Paper; 2/5; 19 Oct 16:40 - 18:00] Shinichi Mae, Ryousuke Yamada, Hirokatsu Kataoka, Industrial Synthetic Segment Pre-training, ICCV 2025 LIMIT Workshop (Invited Poster). arxiv.org/abs/2505.13099
[Workshop Paper; 1/5; 19 Oct 11:25 - 12:15] Misora Sugiyama, Hirokatsu Kataoka, Simple Visual Artifact Detection in Sora-Generated Videos, ICCV 2025 Workshop on Human-Interactive Generation and Editing. arxiv.org/abs/2504.21334 / higen-2025.github.io
[Main Conference Paper; 2/2; 22 Oct 10:45 - 12:45; Poster #451] Risa Shinoda, Nakamasa Inoue, Iro Laina, Christian Rupprecht, Hirokatsu Kataoka, AnimalClue: Recognizing Animals by their Traces, ICCV 2025 (Highlight). dahlian00.github.io/AnimalCluePa...
[Main Conference Paper; 1/2; 21 Oct 15:00 - 17:00; Poster #246] Risa Shinoda, Nakamasa Inoue, Hirokatsu Kataoka, Masaki Onishi, Yoshitaka Ushiku, AgroBench: Vision-Language Model Benchmark in Agriculture, ICCV 2025. dahlian00.github.io/AgroBenchPage/
[Organizing Workshop; 2/2; 19 Oct 13:00 - 18:00] Representation Learning with Very Limited Resources: When Data, Modalities, Labels, and Computing Resources are Scarce (LIMIT Workshop) iccv2025-limit-workshop.limitlab.xyz
[Organizing Workshop; 1/2; 19 Oct AM 9:00 - 12:30] Foundation Data for Industrial Tech Transfer (FOUND Workshop) iccv2025-found-workshop.limitlab.xyz
I’m planning to attend ICCV 2025 in person!
Here are my accepted papers and roles at this year’s #ICCV2025 / @iccv.bsky.social.
Please check out the threads below:
We organized the "Cambridge Computer Vision Workshop" at the University of Cambridge together with Elliott Wu, Yoshihiro Fukuhara, and LIMIT.Lab! It was a fantastic workshop featuring presentations, networking, and discussions.
cambridgecv-workshop-2025sep.limitlab.xyz
Finally, the accepted papers at the #ICCV2025 / @iccv.bsky.social LIMIT Workshop have been publicly released!
--
- OpenReview: openreview.net/group?id=the...
- Website: iccv2025-limit-workshop.limitlab.xyz
At ICCV 2025, I am organizing two workshops: the LIMIT Workshop and the FOUND Workshop.
◆ LIMIT Workshop (19 Oct, PM): iccv2025-limit-workshop.limitlab.xyz
◆ FOUND Workshop (19 Oct, AM): iccv2025-found-workshop.limitlab.xyz
We warmly invite you to attend these workshops at ICCV 2025 in Hawaii!
I’m thrilled to announce my invited talk at the BMVC 2025 Workshop on Smart Cameras for Smarter Autonomous Vehicles and Robots!
supercamerai.github.io
Our AnimalClue has been accepted to #ICCV2025 as a highlight🎉🎉🎉 We also released an official press release from AIST!! This is a collaboration between AIST and Oxford VGG.
Project page: dahlian00.github.io/AnimalCluePa...
Dataset: huggingface.co/risashinoda
Press: www.aist.go.jp/aist_j/press...
Our AgroBench has been accepted to #ICCV2025 🎉🎉🎉 We released project page, paper, code, and dataset!!
Project page: dahlian00.github.io/AgroBenchPage/
Paper: arxiv.org/abs/2507.20519
Dataset: huggingface.co/datasets/ris...
Code: github.com/dahlian00/Ag...
We’ve released the CVPR 2025 Report!
hirokatsukataoka.net/temp/presen/...
Compiled during CVPR in collaboration with LIMIT.Lab, cvpaper.challenge, and Visual Geometry Group (VGG), this report offers meta insights into the trends and tendencies observed at this year’s conference.
#CVPR2025
LIMIT.Lab brings together computer vision researchers from Japan, the UK, Germany, and the Netherlands! Below are our current partner institutions:
🇯🇵 AIST, Science Tokyo, TUS
🇬🇧 Oxford VGG, Cambridge
🇩🇪 UTN FunAI Lab
🇳🇱 UvA
(Fields & partner institutions are continually expanding.)
For the research community, we’ve named it “LIMIT.Community.” If you’re interested, please feel free to contact us. Students are also welcome.