
Adriano D'Alessandro

@adrian-dalessandro

| Computer vision researcher | Computer science PhD candidate @ SFU | More: https://dalessandro.dev/ I like to count things and periodically I work on applications in plant agriculture + ecology. Follow for stale political hot takes. Free Palestine πŸ‡΅πŸ‡Έ

102
Followers
130
Following
308
Posts
19.11.2024
Joined

Latest posts by Adriano D'Alessandro @adrian-dalessandro

I would argue that the images in crowd counting datasets are high entropy! You have a variety of scales, ethnicities, ages, roles, actions, etc. (an Asian woman soldier is very semantically different from a South American boy playing soccer)

It's the labels that are low entropy (i.e. just person).

07.03.2026 19:34 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

question of doing good or bad with AI isn't one we ever get to answer. No more than Oppenheimer got to decide how the bomb was used. Our society is built around capital accumulation and the ones with all the capital will press the capital accumulation button until they are the only ones left

07.03.2026 06:50 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

There's this old moral question: "If you could press a button that would give you a bunch of money, but it would cause someone you don’t know in a distant part of the world to die, would you do it?".

The problem with AI is that we are not the ones who get to decide if the button is pressed. The

07.03.2026 06:50 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Post image

🚨 New paper out!
"VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes"
πŸ‘‰ arxiv.org/abs/2509.25339
We test 37 VLMs on 2,700+ VQA questions about dense scenes.
Findings: even top models fumble badlyβ€”<20% on the hardest split and key failure modes in counting, OCR & consistency.

01.10.2025 13:17 πŸ‘ 8 πŸ” 3 πŸ’¬ 1 πŸ“Œ 2

I've seen a few papers now evaluating counting performance in VLMs using novel datasets with a counting split. I'm curious why nobody uses crowd counting and few-shot datasets like JHU++, SHA, FSC147, REC8K, etc.

05.03.2026 15:54 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
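A minimal sketch of how the crowd-counting and few-shot benchmarks mentioned above (FSC147, JHU-CROWD++, etc.) are typically scored: per-image MAE and RMSE over predicted counts. The counts below are made-up illustrative values, not results from any model.

```python
import math

def counting_metrics(predicted, ground_truth):
    """Return (MAE, RMSE) over paired per-image counts."""
    assert len(predicted) == len(ground_truth) and predicted
    errors = [p - g for p, g in zip(predicted, ground_truth)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse

preds = [12, 48, 103, 7]  # hypothetical VLM-predicted counts
gts   = [10, 50, 100, 7]  # hypothetical ground-truth counts
mae, rmse = counting_metrics(preds, gts)
```

Running a VLM on these datasets and reporting the same MAE/RMSE would make its counting numbers directly comparable to the crowd-counting literature.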

Sean "Spaghetti" Orr

03.03.2026 23:36 πŸ‘ 8 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Post image

How does AI interact with culture?

We’re thrilled to have Dr. Maria Antoniak (@mariaa.bsky.social) join us at FGVC! Her interdisciplinary work on AI in the humanities brings a new perspective to our workshop: the fine-grained categorization of the intangible parts of culture. See you at #CVPR2026!

03.03.2026 19:31 πŸ‘ 22 πŸ” 7 πŸ’¬ 0 πŸ“Œ 2

Too much competition, which drives down innovation. Everyone takes small incremental steps rather than chasing neat but risky ideas.

03.03.2026 08:58 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
Post image

🚨 Updated Deadline Alert!

We have extended the deadline for the Proceedings from Feb 27th to Mar 3rd.

#CVPR #CV #AI

27.02.2026 19:04 πŸ‘ 3 πŸ” 2 πŸ’¬ 0 πŸ“Œ 0

πŸ˜’ real happy for ya

27.02.2026 10:04 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
Post image

diversity with FLUX is using image-to-image translation as an augmentation strategy. If you want diverse images of "cows", just generate images for random scenes, and use them as a structural prior. Here's an image from a paper I'm working on where we're generating structural hard negatives:

25.02.2026 19:14 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

especially in fine-grained settings. So there is a trade-off between the diversity of categories you can represent and the diversity for a single category.

2. They don't seem to investigate whether something as trivial as augmentation can solve these problems.

3. One hack I've used to get more

25.02.2026 19:14 πŸ‘ 2 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

A few things pop into mind when reading the results.

1. While recent models have lower distributional diversity in their output for a single category, they do also correctly generate significantly more categories. The FLUX models can correctly depict substantially more categories than SD1.5,

25.02.2026 19:14 πŸ‘ 2 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0

I started noticing that more recent models did not exhibit this property, despite the outputs being more aesthetic. I figured there was a trade-off that was being made during training to prioritize object fidelity and identity over other properties in the prompt.

25.02.2026 02:17 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

I had a paper at ECCV that investigated whether you could get object counting data out of text-to-image models by prompting them with "An image of {N} oranges" or similar. The output for early LDMs would always be a bit wrong, but if you averaged the count in enough of them, you would get N. But,

25.02.2026 02:17 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
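The averaging idea in the post above can be sketched as follows: if each generated image's count is a noisy but unbiased estimate of N, the mean over many samples recovers N. The zero-mean integer noise model here is an assumption for illustration, not the paper's actual error distribution.

```python
import random

def simulate_counts(n_target, n_images, seed=0):
    """Simulate per-image counts from 'an image of {N} oranges'.
    Each image's count is off by a small zero-mean integer error."""
    rng = random.Random(seed)
    return [n_target + rng.randint(-2, 2) for _ in range(n_images)]

counts = simulate_counts(n_target=7, n_images=10_000)
mean_count = sum(counts) / len(counts)
# mean_count lands close to 7 once the sample is large enough
```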
Post image

🚨 Reminder: The Proceedings Track deadline is February 27.

Don't miss out on this chance to share your research 😁 We're very excited to see what you've been working on!

#CVPR2026 #AI #ML

24.02.2026 08:41 πŸ‘ 4 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0

someone had maybe passed my paper around to them.

23.02.2026 09:22 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

It's for sure collusion, but I don't know who they are. This wasn't during CVPR, though. In the most recent example, they were indirect, but made it clear they had read my recently submitted paper (they mentioned things that were only in the submitted version and not on Arxiv). But it sounded like

23.02.2026 09:22 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Post image

We are excited to announce that we are co-hosting the AnimalCLEF25 challenge with LifeCLEF at #CVPR2026! 🐾

Individual animal re-identification is a major challenge in conservation. Help us build evidence-based conservation tools that work!

Find out more: www.kaggle.com/competitions...

23.02.2026 01:48 πŸ‘ 1 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0

reviewing my paper or know who is, and they are probing to see what papers I'm reviewing.

I'm rambling now, but all of this is to say, I've been working on a different paper for ECCV which is hopefully a bit more bulletproof. It sucks to abandon a paper, but alas.

22.02.2026 17:47 πŸ‘ 1 πŸ” 0 πŸ’¬ 2 πŸ“Œ 0

pool: "if I'm charitable to this paper, but a person reviewing my paper isn't charitable to me, I risk losing a potential spot, thus I must tank this paper to increase my odds". And I know people are trying to game the system because I've been getting anonymous e-mails from people who are either

22.02.2026 17:47 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

and also in the papers I'm reviewing, I'm seeing a lot of reviewers heat-seek any weakness and kill off interesting and high-quality papers for not being absolutely free of trade-offs. Part of that is also due to mandatory reviewing for authors, which creates a prisoner's dilemma in the reviewing

22.02.2026 17:47 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

that is simply being refused by the relatively small pool of reviewers in my niche topic. I think a lot of the computer vision researchers' lunch has been eaten by frontier labs, and a lot of areas have saturated. So the expectations from reviewers have jumped drastically. Within my own experience,

22.02.2026 17:47 πŸ‘ 1 πŸ” 0 πŸ’¬ 2 πŸ“Œ 0

I think I'm more-or-less done with this specific paper. It's been rejected multiple times. Each time I've drastically improved it, retooled it, changed the methodology up. Each time the scores get even worse. I've lost a year of my life to it. There's something fundamental in the trade-off I proposed

22.02.2026 17:47 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Even if you take the point at face value, a human brain consumes something like 5 MWh of energy over 30 years. Training GPT-5 consumes somewhere in the neighborhood of 100,000 MWh of energy.

Using the energy needed to train GPT-5 JUST ONE TIME could power a single brain for 600,000 years.

22.02.2026 02:43 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
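The back-of-envelope numbers in the post check out. Both inputs (5 MWh per brain over 30 years, ~100,000 MWh to train GPT-5) are the post's own rough estimates, not measured values.

```python
# Rough estimates from the post, not measured values
BRAIN_MWH_PER_30_YEARS = 5
TRAINING_MWH = 100_000

# How many 30-year brain-lifetimes one training run could power
brain_lifetimes = TRAINING_MWH / BRAIN_MWH_PER_30_YEARS  # 20,000
years = brain_lifetimes * 30                             # 600,000
```

For what it's worth, the 5 MWh figure is consistent with the usual ~20 W estimate for the brain: 20 W over 30 years is about 5.3 MWh.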

Interesting! That's one of the biggest headaches I've run into when working with fine-grained category descriptions! The model overfits to "goose" and maybe a color adjective, and ignores everything else. I'll keep an eye out for your project page!

21.02.2026 19:18 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

the feet are under the water). You need to separate "no feet visible" as a criterion from "wrong-color feet" given a long descriptive text.

21.02.2026 19:00 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Interesting! I'm curious to see how you did it! I've been working on a related problem where you need to use long text to identify the attributes of fine-grained object categories that may or may not all be present. A Greylag Goose has pink feet but its feet might not always be visible (maybe

21.02.2026 19:00 πŸ‘ 1 πŸ” 0 πŸ’¬ 2 πŸ“Œ 0

competition), the reviews started getting significantly more hostile. And with the pace of frontier labs releasing everything models, the risk of a work being redundant is ever present. It really sucks the air out of the room.

21.02.2026 16:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

scooped in those windows. Already, this paper is now stale and there were similar competing methods put on arxiv last month. So now, I have to reframe the paper entirely.

I don't think it was so bad even a year ago. As soon as the CV conferences introduced mandatory reviewing for authors (i.e. the

21.02.2026 16:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0