Excited to share new work on how the brain makes social inferences from visual input! 🧠
(With @lisik.bsky.social, @shariliu.bsky.social, @tianminshu.bsky.social, and Minjae Kim!) www.biorxiv.org/content/10.6...
excited to share some recent work!
neural networks trained on multi-view sensory data are the first to match human-level 3D shape perception
we predict human accuracy, error patterns, and reaction time--all zero-shot, no training on experimental data
arxiv.org/abs/2602.17650
1/🧠
I don't disagree with that point, but at the same time, you can think of this from another perspective: Isn't it crazy that despite the many complex nonlinear transformations implemented by seemingly different models, they nonetheless arrive at something that is similar up to a linear transform?
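To make the "similar up to a linear transform" idea concrete, here is a minimal sketch (all names and numbers are illustrative, not taken from any of the papers in this thread): two toy random ReLU networks see the same stimuli, and we ask how well a ridge-regularized linear map predicts one representation from the other on held-out stimuli.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stimuli and two "different models": independent
# random ReLU projections of the same inputs.
n, d, width = 500, 64, 256
X = rng.standard_normal((n, d))

def random_relu_net(X, seed):
    W = np.random.default_rng(seed).standard_normal((X.shape[1], width)) / np.sqrt(X.shape[1])
    return np.maximum(X @ W, 0)

A = random_relu_net(X, 1)
B = random_relu_net(X, 2)

# Fit a ridge-regularized linear map A -> B on 400 stimuli,
# then evaluate it on the held-out 100.
n_tr, lam = 400, 10.0
M = np.linalg.solve(A[:n_tr].T @ A[:n_tr] + lam * np.eye(width), A[:n_tr].T @ B[:n_tr])
pred = A[n_tr:] @ M
resid = ((B[n_tr:] - pred) ** 2).sum()
total = ((B[n_tr:] - B[n_tr:].mean(0)) ** 2).sum()
r2 = 1 - resid / total
print(f"held-out R^2 of the linear map: {r2:.2f}")
```

A held-out R^2 well above zero illustrates the point: even though the two networks apply different nonlinear transformations, much of what they compute is shared up to a linear transform.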
More to come. We are working on a paper now that characterizes these issues in more depth.
And Fig. 7 in this paper. journals.plos.org/ploscompbiol...
This number is based on what we have seen in analyses in my lab. Some examples are Fig. 5 in this paper...
www.science.org/doi/10.1126/...
Second, if the only thing that differentiates two alternative models is a simple linear reweighting, it raises the question of how important their differences really are. It may be more informative in the end to focus on understanding what the models have in common than on how they differ.
3. We have been thinking about this. The answer is not straightforward. First, RSA is effectively insensitive to anything beyond the first 5-10 PCs in brain and network representations, and I happen to think there is much more to the story than just a handful of dimensions. bsky.app/profile/mick...
1. Yes, trained networks are much better when using RSA. We show this in a supplementary analysis.
2. We have never computed this exact quantity. But we did show that if you do PCA on wide untrained networks, you can drastically reduce their dimensionality while still retaining their performance.
Although pre-trained networks can be super useful for comp neuro, the surprising success of untrained networks suggests that there may still be much to learn by focusing on simpler approaches. We shouldn't be focusing all our attention on the latest DNN models coming out of the ML world.
These architectural manipulations were things that you wouldn't typically think to try if your primary focus was on trained networks. We wrote about this in our discussion.
Importantly, one of the things we learned in that work was that the field hasn't been giving untrained networks the best chance possible. We found that fairly simple architectural manipulations could dramatically improve their performance.
That's true. But untrained networks can do surprisingly well. In a recent paper, we found that untrained networks can rival trained networks in a key monkey dataset. In the human data we examined, there was still a gap relative to pre-trained models, as you point out. www.nature.com/articles/s42...
This paper was an awesome collaborative effort of a @fitngin.bsky.social working group. It provides a detailed review of how DNNs can be used to support dev neuro research.
@lauriebayet.bsky.social and I wrote the network modeling section about how DNNs can be used to test developmental theories 🧵
Infants organise their visual world into categories at two months old! So happy to see these results published - congratulations Cliona and the rest of the FOUNDCOG team.
New paper from our lab on the behavioral significance of high-dimensional neural representations!
I have a PhD opening for my #VIDI BrainShorts project 🧠! Are you or do you know an ambitious, recent (or almost) MSc graduate with a background in NeuroAI and an interest in large-scale data collection and video perception? Check out our vacancy! (deadline Feb 15).
werkenbij.uva.nl/en/vacancies...
Wonderful article about our recent paper in @pnasnexus.org! Thanks, @sachapfeiffer.bsky.social and @mickbonner.bsky.social!
@yikai-tang.bsky.social @uoftpsychology.bsky.social @artsci.utoronto.ca @utoronto.ca
Our new paper in @sfnjournals.bsky.social shows different neural systems for integrating views into places--PPA integrates views *of* a location (e.g., views of a landmark), while RSC integrates views *from* a location (e.g., views of a panorama). Work by the bluesky-less Linfeng Tony Han.
Why isn't modern AI built around principles from cognitive science or neuroscience? Starting a substack (infinitefaculty.substack.com/p/why-isnt-m...) by writing down my thoughts on that question, as part of a first series of posts on the relation between these fields. 1/3
Spread the word: I'm looking to hire a postdoc to explore the concept of attention (as studied in psych/neuro, not the transformer mechanism) in large Vision-Language Models. More details here: lindsay-lab.github.io/2025/12/08/p...
#MLSky #neurojobs #compneuro
As for what other inductive biases will prove to be important, this is still TBD. I think that wiring costs (e.g., topography) may be one.
But neuroscientists and AI engineers have different goals! Neuroscientists should be seeking parsimonious theories, not high-performing models.
Importantly, to get this to work, NeuroAI researchers have to go back to the drawing board and search for simpler approaches. I think that currently, we are relying too much on the tools and models coming out of AI. It makes it seem like the only feasible approach is whatever currently works in AI.
The simple-local-learning goal is certainly non-trivial! But recent findings (especially universality of network representations) suggest that it has potential.
What might such a theory look like? My bet is that it will be one that combines strong architectural inductive biases with fully unsupervised learning algorithms that operate without the need for backpropagation. This is a very different direction than where AI and NeuroAI are currently headed.
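One toy instance of what "unsupervised learning without backpropagation" can mean (my illustrative choice here, not the specific proposal in the post): Oja's rule, a local Hebbian update that converges to the top principal component of its inputs using only pre- and post-synaptic activity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data with one dominant direction v: a stand-in for structure
# a visual feature could discover with no supervision and no backprop.
d = 20
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
X = rng.standard_normal((5000, d)) + 3.0 * np.outer(rng.standard_normal(5000), v)

# Oja's rule: w <- w + eta * y * (x - y * w), with y = w . x.
# The update is local (uses only x, y, w) yet converges to the
# top eigenvector of the input covariance.
w = rng.standard_normal(d)
w /= np.linalg.norm(w)
eta = 1e-3
for x in X:
    y = w @ x
    w += eta * y * (x - y * w)

alignment = abs(w @ v) / np.linalg.norm(w)
print(f"|cos(w, v)| = {alignment:.3f}")
```

The learned weight vector ends up nearly parallel to the dominant data direction, which is the kind of result that makes local, backprop-free learning rules look like a plausible ingredient for such a theory.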
Although the deep learning revolution in vision science started with task-based optimization, there are intriguing signs that a far more parsimonious computational theory of the visual hierarchy is attainable.
These universal representations are not restricted to early network layers. We see them across the full depth of the networks that we examined. Their strong universality and independence of task demands call out for a parsimonious explanation that has yet to be discovered.
A second paper from my lab adds another element to this story: after training, many diverse DNNs converge to universal features that are independent of the tasks they were trained on. It is these universal features that are most strongly shared with visual cortex. www.science.org/doi/10.1126/...
What does this mean? It suggests that architectural inductive biases alone can get us surprisingly far in explaining the image representations of the ventral stream. See a great commentary by @binxuwang.bsky.social and Carlos Ponce. www.nature.com/articles/s42...