
Matteo Saponati

@matteosaponati

I am a research scientist in Machine Learning and Neuroscience. I am fascinated by life and intelligence, and I like to study complex systems. I love to play music and dance. Postdoctoral Research Scientist @ ETH Zürich ↳ https://matteosaponati.github.io

219 Followers · 86 Following · 15 Posts · Joined 02.12.2024

Latest posts by Matteo Saponati @matteosaponati

really great work! nice to see some feedback control :)

29.05.2025 10:52 👍 3 🔁 0 💬 0 📌 0

Uh! Very interesting, great work! Nice to see that feedback control approaches are getting more popular :)

29.05.2025 10:50 👍 1 🔁 0 💬 0 📌 0

@melikapayvand.bsky.social

16.04.2025 10:31 👍 0 🔁 0 💬 0 📌 0
Preview
Neuromorphic Questionnaire This form collects valuable information from the Neuromorphic Community as part of a project led by Matteo Saponati, Laura Kriener, Sebastian Billaudelle, Filippo Moro, and Melika Payvand. The goal is...

Take our short 5-min anonymous survey on the Neuromorphic field’s current state & future:

📋 tinyurl.com/3jkszrnr
🗓️ Open until May 12, 2025

Results will be shared openly and submitted for publication. Your input will help us understand how interdisciplinary trends are shaping the field.

16.04.2025 10:27 👍 8 🔁 10 💬 1 📌 0
Post image

How does our brain predict the future? Our review of predictive processing + research program is now on arXiv arxiv.org/abs/2504.09614
50+ neuroscientists from across the world worked together to create this unique community project.

15.04.2025 06:58 👍 85 🔁 31 💬 2 📌 11
Preview
A neuromorphic multi-scale approach for real-time heart rate and state detection - npj Unconventional Computing

🌟 Paper out in npj Unconventional Computing!
www.nature.com/articles/s44...

A system built with just a few neurons, yet able to solve a complex task — not by stacking layers or going deeper, but by embracing unconventional thinking.

This is neuromorphic to me!

02.04.2025 15:32 👍 14 🔁 5 💬 1 📌 0

I'm extremely proud of this work. It shows how the physics of analog electronic circuits helps us understand the learning and computational principles of cortical neural networks, and how it lets us build efficient neural processing systems that can complement and outperform AI accelerators in edge computing!

02.04.2025 20:38 👍 9 🔁 4 💬 1 📌 0
Preview
Sequence anticipation and spike-timing-dependent plasticity emerge from a predictive learning rule - Nature Communications Prediction of future inputs is a key computational task for the brain. Here, the authors proposed a predictive learning rule in neurons that leads to anticipation and recall of inputs, and that reprod...

fantastic post, and tasty food for thought.

shamelessly adding here that many different types of STDP come about from minimizing a prediction-of-the-future loss function with spikes :)

hopefully another case of successful predictions.

www.nature.com/articles/s41...

23.03.2025 09:49 👍 11 🔁 4 💬 0 📌 0
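A minimal toy of what "minimizing a prediction-of-the-future loss" can look like (an illustrative sketch, not the rule from the paper; the mean-readout target and all constants are assumptions):

```python
import numpy as np

# Toy predictive learning rule (illustrative only, not the paper's rule):
# a neuron adjusts its weights by gradient descent on the squared error
# between its current output and (a crude readout of) the next input.

rng = np.random.default_rng(0)
T, n, eta = 1000, 10, 1e-3
w = rng.normal(scale=0.1, size=n)   # synaptic weights
x = rng.normal(size=(T, n))         # presynaptic activity over time

for t in range(T - 1):
    pred = w @ x[t]                 # prediction from the current input
    error = x[t + 1].mean() - pred  # error w.r.t. the future input (assumed readout)
    w += eta * error * x[t]         # gradient step on 0.5 * error ** 2
```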

#preprint #machinelearning #transformers #selfattention #ml #deeplearning

18.02.2025 12:26 👍 1 🔁 0 💬 0 📌 0
Preview
ALT: a cartoon of two robots standing next to each other and the word "bye".

7/ I would like to thank Pascal Sager for all the training, the writing, the discussion, and whatnot, Pau Vilimelis Aceituno for the hours spent on refining the math, Thilo Stadelmann and Benjamin Grewe for their great contribution and supervision, and all the people at INI.

cheers 💜

18.02.2025 12:22 👍 2 🔁 0 💬 1 📌 0

6/ TL;DR

- Self-attention matrices in Transformers show universal structural differences depending on the training objective.
- Bidirectional models → Symmetric self-attention
- Autoregressive models → Directional, column-dominant self-attention
- Using symmetry as an inductive bias improves training.

⬇️

18.02.2025 12:22 👍 1 🔁 0 💬 1 📌 0
Post image

5/ Finally, we leveraged symmetry to improve Transformer training.

- Initializing self-attention matrices symmetrically improves training efficiency for bidirectional models, leading to faster convergence.

This suggests that imposing structure at initialization can enhance training dynamics (minimal sketch below).

⬇️

18.02.2025 12:22 👍 1 🔁 0 💬 1 📌 0
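A minimal sketch of the symmetric-initialization idea (my construction on a generic single-head layer, not the paper's code; `d_model` and the exact weight tying are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SymmetricInitAttention(nn.Module):
    """Single-head self-attention with W_K tied to W_Q at init, so the
    query-key bilinear form starts out symmetric (sketch, not the paper's code)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.W_q = nn.Linear(d_model, d_model, bias=False)
        self.W_k = nn.Linear(d_model, d_model, bias=False)
        self.W_v = nn.Linear(d_model, d_model, bias=False)
        with torch.no_grad():
            # symmetric start: W_K = W_Q makes W_Q^T W_K symmetric at step 0
            self.W_k.weight.copy_(self.W_q.weight)

    def forward(self, x):  # x: (batch, seq, d_model)
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ v
```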
Post image

4/ We validate our analysis empirically, showing that these patterns consistently emerge across different language models and input modalities (a hedged probe is sketched below):

- ModernBERT, GPT, LLaMA3, Mistral, etc
- Text, vision, and audio models
- Different model sizes and architectures

⬇️

18.02.2025 12:22 👍 2 🔁 0 💬 1 📌 0
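One hedged way to eyeball this on public checkpoints (my probe, not the paper's pipeline; per-head slicing is omitted for brevity):

```python
import torch
from transformers import AutoModel

def sym_fraction(M: torch.Tensor) -> float:
    # share of ||M||_F^2 carried by the symmetric part; 1.0 = symmetric, ~0.5 = random
    return ((0.5 * (M + M.T)).norm() ** 2 / M.norm() ** 2).item()

with torch.no_grad():
    # bidirectional model: BERT (nn.Linear stores (out, in), so Q K^T = X (W_q^T W_k) X^T)
    bert = AutoModel.from_pretrained("bert-base-uncased")
    attn = bert.encoder.layer[0].attention.self
    M_bert = attn.query.weight.T @ attn.key.weight

    # autoregressive model: GPT-2 (Conv1D stores (in, 3*out), so Q = X W_q)
    gpt2 = AutoModel.from_pretrained("gpt2")
    W = gpt2.h[0].attn.c_attn.weight
    W_q, W_k, _ = W.split(W.shape[0], dim=1)
    M_gpt2 = W_q @ W_k.T

print(f"BERT layer 0 symmetry:  {sym_fraction(M_bert):.3f}")
print(f"GPT-2 layer 0 symmetry: {sym_fraction(M_gpt2):.3f}")
```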

3/ We demonstrate that self-attention matrices behave differently under different training objectives (two toy diagnostics are sketched below):

- Bidirectional training (BERT-style) induces symmetric self-attention structures.
- Autoregressive training (GPT-style) induces directional structures with column dominance.

⬇️

18.02.2025 12:22 👍 1 🔁 0 💬 1 📌 0
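To make "symmetric" vs "directional, column-dominant" concrete, here are two toy diagnostics (my definitions for illustration; the paper's actual metrics may differ):

```python
import torch

def symmetry_score(M: torch.Tensor) -> float:
    """Fraction of ||M||_F^2 carried by the symmetric part (M + M^T) / 2:
    1.0 for a symmetric matrix, about 0.5 for an unstructured random one."""
    return ((0.5 * (M + M.T)).norm() ** 2 / M.norm() ** 2).item()

def column_dominance(M: torch.Tensor) -> float:
    """Ratio of column-mass variance to row-mass variance of |M|;
    values above 1 indicate energy concentrated in a few columns."""
    col, row = M.abs().sum(dim=0), M.abs().sum(dim=1)
    return (col.var() / row.var()).item()

M = torch.randn(64, 64)
print(symmetry_score(M))                # ~0.5 for a random matrix
print(symmetry_score(0.5 * (M + M.T)))  # 1.0 once symmetrized
```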

2/ Self-attention is the backbone of Transformer models, but how does training shape the internal structure of self-attention matrices?

We introduce a mathematical framework to study these matrices and uncover fundamental differences in how they are updated during gradient descent.

⬇️

18.02.2025 12:22 👍 1 🔁 0 💬 1 📌 0
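For context, the object under study is the query-key bilinear form of a self-attention head; in standard notation (mine, not necessarily the paper's):

```latex
% single-head self-attention on token embeddings X, key dimension d
A = \mathrm{softmax}\!\left(\frac{X W_Q (X W_K)^\top}{\sqrt{d}}\right)
  = \mathrm{softmax}\!\left(\frac{X M X^\top}{\sqrt{d}}\right),
\qquad M := W_Q W_K^\top .
```

Bidirectional and autoregressive objectives then differ in how gradient descent updates M.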
Preview
The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training Self-attention is essential to Transformer architectures, yet how information is embedded in the self-attention matrices and how different objective functions impact this process remains unclear. We p...

1/ I am very excited to announce that our paper "The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training" is available on arXiv 💜

arxiv.org/abs/2502.10927

How is information encoded in self-attention matrices? How can we interpret it?

⬇️

18.02.2025 12:22 👍 8 🔁 3 💬 1 📌 0
It’s been a while since I last published something on my personal blog. Life has moved along in beautiful, unexpected ways, and I’ve experienced many absolutely lovely moments. On the professional side, I recently had the opportunity to organize a workshop at the Bernstein Conference in Frankfurt am Main (my lovely and chaotic Frankfurt <3). This workshop was brought to life thanks to the other incredible organizers, Laura Kriener and Melika Payvand, with their creativity, initiative, and vision. I feel grateful to be a part of this vibrant community. So, why not use this opportunity to write here again?

hey Bluesky world, I realized I didn't post anything here yet

I'll start by sharing a recent blog post about the workshop we organised at the Bernstein Conference 2024. I hope you enjoy it 💃

Thank you @melikapayvand.bsky.social , Laura, and Ana for the feedback and suggestions 💜

23.01.2025 08:32 👍 15 🔁 2 💬 0 📌 1

Hey Dan! I would like to be added :)

07.01.2025 20:46 👍 1 🔁 0 💬 1 📌 0