@matteosaponati
I am a research scientist in Machine Learning and Neuroscience. I am fascinated by life and intelligence, and I like to study complex systems. I love to play music and dance. Postdoctoral Research Scientist @ ETH Zürich ↳ https://matteosaponati.github.io
really great work! nice to see some feedback control :)
oh! Very interesting, great work! nice to see that feedback control approaches are gaining traction :)
@melikapayvand.bsky.social
Take our short 5-min anonymous survey on the Neuromorphic field’s current state & future:
📋 tinyurl.com/3jkszrnr
🗓️ Open until May 12, 2025
Results will be shared openly and submitted for publication. Your input will help us understand how interdisciplinary trends are shaping the field.
How does our brain predict the future? Our review of predictive processing + research program is now on arXiv arxiv.org/abs/2504.09614
50+ neuroscientists distributed across the world worked together to create this unique community project.
🌟 Paper out in npj Unconventional Computing!
www.nature.com/articles/s44...
A system built with just a few neurons, yet able to solve a complex task — not by stacking layers or going deeper, but by embracing unconventional thinking.
This is neuromorphic to me!
I'm extremely proud of this work, which shows how using the physics of analog electronic circuits helps us understand learning and computational principles of cortical neural networks and build efficient neural processing systems that can complement and outperform AI accelerators in edge computing!
fantastic post, and tasty food for thought.
shamelessly adding here that many different types of STDP emerge from minimizing a predictive loss on future spikes :)
hopefully another case of successful predictions.
www.nature.com/articles/s41...
#preprint #machinelearning #transformers #selfattention #ml #deeplearning
7/ I would like to thank Pascal Sager for all the training, the writing, the discussion, and whatnot, Pau Vilimelis Aceituno for the hours spent on refining the math, Thilo Stadelmann and Benjamin Grewe for their great contribution and supervision, and all the people at INI.
cheers 💜
6/ TL;DR
- Self-attention matrices in Transformers show universal structural differences based on training.
- Bidirectional models → Symmetric self-attention
- Autoregressive models → Directional, column-dominant
- Using symmetry as an inductive bias improves training.
⬇️
5/ Finally, we leveraged symmetry to improve Transformer training.
- Initializing self-attention matrices symmetrically improves training efficiency for bidirectional models, leading to faster convergence.
This suggests that imposing structure at initialization can enhance training dynamics.
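A minimal sketch of what a symmetric initialization could look like (the paper's actual scheme may differ, and `symmetric_qk_init` is a hypothetical helper name): tying the key projection to the query projection makes the effective query-key product symmetric by construction.

```python
import numpy as np

def symmetric_qk_init(d_model, seed=0):
    # Hypothetical helper: tie W_K to W_Q so that
    # W_Q @ W_K.T == W_Q @ W_Q.T, which is symmetric by construction.
    rng = np.random.default_rng(seed)
    W_Q = rng.normal(0.0, d_model ** -0.5, size=(d_model, d_model))
    W_K = W_Q.copy()
    return W_Q, W_K
```

Any scheme where W_K is a (scaled) copy of W_Q yields the same symmetry property at initialization; training can then break it as needed.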
⬇️
4/ We validate our analysis empirically, showing that these patterns consistently emerge across different language models and input modalities:
- ModernBERT, GPT, LLaMA3, Mistral, etc
- Text, vision, and audio models
- Different model sizes, and architectures
⬇️
3/ We demonstrate that the self-attention matrices behave differently under different training objectives:
- Bidirectional training (BERT-style) induces symmetric self-attention structures.
- Autoregressive training (GPT-style) induces directional structures with column dominance.
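One simple way to quantify this kind of structure (not necessarily the paper's exact metric; `symmetry_score` is a hypothetical helper) is to split the effective query-key matrix M = W_Q W_K^T into its symmetric and antisymmetric parts and compare their norms:

```python
import numpy as np

def symmetry_score(M):
    """Score in [-1, 1]: +1 for a fully symmetric matrix,
    -1 for a fully antisymmetric one."""
    S = 0.5 * (M + M.T)   # symmetric part
    A = 0.5 * (M - M.T)   # antisymmetric part
    s2, a2 = np.sum(S ** 2), np.sum(A ** 2)
    return (s2 - a2) / (s2 + a2)
```

A score near +1 would indicate the symmetric structure associated with bidirectional training, while lower scores point to the directional structure of autoregressive models.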
⬇️
2/ Self-attention is the backbone of Transformer models, but how does training shape the internal structure of self-attention matrices?
We introduce a mathematical framework to study these matrices and uncover fundamental differences in how they are updated during gradient descent.
⬇️
1/ I am very excited to announce that our paper "The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training" is available on arXiv 💜
arxiv.org/abs/2502.10927
How is information encoded in self-attention matrices? How can we interpret it?
⬇️
hey Bluesky world, I realized I didn't post anything here yet
I'll start with sharing a recent blog post on the workshop we organised at the last Bernstein Conference 2024. I hope you enjoy it 💃
Thank you @melikapayvand.bsky.social , Laura, and Ana for the feedback and suggestions 💜
Hey Dan! I would like to be added :)