Figure showing translation symmetry in co-occurrence statistics & PCA of model representations match across theory, word2vec, and LLMs:
and thereby mediate correlations and constrain the geometry of representations. The robustness of this representational geometry should therefore be understood as a collective effect (!).
We had observed a similar robustness in our earlier work (arxiv.org/abs/2505.18651). In our new paper, we explain this geometric recovery by extending our prior theory to one with a continuous latent variable. That is, many words in a vocabulary have a notion of e.g. 'time' or 'space' ...
This means important geometric information is hidden in the co-occurrences between these words and other words in the vocabulary that have some semantic overlap with them - for example, words with a notion of seasonality.
Surprisingly, the geometric information for a collection of words - for example, the 12 calendar months of the year - does not arise solely from co-occurrences within that group. One can ablate their contribution entirely and find that representations of the 12 months can still be recovered.
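A minimal sketch of this kind of ablation on synthetic data (the latent 'time of year' variable, the co-occurrence kernel, and the sizes are stand-ins for illustration, not the paper's pipeline): the month-month block is never used, yet PCA of the months' co-occurrence profiles with the rest of the vocabulary still lays them out on a circle in consecutive order.

```python
# Sketch: recover the geometry of the 12 months without any month-month
# co-occurrences, using only their co-occurrences with other words.
import numpy as np

n_months, n_other = 12, 200
rng = np.random.default_rng(0)

# Hypothetical latent "time of year" for the months (points on a circle) and
# for other, e.g. seasonal, words; co-occurrence falls off with latent distance.
theta_months = 2 * np.pi * np.arange(n_months) / n_months
theta_other = rng.uniform(0, 2 * np.pi, n_other)

def cooc(a, b, width=0.8):
    d = np.angle(np.exp(1j * (a[:, None] - b[None, :])))   # wrapped distance
    return np.exp(-(d / width) ** 2)

# Only the month-vs-other block is kept; the month-month block is ablated.
M = cooc(theta_months, theta_other)

# PCA of the months' co-occurrence profiles with the rest of the vocabulary.
X = M - M.mean(axis=0, keepdims=True)
U, S, _ = np.linalg.svd(X, full_matrices=False)
coords = U[:, :2] * S[:2]

# The months should come out in consecutive cyclic order
# (up to rotation and reflection of the circle).
print(np.argsort(np.arctan2(coords[:, 1], coords[:, 0])))
```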
Neural representations can be used for decoding via linear probes (such as predicting spatial or temporal coordinates), and our theory, based on constraints from symmetry, predicts the efficiency of this decoding process, matching empirics.
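As a concrete (toy) illustration of such a probe, here is a sketch on synthetic representations; the circular 'time of year' latent, the random embedding, and the noise level are assumptions for illustration, not the paper's data.

```python
# Sketch: fit a linear probe that decodes a cyclic temporal coordinate from
# representations, and report held-out R^2.
import numpy as np

rng = np.random.default_rng(1)
n, d = 365, 64                                   # one representation per day
theta = 2 * np.pi * np.arange(n) / n

# Hypothetical representations: a circle embedded in R^d plus noise
# (a stand-in for model hidden states).
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
reps = circle @ rng.normal(size=(2, d)) + 0.1 * rng.normal(size=(n, d))
Y = circle                                       # (cos, sin) avoids wrap-around

# Train/test split, then ordinary least squares with an intercept.
idx = rng.permutation(n)
tr, te = idx[: n // 2], idx[n // 2:]
Xtr = np.hstack([reps[tr], np.ones((len(tr), 1))])
Xte = np.hstack([reps[te], np.ones((len(te), 1))])
W, *_ = np.linalg.lstsq(Xtr, Y[tr], rcond=None)
pred = Xte @ W

r2 = 1 - ((Y[te] - pred) ** 2).sum() / ((Y[te] - Y[te].mean(axis=0)) ** 2).sum()
print(f"held-out R^2 of the linear probe: {r2:.3f}")
```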
That our theory carries over to observations in LLMs (where we lack a direct theoretical handle) demonstrates how symmetry in simple low-order statistics can have robust effects on representations.
Word embeddings there have Fourier PCA modes, and the geometry we obtain here is predictive of that found in LLM hidden layers, explaining & unifying prior observations with a single idea.
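The linear-algebra fact underneath is easy to check numerically: if co-occurrence depends only on cyclic distance, the matrix is circulant, and circulant matrices are diagonalized by Fourier modes. A small sketch with a toy kernel (illustrative only, not the paper's numbers):

```python
# Sketch: PCA-type eigenvectors of a translation-invariant (circulant)
# co-occurrence matrix are Fourier modes.
import numpy as np

n = 12                                           # e.g. the 12 months
i = np.arange(n)
d = np.abs(i[:, None] - i[None, :])
dist = np.minimum(d, n - d)                      # cyclic distance
C = np.exp(-0.3 * dist)                          # toy distance-only co-occurrence

evals, evecs = np.linalg.eigh(C)
top2 = evecs[:, np.argsort(evals)[::-1][1:3]]    # top modes after the constant one

# They span exactly the lowest-frequency Fourier pair cos(theta), sin(theta).
theta = 2 * np.pi * i / n
fourier = np.stack([np.cos(theta), np.sin(theta)], axis=1)
proj = np.linalg.lstsq(fourier, top2, rcond=None)[0]
print(np.allclose(fourier @ proj, top2, atol=1e-8))
```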
Translation symmetry in co-occurrence statistics & PCA of model representations match across theory, word2vec, and LLMs:
to a translation symmetry that can be seen empirically in the co-occurrence statistics of natural language (!). That is, the co-occurrence of words in such a collection (which semantically correspond to a collection of points on a lattice) depends only on the distance between them.
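A rough sketch of how one might eyeball this on text (the window size, tokenization, and the one-line toy corpus are placeholders; real lower-cased text would go in `tokens`):

```python
# Sketch: co-occurrence counts between the 12 month names, grouped by cyclic
# month distance, to look for the distance-only (translation-symmetric) pattern.
from collections import Counter
import itertools

MONTHS = ["january", "february", "march", "april", "may", "june",
          "july", "august", "september", "october", "november", "december"]
IDX = {m: i for i, m in enumerate(MONTHS)}

def month_cooccurrence(tokens, window=10):
    """Count pairs of month tokens appearing within `window` positions."""
    counts = Counter()
    positions = [(t, j) for j, t in enumerate(tokens) if t in IDX]
    for (a, i), (b, j) in itertools.combinations(positions, 2):
        if j - i <= window:
            counts[(IDX[a], IDX[b])] += 1
            counts[(IDX[b], IDX[a])] += 1
    return counts

def by_cyclic_distance(counts):
    """Average co-occurrence as a function of cyclic month distance 0..6."""
    totals, norms = Counter(), Counter()
    for (i, j), c in counts.items():
        d = min(abs(i - j), 12 - abs(i - j))
        totals[d] += c
        norms[d] += 1
    return {d: totals[d] / norms[d] for d in sorted(totals)}

# Usage (replace with tokens from a real corpus, e.g. lower-cased Wikipedia):
tokens = "the january sale ended in february , well before march".split()
print(by_cyclic_distance(month_cooccurrence(tokens)))
```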
Prior work has found that LLM representations of certain collections of words (such as words corresponding to space, time, and color - among others) exhibit simple, regular structure in their PCA components. We show this arises in simple word embedding models (word2vec) as well, and trace it back...
In our new preprint, we explain how some salient features of representational geometry in language modeling originate from a single principle - translation symmetry in the statistics of data.
arxiv.org/abs/2602.150...
With Dhruva Karkada, Daniel Korchinski, Andres Nava, & Matthieu Wyart.
Dhruva Karkada, Daniel J. Korchinski, Andres Nava, Matthieu Wyart, Yasaman Bahri: Symmetry in language statistics shapes the geometry of model representations https://arxiv.org/abs/2602.15029 https://arxiv.org/pdf/2602.15029 https://arxiv.org/html/2602.15029
How do diverse context structures reshape representations in LLMs?
In our new work, we explore this via representational straightening. We found LLMs are like a Swiss Army knife: they select different computational mechanisms reflected in different representational structures. 1/
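For readers new to the term: representational straightening is usually quantified by the curvature of the trajectory of hidden states across a sequence. A generic sketch of one common curvature measure (not necessarily the exact metric used in this work):

```python
# Sketch: average turning angle between successive hidden-state displacements;
# lower curvature means a straighter representation trajectory.
import numpy as np

def curvature_deg(states):
    """states: array of shape (T, d), one representation per sequence position."""
    v = np.diff(states, axis=0)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    cos = np.clip((v[:-1] * v[1:]).sum(axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

# Usage with made-up hidden states (a random walk is ~90 degrees in high dim).
rng = np.random.default_rng(0)
states = np.cumsum(rng.normal(size=(20, 128)), axis=0)
print(curvature_deg(states))
```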
Congratulations!
Why isn't modern AI built around principles from cognitive science or neuroscience? Starting a substack (infinitefaculty.substack.com/p/why-isnt-m...) by writing down my thoughts on that question, as part of a first series of posts on the relation between these fields. 1/3
...this work on Fri 12/5.
Surprisingly, there is great agreement with real language data (you can even see the Kronecker product structure in Wikipedia text!). As we found later, our theoretical model makes concrete some ideas put forth by the cognitive psychologist David Rumelhart. Daniel (lead author) will be presenting...
We propose a latent variable model that prescribes a particular (Kronecker product) structure for the co-occurrence probabilities of words. The eigendecomposition is analytically solvable and gives testable predictions for when, how, and why the ability to solve linear analogies emerges.
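A toy numerical sketch of the Kronecker idea (the 2x2 factors and their values are purely illustrative, not the paper's parameters): the spectrum factorizes, and embeddings built from the top modes - but not from all modes - satisfy an exact linear analogy.

```python
# Sketch: Kronecker-structured co-occurrence over two binary attributes,
# e.g. gender {man, woman} x royalty {common, royal}; word index = 2*g + r.
import numpy as np

A = np.array([[0.6, 0.4],
              [0.4, 0.6]])                      # gender factor (toy values)
B = np.array([[0.7, 0.3],
              [0.3, 0.7]])                      # royalty factor (toy values)
C = np.kron(A, B)                               # co-occurrence over 4 "words"

wa, _ = np.linalg.eigh(A)
wb, _ = np.linalg.eigh(B)
wc, Vc = np.linalg.eigh(C)

# The spectrum of the Kronecker product is all products of the factors' spectra.
print(np.allclose(np.sort(np.outer(wa, wb).ravel()), np.sort(wc)))

# Embeddings from the top 3 modes (the smallest, "interaction" mode is cut).
top = np.argsort(wc)[::-1][:3]
vecs = Vc[:, top] * np.sqrt(wc[top])
man, king, woman, queen = vecs                  # rows: man, king, woman, queen

# With the truncated spectrum the analogy offset is exact; keeping all four
# modes breaks it -- so which modes survive training/truncation matters.
print(np.allclose(king - man, queen - woman))            # True
full = Vc * np.sqrt(wc)
print(np.allclose(full[1] - full[0], full[3] - full[2])) # False
```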
can complete analogies, we felt they did not satisfactorily address some stringent empirical tests.
In arxiv.org/abs/2505.18651, with Daniel Korchinski, Dhruva, and Matthieu Wyart, we propose a new theory.
The ability to do analogical reasoning with word vectors is perhaps the simplest example of an "emergent" ability, in the sense that nontrivial computational properties arise even though the loss was never explicitly optimized for this task. While many works have tried to explain why word vectors
(with famous examples like "king is to queen as man is to woman"). Dhruva Karkada (lead author) will be presenting this work at NeurIPS on Thu 12/4.
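If you want to try the classic example yourself, the vector-offset recipe is a one-liner with any pretrained embeddings; gensim and its downloadable "glove-wiki-gigaword-50" vectors are just one convenient choice here (an assumption for illustration, not the models studied in the paper).

```python
# Sketch: complete "man is to king as woman is to ?" by vector arithmetic.
import gensim.downloader as api

kv = api.load("glove-wiki-gigaword-50")          # small pretrained word vectors
print(kv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# The top hit is typically "queen".
```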
of the co-occurrence statistics of words (a measure of two-point correlations).
Among other things, this means that the *complete eigendecomposition* (mode by mode) of co-occurrence probabilities of words is important for understanding why word vectors are able to complete simple analogies
In arxiv.org/abs/2502.09863, we show that a family of supervised loss functions, quartic in the learnable weights, captures the learning dynamics and semantic structure of word embedding models such as word2vec. This allows closed-form expressions for the full trajectory of learning in terms
I'll be missing NeurIPS this year, but we have two conference papers on the dynamics of learning and the structure of data in language modeling, a new direction I'm excited about: arxiv.org/abs/2502.09863 and arxiv.org/abs/2505.18651.
Very excited to lead this new @simonsfoundation.org collaboration on the physics of learning and neural computation, which will develop powerful tools from physics, math, CS, stats, neuro, and more to elucidate the scientific principles underlying AI. See our website for more: www.physicsoflearning.org
I'll briefly touch on arxiv.org/abs/2502.09863 (with Dhruva, Jamie, and Michael) and then discuss arxiv.org/abs/2505.18651 (with Daniel, Dhruva, and Matthieu).
My talk is "On the emergence of linear structure in word embeddings" & will cover joint works with some fantastic collaborators: Dhruva Karkada, Jamie Simon, Michael DeWeese, Daniel Korchinski, & Matthieu Wyart. I'm excited about this line of work & hope you'll find it interesting!
I'm looking forward to giving a talk tomorrow morning at the ICML workshop on High-Dimensional Learning Dynamics (HiDL) sites.google.com/view/hidimle.... Come by at 9 am!
Excited to be at the APS March Meeting this year! @apsphysics.bsky.social
I'll be giving a talk in the Tues afternoon session MAR-J58, Physics of Learning & Adaptation I.