LLMs are nothing more than models of the distribution of the word forms in their training data, with weights modified by post-training to produce somewhat different distributions.
The AI discourse sometimes seems to center on "Is AI good or is it bad?"
I find this framing unproductive. AI is not a fixed thing.
I would prefer to ask "How might we use this technology for good, and mitigate the bad?"
What a shame if the best use we can come up with is no use at all.
To kick off the PhD journey with @pseudomanifold.topology.rocks:
What are the limitations of the WL metric, and what is an informative metric?
We answer these questions with our Graph Homomorphism Distortion
arxiv.org/abs/2511.03068
@olgatticus.bsky.social, Kavir and @erikjbekkers.bsky.social
"He is from [MASK] [MASK]" โ "San York"? dLLMs fail because they ignore token dependencies. This Factorization Barrier arises from a structural misspecification: models are restricted to fully factorized outputs. We break this barrier with CoDD, enabling coherent parallel generation. ๐
Sam is a snake
time traveler from 12 months from now just sent me this
In light of the current funding situation (worldwide), a modest proposal: instead of pouring billions of dollars into GenAI claiming "it *could* accelerate science and research," consider putting 1% of that amount in what *will* accelerate science and research. Namely, funding science and research.
why do science? it won't make the model Bigger
I spent way too long trying to understand stop gradients lol arxiv.org/abs/2104.00428 (see the first appendix).
I'd argue it isn't about the loss, but rather that you're defining a surrogate loss that should optimise the true loss you're interested in.
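A minimal sketch of that surrogate-loss trick, assuming a toy Gaussian model (the names `f`, `sample`, and `log_prob` are made up, not code from the paper): the surrogate's value is not the loss we care about, but its gradient is an unbiased estimate of the gradient of the true expected loss.

```python
# Stop-gradient surrogate (score-function / REINFORCE style), illustrative only.
import jax
import jax.numpy as jnp
from jax.lax import stop_gradient

def f(x):                      # the quantity whose expectation we want to minimise
    return (x - 2.0) ** 2

def sample(theta, key):        # x ~ Normal(theta, 1)
    return theta + jax.random.normal(key)

def log_prob(theta, x):        # log N(x; theta, 1) up to an additive constant
    return -0.5 * (x - theta) ** 2

def surrogate(theta, key):
    # Treat the sample as a constant w.r.t. theta, so the only gradient path
    # is through log_prob; the result is f(x) * d/dtheta log p_theta(x).
    x = stop_gradient(sample(theta, key))
    return stop_gradient(f(x)) * log_prob(theta, x)

key = jax.random.PRNGKey(0)
grad_est = jax.grad(surrogate)(0.0, key)   # stochastic gradient of E[f(x)]
print(grad_est)
```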
X is hiring a creative writing specialist at $40 an hour to make Grok better at writing, and the qualifications are a true LOL.
New open source: cuthbert
State space models with all the hotness: (temporally) parallelisable, JAX, Kalman, SMC
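For readers unfamiliar with the model class, here is a minimal linear-Gaussian filtering sketch in JAX. This is not the cuthbert API; all matrices and names are made up, and the sequential `lax.scan` below is exactly the step that temporally parallelisable libraries replace with a parallel scan.

```python
# Kalman filter for x_t = A x_{t-1} + w_t, y_t = H x_t + v_t (illustrative only).
import jax
import jax.numpy as jnp

A = jnp.array([[1.0, 1.0], [0.0, 1.0]])   # constant-velocity dynamics
H = jnp.array([[1.0, 0.0]])               # we only observe position
Q = 0.01 * jnp.eye(2)                     # process noise covariance
R = jnp.array([[0.1]])                    # observation noise covariance

def kalman_step(carry, y):
    m, P = carry
    # predict
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q
    # update
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ jnp.linalg.inv(S)
    m_new = m_pred + K @ (y - H @ m_pred)
    P_new = (jnp.eye(2) - K @ H) @ P_pred
    return (m_new, P_new), m_new

ys = jnp.sin(jnp.arange(50.0))[:, None]            # toy observations
init = (jnp.zeros(2), jnp.eye(2))
(_, _), means = jax.lax.scan(kalman_step, init, ys)  # sequential here; such
                                                     # libraries parallelise this in time
print(means.shape)  # (50, 2)
```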
Best conference with the best people and in the best place!
Also the submission deadline is conveniently one month later than #ICML2026, just in case you needed it.
The 20th conference on Neurosymbolic AI will be in Lisbon, Portugal, September 1-4, 2026!
The CFP is out: 2026.nesyconf.org/call-for-pap... with two phases:
Deadline 1: Feb 24 (abstract), Mar 3 (full)
Deadline 2: Jun 9 (abstract), Jun 16 (full)
#neurosymbolic #NeSy2026
We introduce epiplexity, a new measure of information that provides a foundation for how to select, generate, or transform data for learning systems. We have been working on this for almost 2 years, and I cannot contain my excitement! arxiv.org/abs/2601.03220 1/7
Good call! I maintain a list of Neurosymbolic folks on Bsky, see here:
go.bsky.app/RMJ8q3i
I am recruiting 1 PhD student (4-year position) and 2 postdocs (3-year positions) to work on logic and machine learning at the University of Helsinki:
- PhD 1: jobs.helsinki.fi/job/Helsinki...
- Postdoc 1: jobs.helsinki.fi/job/Helsinki...
- Postdoc 2: jobs.helsinki.fi/job/Helsinki...
#XAI, #neurosymbolic methods #nesy and #causal #representation #learning #CRL all care about learning #interpretable #concepts, but in different ways.
We are organizing this #ICLR2026 workshop to bring these three communities together and learn from each other!
Submission deadline: 30 Jan 2026
Thanks for the fantastic talk, and totally agree! (Writing this on the train from Copenhagen :-))
Emile will present our work on Knowledge Graph Embeddings at Eurips' Salon des Refusés on Friday!
We show how linearity prevents KGEs from scaling to larger graphs, and propose a simple solution using a Mixture of Softmaxes (see the LLM literature) to break this limitation at a low parameter cost.
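For context, a minimal sketch of the Mixture-of-Softmaxes idea from the LLM literature applied to scoring entities; shapes, parameter names, and the per-component maps are illustrative assumptions, not the paper's model.

```python
# Mixture of Softmaxes (MoS) output layer sketch. A single softmax over entity
# scores is rank-limited by the embedding dimension; mixing K softmaxes breaks
# that bottleneck at a small parameter cost.
import jax
import jax.numpy as jnp

def mos_scores(query, entity_emb, mix_proj, comp_proj):
    """query: (d,), entity_emb: (num_entities, d),
    mix_proj: (d, K) mixture-weight projection, comp_proj: (K, d, d) component maps."""
    pi = jax.nn.softmax(query @ mix_proj)                      # (K,) mixture weights
    comp_queries = jnp.einsum("kde,d->ke", comp_proj, query)   # (K, d) per-component queries
    logits = comp_queries @ entity_emb.T                       # (K, num_entities)
    probs = jax.nn.softmax(logits, axis=-1)                    # one softmax per component
    return pi @ probs                                          # (num_entities,) mixture distribution

d, n, K = 16, 1000, 4
k1, k2, k3, k4 = jax.random.split(jax.random.PRNGKey(0), 4)
q = jax.random.normal(k1, (d,))
E = jax.random.normal(k2, (n, d))
W_pi = jax.random.normal(k3, (d, K))
W_c = jax.random.normal(k4, (K, d, d))
p = mos_scores(q, E, W_pi, W_c)
print(p.shape, float(p.sum()))   # (1000,) sums to ~1.0
```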
Recordings of the NeSy 2025 keynotes are now available!
Check out insightful talks from @guyvdb.bsky.social, @tkipf.bsky.social and D McGuinness on our new YouTube channel www.youtube.com/@NeSyconfere...
Topics include using symbolic reasoning for LLMs, and object-centric representations!
New paper alert!
We introduce Vision-Language Programs (VLP), a neuro-symbolic framework that combines the perceptual power of VLMs with program synthesis for robust visual reasoning.
Interested in meeting up in Copenhagen? Do shoot a message!
And finally #3
Rank bottlenecks in KGEs:
At Friday's "Salon des Refusés" I will present @sbadredd.bsky.social's new work on how rank bottlenecks limit knowledge graph embeddings
arxiv.org/abs/2506.22271
#2
GRAPES: At Tuesday's ELLIS Unconference poster session.
We study adaptive graph sampling for scaling GNNs!
Work with Taraneh Younesian, Daniel Daza, @thiviyan.bsky.social, @pbloem.sigmoid.social.ap.brid.gy
arxiv.org/abs/2310.03399
Almost off to @euripsconf.bsky.social in Copenhagen! I'll present 3 posters:
Neurosymbolic Diffusion Models: Thursday's poster session.
Going to NeurIPS? @edoardo-ponti.bsky.social and @nolovedeeplearning.bsky.social will present the paper in San Diego Thu 13:00
arxiv.org/abs/2505.13138
The simplex algorithm is super efficient. 80 years of experience says it takes a roughly linear number of pivot steps in practice. Nobody can explain _why_ it is so fast.
We invented a new algorithm analysis framework to find out.
Exactly the same here...
Want to use your favourite #NeSy model but afraid of the reasoning shortcuts?
Fear not! In our #NeurIPS2025 paper we show that you just need to equip your favourite NeSy model with prototypical networks and the reasoning shortcuts will be a problem of the past!
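For readers unfamiliar with prototypical networks, here is a minimal sketch of a prototypical concept layer; names and shapes are illustrative assumptions, not the paper's architecture.

```python
# Prototypical classification sketch: concepts are predicted by distance to
# learned class prototypes rather than by an arbitrary linear head.
import jax
import jax.numpy as jnp

def proto_log_probs(embedding, prototypes):
    """embedding: (d,) encoder output; prototypes: (num_concepts, d)."""
    sq_dists = jnp.sum((prototypes - embedding) ** 2, axis=-1)   # (num_concepts,)
    return jax.nn.log_softmax(-sq_dists)                         # closer prototype => higher prob

d, c = 8, 4
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
z = jax.random.normal(k1, (d,))            # e.g. the perception module's output
protos = jax.random.normal(k2, (c, d))     # one prototype per symbolic concept
print(jnp.exp(proto_log_probs(z, protos))) # concept distribution fed to the reasoner
```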
I'm in Suzhou to present our work on MultiBLiMP, Friday @ 11:45 in the Multilinguality session (A301)!
Come check it out if you're interested in multilingual linguistic evaluation of LLMs (there will be parse trees on the slides! There's still use for syntactic structure!)
arxiv.org/abs/2504.02768
Introducing BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data!
LLMs learn from vastly more data than humans ever experience. BabyLM challenges this paradigm by focusing on developmentally plausible data.
We extend this effort to 45 new languages!