Stefan Hartmann presenting a slide
Happening right now — @stefanhartmann.bsky.social presenting an extremely interesting case study on snowclones like »x is the new y«. 🗣️
@clausebielefeld
CompLing group (CLAUSE) at Bielefeld U (PI: Sina Zarrieß). We work on: NLG, Language & Vision, Pragmatics & Dialogue, HateSpeech, BabyLMs, DH, and more! clause-bielefeld.github.io
Tomorrow!
I have just returned from a week-long visit to Bielefeld University! Thank you very much for hosting me Sina Zarrieß and @ozgealacam.bsky.social 😊 @clausebielefeld.bsky.social
This week we’re hosting @ecekt.bsky.social as our guest in Bielefeld. She gave a highly timely talk on language+vision models, how they process images under noise conditions, and how to train a highly effective multimodal BabyLM with model merging. 🗣️👀💻
For years since the GPT-2 paper, emergent in-context learning (ICL) from 'next-token' training has been treated as something deeply tied to 𝐡𝐮𝐦𝐚𝐧 𝐥𝐚𝐧𝐠𝐮𝐚𝐠𝐞. But … is it?
AI generated image
Am I evil? Am I likeable?
Need a 10-minute break? Like fantasy? Loathe it? Take part in our study and help us by rating images of fictional characters here:
bixprag.lili.uni-bielefeld.de/publix/0aSWK...
For this week’s group colloquium, we invited Loulou Kosmala from Paris-Est Créteil University. She gave a talk on multimodal feedback during all types of conversation, from real life to virtual, from learners to adults, from L1 to L2, and more! 🤩
Dialogue Is Not Enough to Make a Communicative BabyLM (But Neither Is Developmentally Inspired Reinforcement Learning)
Francesca Padovani¹*, Bastian Bunzeck²*, Manar Ali², Omar Momen², Arianna Bisazza¹, Hendrik Buschmeier², Sina Zarrieß²
¹ Center for Language and Cognition (CLCG), University of Groningen
² CRC 1646 – Linguistic Creativity in Communication, Bielefeld University
f.padovani@rug.nl, bastian.bunzeck@uni-bielefeld.de
As part of this year's BabyLM challenge, we (researchers from @gronlp.bsky.social and @clausebielefeld.bsky.social) diverged from the established pretraining paradigm by training only on dialogue data from CHILDES.
Preprint alert! We release BabyBabelLM, a multilingual benchmark of developmentally plausible training data. I was responsible for German and Polish data as well as various child-directed wikis. Immensely rewarding project with exceptionally cool co-authors. 🥳🚀
𝐃𝐨 𝐲𝐨𝐮 𝐫𝐞𝐚𝐥𝐥𝐲 𝐰𝐚𝐧𝐭 𝐭𝐨 𝐬𝐞𝐞 𝐰𝐡𝐚𝐭 𝐦𝐮𝐥𝐭𝐢𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐞𝐟𝐟𝐨𝐫𝐭 𝐥𝐨𝐨𝐤𝐬 𝐥𝐢𝐤𝐞? 🇨🇳🇮🇩🇸🇪
Here’s the proof! 𝐁𝐚𝐛𝐲𝐁𝐚𝐛𝐞𝐥𝐋𝐌 is the first Multilingual Benchmark of Developmentally Plausible Training Data, available to the NLP community for 45 languages 🎉
arxiv.org/abs/2510.10159
Happening in an hour! 🥳
If you are at #IWCS, then you should not miss Sanne’s talk “Not Just Who or What: Modeling the Interaction of Linguistic and Annotator Variation in Hateful Word Interpretation” (Sanne Hoeken, Özge Alacam, Dong Nguyen, Massimo Poesio, Sina Zarrieß), tomorrow at 16:30! 🕟
@sannehoeken.bsky.social
Sina in front of a slide with circles of different sizes
Sina Zarrieß is giving the KONVENS keynote on training BabyLMs #nlproc
The slide compares the number of words a 12-year-old human has seen in their lifetime with the number of words typical language models have seen in training #llm
Happening now: Sina’s keynote on our BabyLM work. 🥳
Great first day at #KONVENS2015 today. Looking forward to another engaging day with a keynote by Sina Zarrieß tomorrow 🤓
@clausebielefeld.bsky.social
Don’t miss Sina’s keynote on BabyLMs at #konvens tomorrow!
Final keynote of #semdial by David Schlangen on “Meaningful Interaction with Unreal Speakers?” 😇💬
Final day at #semdial2025 #bialogue: four more presentations, one keynote, and hopefully many engaging discussions. Let's go!
Second #semdial keynote by Robert Hawkins on “Foraging for common ground”
Day 2 of #semdial starts with a session on LMs and dialogue systems 🤩
Actually yes! Dialogue differs distinctly from monologues in terms of phonetic features and in the production of novel phonetic forms!
Leonie Schade asks whether it takes two to do an articulatory tango 😁
And the second talk features contributions by our PI Sina Zarrieß. 🤩
#semdial has begun 💬
#semdial is about to begin 🥳
Program: semdial2025.github.io/program/
Proceedings: purl.org/semdial/2025...
#semdial2025, the long-awaited #bialogue conference starts tomorrow! We are looking forward to three wonderful conference days, featuring three exciting keynotes, and many oral and poster presentations on the semantics and pragmatics of dialogue. 👄💬
Check out the program and proceedings below. 👇
Let’s go!
Is simpler child-directed language easier to learn?
Check out our CoNLL paper "Do Construction Distributions Shape Formal Language Learning in German BabyLMs?"
@conll-conf.bsky.social