Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration
Multilingual speech foundation models such as Whisper are trained on web-scale data, where data for each language consists of a myriad of regional varieties. However, different regional varieties ofte...
Paper link here: arxiv.org/abs/2601.02906
Joint work with @juice500ml.bsky.social, @kalvinchang.bsky.social, Ming-Hao Hsu, @florian-eichin.com, Zhizheng Wu, Alane Suhr, @mhedderich.bsky.social, David Harwath, @davidrmortensen.bsky.social, and @barbaraplank.bsky.social!
07.01.2026 03:12
But how well does it perform? It reaches up to almost 75% accuracy in a zero-shot setting, suggesting that the romanization direction may be broadly similar across languages. Furthermore, our induced transliterations often appear more faithful to the pronunciation than the deterministic ground truth.
07.01.2026 03:11
We then find that our Latin and Cyrillic directions can be added to activations in other languages at test time to induce transliteration. Romanization examples below:
07.01.2026 03:09
We first apply this to Serbian in Latin and Cyrillic characters, and to Chinese in simplified and traditional characters. For both languages, we outperform a prompting baseline for script control in many cases.
07.01.2026 03:08
Our method is simple. We collect activations in the source and target scripts, take their difference to isolate script information, then add that difference to activations at test time to induce a script change. This is similar to the king - man + woman = queen vector arithmetic.
07.01.2026 03:06
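A minimal sketch of the difference-and-add idea described in the post above, not the paper's implementation. Random NumPy arrays stand in for Whisper activations, and averaging over examples before taking the difference is an assumption, as are the helper names `script_direction` and `steer` and the `scale` parameter.

```python
import numpy as np


def script_direction(source_acts: np.ndarray, target_acts: np.ndarray) -> np.ndarray:
    """Difference of mean activations (target minus source).

    Assumption: each array has shape (num_examples, hidden_dim) and the
    two sets differ mainly in script.
    """
    return target_acts.mean(axis=0) - source_acts.mean(axis=0)


def steer(acts: np.ndarray, direction: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Add the script direction to activations at test time."""
    return acts + scale * direction


# Toy data: the "target script" activations are shifted along dimension 0.
rng = np.random.default_rng(0)
hidden_dim = 8
offset = np.zeros(hidden_dim)
offset[0] = 3.0

source_acts = rng.normal(size=(100, hidden_dim))           # e.g. Latin-script examples
target_acts = rng.normal(size=(100, hidden_dim)) + offset  # e.g. Cyrillic-script examples

direction = script_direction(source_acts, target_acts)

# Activations from a different language, steered toward the target script.
new_acts = rng.normal(size=(5, hidden_dim))
steered = steer(new_acts, direction)
```

In a real model, the addition would happen inside the network (for example at a chosen layer during decoding); which layer and which scale to use are not specified in the post.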
✨New paper✨
We find script (e.g. Cyrillic, Latin) to be a linear direction in the activation space of Whisper, enabling transliteration at test time by adding such script directions to the activations, producing e.g. Cyrillic Japanese transcriptions.
07.01.2026 03:04
Welcome to the First Workshop on Bridging NLP and Public Opinion Research, co-located with COLM 2025, October 10, 2025, Montreal, Canada.
Excited to announce the 1st Workshop on Bridging NLP and Public Opinion Research at COLM 2025, Oct 10th in Montreal 🇨🇦
As LLMs reshape public discourse and research, collaboration between NLP and Public Opinion Research (POR) is more vital than ever. #NLPOR Submit by June 23.
tinyurl.com/nlpor25
16.05.2025 13:23
The hand-drawn sign from three years ago.
MaiNLP is turning 3 today! 🥳 We've grown a lot since @barbaraplank.bsky.social started this group with nothing but three aspiring researchers and a hand-drawn sign on the door. Huge thanks to all the amazing people who have joined or visited us since. Here's to many more years of exciting research!
01.04.2025 10:40
A bookshelf filled with various books about gesture, with a prominent book in the center titled 'Gesture: A Slim Guide' by Lauren Gawne. The book cover features a black line illustration of a person with abstract representation of eight different hands doing gestures.
It's publication day for Gesture: A Slim Guide
If you have been wanting to think about gesture in your own research, bring it into your teaching or connect with the field of Gesture Studies, this is for you. It's under 50k words and has a nifty glossary too.
24.03.2025 22:17
Language use is language change: every new conversation that is signed or spoken and every new sentence that is written is an incremental amendment in the social contract that binds a language community together.
12.02.2025 13:18
half of those keywords would be in any dialect NLP thesis lmao 🥲
04.02.2025 18:42