ALT: a Google logo with a white background and a rainbow of colors
Some personal news: I'm excited to announce I'm joining #Google as a Senior Quant UXR!
Deeply thankful for my time in academia and for every person in my network who has helped and blessed me along the way. I'm grateful and incredibly excited for this new opportunity at Google.
03.11.2025 01:31
Huge thanks to co-authors @yikeli.bsky.social, Iran R. Roman, @davidpoeppel.bsky.social, and to the Interspeech reviewers for the perfect 4/4 score!
Can't wait to present and discuss how this bridges machine and human perception! See you in Rotterdam!
02.06.2025 19:00
Key Impact 3:
This paves the way for advances in #CognitiveComputing and audio-related brain–computer interface (#BCI) applications (e.g., sound/speech reconstruction).
02.06.2025 19:00
Key Impact 2:
STM features link directly to brain processing, offering a more interpretable, biologically grounded representation.
02.06.2025 19:00
Key Impact 1:
Without any pretraining, our STM-based DNN matches popular spectrogram-based models on speech, music, and environmental sound classification.
02.06.2025 19:00
While spectrogram-based audio DNNs excel, they're often bulky, compute-heavy, hard to interpret, and data-hungry.
We explored an alternative: training a DNN on spectrotemporal modulation (#STM) features, an approach inspired by how the human auditory cortex processes sound.
02.06.2025 19:00
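The thread above does not spell out how STM features are computed, so as a rough illustration of the general idea only: spectrotemporal modulations are often approximated as the 2D Fourier transform of a (log-)spectrogram, where one axis indexes spectral modulation and the other temporal modulation. This is a common textbook simplification; the paper's exact front end may differ.

```python
import numpy as np

def stm_features(log_spectrogram: np.ndarray) -> np.ndarray:
    """Rough spectrotemporal-modulation (STM) features:
    magnitude of the 2D FFT of a (freq x time) log spectrogram.
    Rows index spectral modulation, columns temporal modulation."""
    # Remove the mean so the DC bin does not dominate the map.
    centered = log_spectrogram - log_spectrogram.mean()
    mod = np.fft.fft2(centered)
    # Shift zero modulation to the center for easier inspection.
    return np.abs(np.fft.fftshift(mod))

# Toy example: a spectrogram containing a single "ripple" has its
# energy concentrated at one (spectral, temporal) modulation rate.
f = np.arange(64)[:, None]   # frequency bins
t = np.arange(100)[None, :]  # time frames
ripple = np.cos(2 * np.pi * (4 * f / 64 + 8 * t / 100))
features = stm_features(ripple)
print(features.shape)  # (64, 100)
```

A real pipeline would start from a log-mel spectrogram of audio rather than a synthetic ripple, but the ripple makes the interpretation concrete: each STM bin measures how strongly the spectrogram oscillates at one joint rate across frequency and time.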
Our Interspeech2025 contrib (for geeks)
arxiv.org/pdf/2505.23509
Audio DNNs: impressive performance on machine listening tasks. But most representations are computationally costly & uninterpretable. Let's try something different:
31.05.2025 14:36
why DO babies dance? when do they start dancing? what counts as dancing, anyway (and how can we measure it)? out online today in CDPS, @lkcirelli.bsky.social and i attempt to integrate what is known about the development of dance
journals.sagepub.com/doi/epub/10.... (2/4)
14.03.2025 16:39
How Germany's elite research institution fails young scientists | DW Documentary
YouTube video by DW Documentary
www.youtube.com/watch?v=n5nE...
Important and painful
13.03.2025 20:58
I have emailed @interspeech.bsky.social, but it would be great if you could also reach out to them at pco@interspeech2025.org if this concerns you as well, so they understand that this will affect many people. I'm sure none of us want to be stuck writing a rebuttal in a hotel at #ICASSP!
12.03.2025 17:26
@interspeech.bsky.social just changed its rebuttal period to April 4-11, which overlaps with #ICASSP.
Given the overlap in research communities, I believe many researchers who submitted to #Interspeech2025 will also be attending #ICASSP2025. Could it be at least a week later?
12.03.2025 17:11
What's next? We are currently working on (1) refining our ML model by combining active learning and semi-supervised learning approaches and (2) experimenting with new human-computer interaction designs to mitigate negative experiences during videoconferencing. 7/end
10.03.2025 19:24
Beyond improving technical aspects like signal quality and latency of a videoconferencing system, social dynamics can deeply affect user experience. Our research paves the way for future enhancements by predicting and preventing conversational derailments in real time.
6/n
10.03.2025 19:24
One surprising insight: awkward silences, those long gaps in turn-taking, were more detrimental to conversational fluidity and enjoyment than chaotic overlaps or interruptions.
5/n
10.03.2025 19:24
We used multimodal ML on 100+ person-hours of videoconferences, modeling voice, facial expressions, and body movements. Key result: ROC-AUC 0.87 in predicting unfluid and unenjoyable moments and classifying various disruptive events, such as gaps and interruptions.
4/n
10.03.2025 19:24
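The post above reports a ROC-AUC of 0.87 for predicting unfluid and unenjoyable moments. As a reminder of what that metric means (using made-up scores, not the study's data or model), ROC-AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal stdlib sketch:

```python
def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney U formulation: the fraction of
    (positive, negative) pairs where the positive example is scored
    higher than the negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy check with hypothetical "disruption" scores:
labels = [0, 0, 1, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
print(roc_auc(labels, scores))  # 0.8333...
```

So 0.87 means that, given one disrupted and one smooth moment, the model ranks the disrupted one higher about 87% of the time, regardless of any particular decision threshold.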
Videoconferencing has become essential in our professional and personal lives, especially post-pandemic. Yet we've all experienced "derailed" moments, such as awkward pauses and uncoordinated turn-taking, that can make virtual meetings less effective and enjoyable.
3/n
10.03.2025 19:24
Perception of pitch is culturally influenced
Study on cross-cultural music perception published in Current Biology
There is an excellent cross-cultural study on this topic by @norijacoby.bsky.social. A lay summary of the paper can be found here: www.aesthetics.mpg.de/en/research/...
21.02.2025 20:05
Thanks for your comment. Yes, there are several recent studies suggesting that chroma is not really an innate or universal property of pitch perception. Our study cannot answer this question, but we did indeed find that the effect of chroma is much weaker than that of height.
21.02.2025 19:56
In short: By combining machine learning and MEG, we show how the brain's dynamic pitch representation echoes ideas proposed over 100 years ago. Feels like completing a full circle in music cognitive neuroscience! Huge thanks to my collaborators! End/n
19.02.2025 20:18
The helix model reflects the idea that pitches separated by an octave (e.g., the repeating piano keys) are perceived as inherently similar. This concept was first explored in the early 1900s by Géza Révész, laying the groundwork for modern music cognition! 6/n
19.02.2025 20:18
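The pitch helix described above combines two factors: height (position along the axis, rising with log frequency) and chroma (angle around the circle, repeating every octave). A minimal sketch of the standard textbook parameterization (not the geometry fitted in the paper):

```python
import math

def helix_coords(midi_note: int, radius: float = 1.0, rise: float = 1.0):
    """Place a pitch on the classic pitch helix:
    chroma -> angle around the circle (one full turn per octave),
    height -> position along the vertical axis."""
    chroma = midi_note % 12              # pitch class: C=0, ..., B=11
    angle = 2 * math.pi * chroma / 12.0  # one full turn per octave
    height = rise * midi_note / 12.0     # rises one unit per octave
    return (radius * math.cos(angle), radius * math.sin(angle), height)

# Octave-related pitches (C4 = MIDI 60, C5 = MIDI 72) share x and y
# (same chroma) and differ only in height:
c4, c5 = helix_coords(60), helix_coords(72)
print(c4[:2] == c5[:2], c5[2] - c4[2])  # True 1.0
```

On this geometry, octave-related notes sit directly above one another, which is exactly the "inherently similar despite distant frequency" intuition the helix is meant to capture.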
The brain doesn't process pitch in an unstructured way. Typically, it represents pitches in a mostly linear structure (think piano keyboard layout). BUT, just 0.3 seconds after hearing a sound, something wild happens: the brain briefly represents pitch in a helix-like structure! 5/n
19.02.2025 20:18
This animation shows the reconstruction of how the brain dynamically represents musical pitches. The pitches that are closer in space are perceived as more similar at a given moment. 4/n
19.02.2025 20:18
We used machine learning to decode how the brain represents musical pitches during an #MEG scan. Our model reconstructed how the brain represents the similarity between different pitches and how this representation changes over time. 3/n
19.02.2025 20:18
Why does pitch matter? It's essential not just for music, but for speech perception & sound segregation too! Understanding how our brain dynamically encodes pitch is a major research focus in auditory cognitive neuroscience. 2/n
19.02.2025 20:18
Temporally Dissociable Neural Representations of Pitch Height and Chroma
The extraction and analysis of pitch underpin speech and music recognition, sound segregation, and other auditory tasks. Perceptually, pitch can be represented as a helix composed of two factors: height and chroma.
Excited to kick off 2025 with new research in #MachineLearning, #Decoding, #MusicNeuroscience! Our paper, "Temporally Dissociable Neural Representations of Pitch Height and Chroma", now in
@sfnjournals.bsky.social
doi.org/10.1523/JNEU...
@davidpoeppel.bsky.social, @xiangbin-teng.bsky.social! 1/n
19.02.2025 20:18