
Andrew Chang

@candrew123

Job: researcher at Google; Training: auditory cognitive neuroscience. πŸ‡ΉπŸ‡ΌπŸ‡¨πŸ‡¦πŸ‡ΊπŸ‡Έ

269
Followers
98
Following
39
Posts
13.10.2023
Joined

Latest posts by Andrew Chang @candrew123

Image: Google logo

Some personal news: I'm excited to announce I'm joining #Google as a Senior Quant UXR!

Deeply thankful for my time in academia and for every person in my network who has helped and blessed me along the way. I'm grateful and incredibly excited for this new opportunity at Google.

03.11.2025 01:31 πŸ‘ 7 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Huge thanks to co-authors @yikeli.bsky.social, Iran R. Roman, @davidpoeppel.bsky.social, and to the Interspeech reviewers for the perfect 4/4 score! πŸ™Œ

Can’t wait to present and discuss how this bridges machine and human perception! See you in Rotterdam!

02.06.2025 19:00 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

πŸ’₯ Key Impact 3:
This paves the way for advances in #CognitiveComputing and audio-related brain–computer interface (#BCI) applications (e.g., sound/speech reconstruction).

02.06.2025 19:00 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

πŸ’₯ Key Impact 2:
STM features link directly to brain processing, offering a more interpretable, biologically grounded representation.

02.06.2025 19:00 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

πŸ’₯ Key Impact 1:
Without any pretraining, our STM-based DNN matches popular spectrogram-based models on speech, music, and environmental sound classification.

02.06.2025 19:00 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Post image

While spectrogram-based audio DNNs excel, they’re often bulky, compute-heavy, hard to interpret, and data-hungry.
We explored an alternative: training a DNN on spectrotemporal modulation (#STM) featuresβ€”an approach inspired by how the human auditory cortex processes sound.

02.06.2025 19:00 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
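For readers curious what an STM front end looks like in practice, here is a minimal sketch: spectrotemporal modulation energy computed as a 2D Fourier transform of a (log-frequency) spectrogram. This is a generic textbook formulation, not necessarily the paper's exact feature pipeline, and the function name and parameters are illustrative.

```python
import numpy as np

def stm_features(spectrogram, frame_rate, bins_per_octave):
    """Spectrotemporal modulation (STM) energy via a 2D FFT of a
    (log-frequency) spectrogram.

    spectrogram: 2D array, shape (freq_bins, time_frames)
    frame_rate: spectrogram frames per second
    bins_per_octave: frequency bins per octave on the log-frequency axis

    Returns (stm, rates, scales): modulation energy plus its axes --
    temporal modulation rate in Hz and spectral modulation scale in
    cycles/octave.
    """
    n_freq, n_time = spectrogram.shape
    # Remove the global mean so modulation energy, not DC, dominates.
    centered = spectrogram - spectrogram.mean()
    # 2D FFT: axis 0 -> spectral modulation, axis 1 -> temporal modulation.
    mod = np.fft.fftshift(np.fft.fft2(centered))
    stm = np.abs(mod)
    scales = np.fft.fftshift(np.fft.fftfreq(n_freq, d=1.0 / bins_per_octave))
    rates = np.fft.fftshift(np.fft.fftfreq(n_time, d=1.0 / frame_rate))
    return stm, rates, scales

# Toy example: a flat spectrum carrying a 4 Hz amplitude modulation.
frame_rate = 100  # frames/s
t = np.arange(200) / frame_rate
spec = np.outer(np.ones(48), 1.0 + np.sin(2 * np.pi * 4 * t))
stm, rates, scales = stm_features(spec, frame_rate, bins_per_octave=12)
# The modulation energy peaks at a temporal rate of |4| Hz.
peak_rate = abs(rates[np.unravel_index(stm.argmax(), stm.shape)[1]])
```

A compact 2D array like this, indexed by rate and scale rather than raw time-frequency bins, is what makes the representation cheap to feed to a small DNN and easy to relate to auditory-cortex tuning.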
Spectrotemporal Modulation: Efficient and Interpretable Feature Representation for Classifying Speech, Music, and Environmental Sounds Audio DNNs have demonstrated impressive performance on various machine listening tasks; however, most of their representations are computationally costly and uninterpretable, leaving room for optimiza...

I’m excited to share one of two papers accepted to #Interspeech2025! @interspeech.bsky.social

β€œSpectrotemporal Modulation: Efficient & Interpretable Feature Representation for Classifying Speech, Music & Environmental Sounds”
πŸ“„ Paper: arxiv.org/abs/2505.23509
#NeuroInspiredML #AudioAI

02.06.2025 19:00 πŸ‘ 3 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0

Our Interspeech2025 contrib (for geeks)
arxiv.org/pdf/2505.23509
Audio DNNs: impressive performance on machine listening tasks. But most representations are computationally costly & uninterpretable. Let's try something different:

31.05.2025 14:36 πŸ‘ 14 πŸ” 4 πŸ’¬ 1 πŸ“Œ 0
What cities can learn from the brain - Nature Human Behaviour Given its ability to manage a multitude of functions in support of survival, the dynamics and organization of the brain offer the city β€” another confluence of structures and processes β€” lessons for ur...

neuroscience x urban design? fascinating
www.nature.com/articles/s41...

24.04.2025 13:42 πŸ‘ 0 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0

why DO babies dance? when do they start dancing? what counts as dancing, anyway (and how can we measure it)? out online today in CDPS, @lkcirelli.bsky.social and i attempt to integrate what is known about the development of dance
journals.sagepub.com/doi/epub/10.... (2/4)

14.03.2025 16:39 πŸ‘ 10 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0
How Germany's elite research institution fails young scientists | DW Documentary (YouTube)

www.youtube.com/watch?v=n5nE...
Important and painful

13.03.2025 20:58 πŸ‘ 72 πŸ” 24 πŸ’¬ 0 πŸ“Œ 6
Visual adaptation stronger at horizontal than vertical meridian: Linking performance with V1 cortical surface area Visual adaptation, a mechanism that conserves bioenergetic resources by reducing energy expenditure on repetitive stimuli, leads to decreased sensitivity for similar features (e.g., orientation and sp...

Our new preprint is now on bioRxiv!
'Visual adaptation stronger at horizontal than vertical meridian: Linking performance with V1 cortical surface area'
www.biorxiv.org/content/10.1...

10.03.2025 21:50 πŸ‘ 8 πŸ” 3 πŸ’¬ 2 πŸ“Œ 0

I have emailed @interspeech.bsky.social, but it would be great if you could also reach out to them at pco@interspeech2025.org if this concerns you as well, so they understand that this will affect many people. I’m sure none of us want to be stuck writing a rebuttal in a hotel at #ICASSP!

12.03.2025 17:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

@interspeech.bsky.social just changed its rebuttal period to April 4-11, which overlaps with #ICASSP.

Given the overlap in research communities, I believe many researchers who submitted to #Interspeech2025 will also be attending #ICASSP2025. Could it be at least a week later?

12.03.2025 17:11 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

What's next? We are currently working on (1) refining our ML model by combining active learning and semi-supervised learning approaches and (2) experimenting with new human-computer interaction designs to mitigate negative experiences during videoconferencing. 7/end

10.03.2025 19:24 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Beyond improving technical aspects like signal quality and latency of a videoconferencing system, social dynamics can deeply affect user experience. Our research paves the way for future enhancements by predicting and preventing conversational derailments in real time.
6/n

10.03.2025 19:24 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Post image

One surprising insight: awkward silencesβ€”those long gaps in turn-takingβ€”were more detrimental to conversational fluidity and enjoyment than chaotic overlaps or interruptions.
5/n

10.03.2025 19:24 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Post image

We used multimodal ML on 100+ person-hours of videoconferences, modeling voice, facial expressions, and body movements. Key result: an ROC-AUC of 0.87 for predicting unfluid and unenjoyable moments and for classifying various disruptive events, such as gaps and interruptions.
4/n

10.03.2025 19:24 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
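For context on the headline metric: ROC-AUC is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below (toy data, not the paper's pipeline or features) shows that computation from raw classifier scores via the rank-sum formulation.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney U formulation: the probability
    that a random positive is scored above a random negative
    (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy check: label 1 marks a "disruptive" moment; scores are the
# model's predicted probabilities for those moments.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 is chance and 1.0 is perfect ranking, so 0.87 on held-out conversational moments indicates the multimodal features carry substantial signal.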
Post image

Videoconferencing has become essential in our professional and personal lives, especially post-pandemic. Yet we've all experienced β€œderailed” moments, such as awkward pauses and uncoordinated turn-taking, that can make virtual meetings less effective and enjoyable.
3/n

10.03.2025 19:24 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Can AI Tell Us if Those Zoom Calls Are Flowing Smoothly? New Study Gives a Thumbs Up Researchers find machine learning can predict how we rate social interactions in videoconference conversations

See my thread below, and also this press release: www.nyu.edu/about/news-p...
2/n

10.03.2025 19:24 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Multimodal Machine Learning Can Predict Videoconference Fluidity and Enjoyment Videoconferencing is now a frequent mode of communication in both professional and informal settings, yet it often lacks the fluidity and enjoyment of in-person conversation. This study leverages mult...

Excited to share that our paper, "Multimodal Machine Learning Can Predict Videoconference Fluidity and Enjoyment," has been accepted for an **oral presentation** at #ICASSP! ieeexplore.ieee.org/document/108...
@dustinfreeman.bsky.social @davidpoeppel.bsky.social
1/n

10.03.2025 19:24 πŸ‘ 2 πŸ” 2 πŸ’¬ 1 πŸ“Œ 0
Perception of pitch is culturally influenced Study on cross-cultural music perception published in Current Biology

There is an excellent cross-cultural study on this topic by @norijacoby.bsky.social. A lay summary of the paper can be found here: www.aesthetics.mpg.de/en/research/...

21.02.2025 20:05 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Thanks for your comment. Yes, there are several recent studies suggesting that chroma is not really an innate or universal property of pitch perception. Our study cannot answer this question, but we did find that the effect of chroma is much weaker than that of height.

21.02.2025 19:56 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

In short: By combining machine learning and MEG, we show how the brain’s dynamic pitch representation echoes ideas proposed over 100 years ago. Feels like completing a full circle in music cognitive neuroscience! Huge thanks to my collaborators! End/n

19.02.2025 20:18 πŸ‘ 9 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
Post image

The helix model reflects the idea that pitches separated by an octave (e.g., the repeating piano keys) are perceived as inherently similar. This concept was first explored in the early 1900s by Géza Révész, laying the groundwork for modern music cognition! 🧠🎹 6/n

19.02.2025 20:18 πŸ‘ 8 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
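The helix model itself is easy to write down: chroma is an angle around a circle (one turn per octave) and height is elevation proportional to log-frequency. A minimal sketch, with an illustrative parameterization (the radius and rise per octave are arbitrary choices, not values from the paper):

```python
import math

def helix_coords(midi, radius=1.0, rise_per_octave=1.0):
    """Map a MIDI pitch number onto the classic pitch helix:
    chroma = angle around the circle (one turn per octave),
    height = elevation proportional to log-frequency."""
    angle = 2 * math.pi * (midi % 12) / 12.0   # chroma angle
    x = radius * math.cos(angle)
    y = radius * math.sin(angle)
    z = rise_per_octave * midi / 12.0          # pitch height
    return (x, y, z)

# Octave-related pitches (C4 = MIDI 60, C5 = MIDI 72) share x, y
# (same chroma) but differ in z (height) -- the "repeating piano
# keys" intuition from Revesz.
c4 = helix_coords(60)
c5 = helix_coords(72)
```

In this geometry, octave-equivalent notes sit directly above one another, which is exactly the similarity structure the MEG decoding transiently reveals.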
Post image

The brain doesn’t process pitch in an unstructured way. Typically, it represents pitches in a mostly linear structureβ€”think piano keyboard layout. BUTβ€”just 0.3 seconds after hearing a sound, something wild happens: the brain briefly represents pitch in a helix-like structure! 5/n

19.02.2025 20:18 πŸ‘ 16 πŸ” 2 πŸ’¬ 1 πŸ“Œ 0
Video thumbnail

This animation shows the reconstruction of how the brain dynamically represents musical pitches. The pitches that are closer in space are perceived as more similar at a given moment. 4/n

19.02.2025 20:18 πŸ‘ 2 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Post image

We used machine learning to decode how the brain represents musical pitches during an #MEG scan. Our model reconstructed how the brain represents the similarity between different pitches and how this representation changes over time. 3/n

19.02.2025 20:18 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
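A generic way to turn such pairwise (dis)similarity estimates into a geometric reconstruction, as in the animation above, is classical multidimensional scaling. The sketch below is a standard textbook implementation on toy data, not the paper's actual decoding-to-geometry pipeline:

```python
import numpy as np

def classical_mds(dissim, n_dims=2):
    """Classical MDS: embed items so that Euclidean distances
    approximate a given dissimilarity matrix, via double-centering
    and eigendecomposition of the implied Gram matrix."""
    d2 = np.asarray(dissim, dtype=float) ** 2
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    b = -0.5 * j @ d2 @ j                     # Gram matrix of centered points
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:n_dims]   # largest eigenvalues first
    vals, vecs = vals[order], vecs[:, order]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

# Toy example: four "pitches" on a line; MDS recovers the collinear
# layout (up to translation and sign) from pairwise distances alone.
pts = np.array([[0.0], [1.0], [2.0], [3.0]])
dissim = np.abs(pts - pts.T)
emb = classical_mds(dissim, n_dims=1)
recovered = np.abs(emb - emb.T)
```

Run at each time point of the decoded similarity matrices, this kind of embedding is what lets a linear-versus-helical organization be visualized as it unfolds over the few hundred milliseconds after sound onset.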
Post image

Why does pitch matter? It’s essential not just for music, but for speech perception & sound segregation too! Understanding how our brain dynamically encodes pitch is a major research question in auditory cognitive neuroscience. 2/n

19.02.2025 20:18 πŸ‘ 3 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Temporally Dissociable Neural Representations of Pitch Height and Chroma The extraction and analysis of pitch underpin speech and music recognition, sound segregation, and other auditory tasks. Perceptually, pitch can be represented as a helix composed of two factors: heig...

Excited to kick off 2025 with new research in #MachineLearning, #Decoding, #MusicNeuroscience! Our paper, β€œTemporally Dissociable Neural Representations of Pitch Height and Chroma”, now in
@sfnjournals.bsky.social
doi.org/10.1523/JNEU...
@davidpoeppel.bsky.social, @xiangbin-teng.bsky.social! 🧠🎡 1/n

19.02.2025 20:18 πŸ‘ 26 πŸ” 12 πŸ’¬ 3 πŸ“Œ 1