Aaron Wenteler (@aaronw3r)

The premier conference on Machine Learning for Computational Biology is Sep 9-10 at the NY Genome Center in NYC!

Submission deadline is June 1 for 2-page abstracts and 8-page papers (eligible for proceedings track).

Registration is now open! (Link below)

Please retweet!

16.05.2025 11:26 👍 28 🔁 13 💬 2 📌 0

13/13 Thanks to all the amazing collaborators: Martina Occhetta, Nik Branson, Magdalena Huebner, Victor Curean, Will Dee, Will Connell, Alex Hawkins-Hooker, Pui Chung, Yasha Ektefaie, Amaya Gallagher-Syed (@amayags.bsky.social) and César Córdova.

02.05.2025 09:21 👍 0 🔁 0 💬 0 📌 0

GitHub - aaronwtr/PertEval: Evaluation suite for transcriptomic perturbation effect prediction models. Includes support for single-cell foundation models.

12/13 We plan to maintain and expand PertEval, creating a comprehensive benchmarking suite for the research community. Community contributions are very much encouraged!

Paper 📃: www.biorxiv.org/content/10.1...
GitHub 💻: github.com/aaronwtr/Per...

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

11/13 Looking ahead, we believe progress in this field will specifically require two key elements:
- Higher-quality data spanning a wider range of cellular states and perturbations
- Specialized models designed to fully leverage large-scale datasets for perturbation prediction

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

10/13 These findings highlight important challenges in using scFMs for perturbation effect prediction. While scFMs have potential, our results suggest that current models aren't yet optimized for this specific task.

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

9/13 Our analysis revealed that all models struggle to predict strong or atypically distributed perturbations and mostly learn average perturbation effects in a zero-shot setting. This highlights the need for training data that better represents cellular states and responses to perturbations.

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

8/13 Many perturbation prediction evaluations use 2,000 HVGs, even though most genes don't show a strong response. However, even when narrowing down to the top 20 differentially expressed genes (DEGs) per perturbation, some scFM embeddings only slightly outperformed the baseline methods, while others still didn't.
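To make the top-20 DEG restriction concrete, here is a minimal sketch. It assumes DEGs are ranked by the absolute difference in mean expression between perturbed and control cells, which is a simplification; the paper may use a proper statistical test. The function names `top_k_degs` and `mse_on_genes` are hypothetical and not part of PertEval.

```python
import numpy as np

def top_k_degs(control, perturbed, k=20):
    """Return indices of the k genes with the largest absolute mean shift.

    control, perturbed: arrays of shape (n_cells, n_genes).
    Assumption: ranking by mean difference stands in for a real DE test.
    """
    effect = perturbed.mean(axis=0) - control.mean(axis=0)
    order = np.argsort(-np.abs(effect))  # largest absolute effect first
    return order[:k]

def mse_on_genes(pred, true, gene_idx):
    """Mean squared error restricted to the selected genes."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.mean((pred[gene_idx] - true[gene_idx]) ** 2))
```

Scoring a model only on these indices focuses the evaluation on genes that actually respond, instead of averaging over 2,000 HVGs that mostly stay flat.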

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

7/13 We found that current-generation zero-shot scFM embeddings showed no significant improvement over the task-specific model GEARS or even over simple baselines when predicting perturbation effects across 2,000 highly variable genes (HVGs).
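For a sense of what a "simple baseline" can look like, here is a hedged sketch. The thread does not say which baselines were used, so this is an assumption: a predictor that ignores the perturbation identity and always returns the average effect seen in training. The names `mean_effect_baseline` and `mse` are hypothetical.

```python
import numpy as np

def mean_effect_baseline(train_effects):
    """Predict the average training effect for every held-out perturbation.

    train_effects: array of shape (n_perturbations, n_genes), each row a
    perturbed-minus-control expression change. Assumed baseline, not
    necessarily the one used in the paper.
    """
    return np.asarray(train_effects, dtype=float).mean(axis=0)

def mse(pred, true):
    """Mean squared error between a predicted and an observed effect vector."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.mean((pred - true) ** 2))
```

A learned model is only adding value if its error on held-out perturbations is clearly below what this kind of constant predictor achieves.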

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

6/13 On top of this, our framework also considers distribution shift, a frequently overlooked factor. We applied PertEval to evaluate zero-shot embeddings from several scFMs: scBERT, Geneformer, scGPT, scFoundation, and UCE.

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

5/13 PertEval-scFM includes three metrics:

- Area Under the SPECTRA Performance Curve (AUSPC)
- E-distance
- Pre-train / fine-tune cosine similarity (contextual alignment)

Each metric provides unique insights into model behaviour and robustness.
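As a rough illustration of the three metrics, here is a minimal sketch under stated assumptions. It uses the standard energy-distance form with Euclidean distances, treats AUSPC as a trapezoidal area under an error-versus-split-difficulty curve, and computes plain cosine similarity between two embedding vectors. The exact definitions in PertEval-scFM may differ, and all function names here are hypothetical.

```python
import numpy as np

def mean_pairwise_distance(a, b):
    """Mean Euclidean distance over all pairs (row of a, row of b)."""
    diffs = a[:, None, :] - b[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).mean())

def e_distance(x, y):
    """Energy distance: 2*E|X-Y| - E|X-X'| - E|Y-Y'|.

    x, y: arrays of shape (n_cells, n_features). Within-group means
    include the zero self-distances (V-statistic form).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return (2.0 * mean_pairwise_distance(x, y)
            - mean_pairwise_distance(x, x)
            - mean_pairwise_distance(y, y))

def area_under_curve(xs, ys):
    """Trapezoidal area under a performance curve.

    Assumption: xs are split-difficulty settings, ys the model error at each.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    return float(np.sum((xs[1:] - xs[:-1]) * (ys[1:] + ys[:-1]) / 2.0))

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

With error on the y-axis, a smaller area means the model degrades less as train and test perturbations become more dissimilar.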

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

4/13 Our framework introduces a standardized toolkit of metrics designed to provide a nuanced evaluation of perturbation effect prediction model performance. Such a framework facilitates meaningful comparisons across different approaches and datasets.

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

3/13 Currently, there's no agreement on how to compare different approaches for perturbation effect prediction. This makes it challenging to determine which models truly perform best, or to identify areas for improvement. PertEval-scFM aims to change that.

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

2/13 With the rapid rise of models and scFMs for this task, it's more important than ever to have standardized evaluation methods. PertEval-scFM provides a comprehensive framework to assess these AI models in predicting cellular responses to genetic perturbations.

02.05.2025 09:21 👍 0 🔁 0 💬 1 📌 0

1/13 Excited to share that PertEval-scFM got accepted into ICML 2025 🇨🇦!

We provide benchmark and evaluation tools for perturbation effect prediction models, including single-cell foundation models (scFMs). Paper and GitHub link at the end of the thread! 🧵👇

02.05.2025 09:21 👍 1 🔁 0 💬 1 📌 0

Thank you for your great work. We have a copy here at the office proudly on display 😎

27.03.2025 16:47 👍 1 🔁 0 💬 0 📌 0

Am I the only one who feels GPT 4.5 is actually worse than 4o? Its prompt adherence seems to be bad in my experience. Thought that might be beneficial for creative tasks, but even there, I feel like the outputs it generates are underwhelming compared to 4o

11.03.2025 10:55 👍 0 🔁 0 💬 0 📌 0

🔥Our paper "BioX-CPath: Biologically-driven explainable Diagnostics for Multistain IHC Computational Pathology" was accepted at #CVPR2025!

🚀We'll be releasing the paper and code repo ASAP, stay tuned! #multistain #IHC #pathology #ExplainableAI #GNNs #PrecisionMedicine #Immunolog

28.02.2025 15:06 👍 5 🔁 1 💬 1 📌 0

Announcing Evo 2: The largest publicly available AI model for biology to date, capable of understanding and designing genetic code across all three domains of life. t.co/1Zt6gQ74SA

19.02.2025 16:30 👍 41 🔁 17 💬 3 📌 3

Attended an incredible talk by @philipcball.bsky.social in Oxford. He covered a big chunk of genetics and molecular biology at lightning speed without compromising on clarity. He also convincingly explained why the central dogma is outdated. Go check out his latest book, How Life Works

15.02.2025 14:10 👍 26 🔁 6 💬 1 📌 0

PertEval-scFM: Benchmarking Single-Cell Foundation Models for Perturbation Effect Prediction (YouTube video by Valence Labs)

It was a pleasure talking about our recent single-cell foundation model benchmark, PertEval-scFM, at the Multiomics Reading Group at Mila. Many thanks to the organizers for the invitation and to @valenceai.bsky.social for sharing the talk. Check it out here: youtu.be/DCezfwQkkAE?...

04.02.2025 12:36 👍 0 🔁 0 💬 0 📌 0

Excited to attend this, looking forward to it!

29.01.2025 12:31 👍 1 🔁 0 💬 0 📌 0

Genes & Health is excited to contribute 55,000 high-quality exomes from British South Asian volunteers to gnomAD, a global genetic resource. This open-access data will advance rare disease diagnosis and treatment, thanks to our amazing volunteers. #GenesAndHealth #Genomics #RareDiseases

27.01.2025 16:34 👍 7 🔁 3 💬 0 📌 0

A month ago we @vevotherapeutics.bsky.social announced that we have generated the largest single-cell perturbation atlas in history, Tahoe-100M. Today, we announce that we will fully open-source Tahoe-100M in Feb, as part of a collaboration with NVidia health to train cell state models.

13.01.2025 16:23 👍 116 🔁 33 💬 4 📌 5

How can we build an AI virtual cell that simulates all functions and interactions of a cell? How will it transform research and drive breakthroughs in programmable biology, drug discovery and personalized medicine?

Take a look at our paper in @cellpress.bsky.social!
www.cell.com/cell/fulltex...

12.12.2024 20:09 👍 89 🔁 21 💬 3 📌 3

My first time submitting to a big ML conference. Very frustrating experience after having worked really hard to address all the reviewers’ concerns only to be met with silence once we completed and shared the results. Hoping the meta-reviews will be better

05.12.2024 00:15 👍 2 🔁 0 💬 0 📌 0

Where can I find a comprehensive and reliable resource for protein family annotations based on a gene name? I’ve explored Pfam / InterPro, but the annotations seem inconsistent or incomplete. Are there other tools or databases that provide more comprehensive or reliable annotations?

27.11.2024 18:17 👍 1 🔁 0 💬 0 📌 0

I tried looking for it but wasn’t able to find it. Thank you!

13.11.2024 23:32 👍 0 🔁 0 💬 0 📌 0

Nice to meet you Pat!

13.11.2024 23:29 👍 1 🔁 0 💬 0 📌 0

Any people here into bio x ML? #multiomics #AI #ML #genomics #proteomics #drugdiscovery

13.11.2024 22:36 👍 11 🔁 0 💬 1 📌 0
