Pascal Notin

@pascalnotin

Research in AI for Protein Design @Harvard | Prev. CS PhD @UniofOxford, Maths & Physics @Polytechnique

563 Followers | 74 Following | 13 Posts | Joined 20.11.2024

Latest posts by Pascal Notin @pascalnotin

Links:
🔗 Paper: www.biorxiv.org/content/10.1...
💻 Code: github.com/MarksLab-Das...
9/9

18.06.2025 19:35 👍 3 🔁 0 💬 0 📌 0

Congratulations to the entire RNAGym team: @rohitarorayyc.bsky.social, @murfalo.bsky.social, @christianchoe.bsky.social, @cshearer.bsky.social, Aaron Kollasch, Fiona Qu, Ruben Weitzman, Artem Gazizov, @sarahgurev.bsky.social, Erik Xie, and @deboramarks.bsky.social!
8/9

18.06.2025 19:35 👍 2 🔁 0 💬 1 📌 0

The moderate performance across all tasks reveals exciting opportunities! Key directions: RNA-specific training data, integrating structure-function relationships, and improving non-canonical base pair prediction. RNAGym provides the standardized foundation for progress.
7/9

18.06.2025 19:35 👍 0 🔁 0 💬 1 📌 0

🌀 Tertiary structure: 215 diverse 3D structures from the PDB. NuFold leads monomers (0.393 TM-score), AlphaFold3 dominates complexes (0.381 TM-score). Non-Watson-Crick interactions remain a major challenge for all methods.
6/9

18.06.2025 19:35 👍 0 🔁 0 💬 2 📌 1

🔗 Secondary structure: 901k chemical mapping profiles using DMS & 2A3 reactivity. EternaFold achieves top performance (0.656 F1-score), closely followed by CONTRAfold & Vienna. Traditional thermodynamic methods are still competitive with newer deep learning approaches.
5/9

18.06.2025 19:35 👍 2 🔁 0 💬 1 📌 0
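
For readers unfamiliar with the F1-score quoted in post 5/9: the thread does not say how RNAGym computes it from chemical mapping data, so the snippet below is only a generic sketch of one common convention, scoring predicted base pairs against a reference structure. The dot-bracket inputs and the function names `base_pairs` and `pair_f1` are illustrative assumptions, not part of the RNAGym codebase.

```python
from __future__ import annotations


def base_pairs(dot_bracket: str) -> set[tuple[int, int]]:
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack: list[int] = []
    pairs: set[tuple[int, int]] = set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' in structure")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def pair_f1(predicted: str, reference: str) -> float:
    """F1 = harmonic mean of precision and recall over base pairs."""
    pred, ref = base_pairs(predicted), base_pairs(reference)
    if not pred and not ref:
        return 1.0  # both structures are fully unpaired
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy structures: one of the two predicted pairs is correct.
score = pair_f1("((..))", "(....)")
```

Here precision is 1/2 and recall is 1/1, so `score` is 2/3. A benchmark built on per-nucleotide reactivities may instead score paired versus unpaired positions, which would give different numbers.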

🔬 Fitness prediction: 70 assays across tRNA, ribozymes, aptamers & mRNAs (1M+ mutations total). Evo 2 performs best overall (0.276), but performance varies dramatically by RNA type: RNA-FM excels at tRNA/aptamers while Evo 2 leads mRNA tasks. Lots of room for improvement across the board!
4/9

18.06.2025 19:35 👍 3 🔁 0 💬 1 📌 0

RNAGym tackles three essential RNA prediction tasks:
🔬 Fitness prediction: How mutations affect RNA function
🔗 Secondary structure: Base-pairing patterns
🌀 Tertiary structure: 3D molecular architecture
All evaluated zero-shot to test true generalization!
3/9

18.06.2025 19:35 👍 1 🔁 0 💬 1 📌 0

Why do we need this? RNA modeling faces major challenges: limited experimental data (<1% of PDB entries), structures that are inherently less stable than those of proteins, and evaluation scattered across different studies with varying approaches.
2/9

18.06.2025 19:35 👍 1 🔁 0 💬 1 📌 0

🚨 New paper 🚨 RNA modeling just got its own Gym! 🏋️ Introducing RNAGym, large-scale benchmarks for RNA fitness and structure prediction.
🧵 1/9

18.06.2025 19:35 👍 40 🔁 16 💬 1 📌 1

End-to-end differentiable homology search for protein fitness prediction.

@yaringal.bsky.social @deboramarks.bsky.social @pascalnotin.bsky.social

arxiv.org/abs/2506.089...

11.06.2025 19:00 👍 32 🔁 9 💬 0 📌 0

Pascal Notin at #VariantEffect25

21.05.2025 09:27 👍 11 🔁 2 💬 1 📌 0

But more broadly, I wanted to convey in the blog that the two (structure + MSA) are critical for proper functional protein design & effect prediction.

08.05.2025 14:25 👍 2 🔁 0 💬 0 📌 0

Thank you @delalamo.xyz! Understand where you are coming from re: design. For some design setups structure is critical -- here my point was more for a directed evolution setup where you have to select top mutants that go in the next round

08.05.2025 14:24 👍 1 🔁 0 💬 1 📌 0
Have We Hit the Scaling Wall for Protein Language Models? Beyond Scaling: What Truly Works in Protein Fitness Prediction

Even simple methods leveraging these 2 modalities significantly outperform billion-parameter sequence-only models. So, what's next? Better retrieval, advanced multimodal approaches, & alignment. Read more: pascalnotin.substack.com/p/have-we-hi... #BioTech #AI #pLMs

08.05.2025 00:29 👍 8 🔁 1 💬 1 📌 1

Have we hit a "scaling wall" for protein language models? 🤔 Our latest ProteinGym v1.3 release suggests that for zero-shot fitness prediction, simply making pLMs bigger isn't better beyond 1-4B parameters. The winning strategy? Combining MSAs & structure in multimodal models!

08.05.2025 00:29 👍 25 🔁 7 💬 1 📌 2

Large-scale discovery, analysis, and design of protein energy landscapes https://www.biorxiv.org/content/10.1101/2025.03.20.644235v1

25.03.2025 14:47 👍 10 🔁 8 💬 0 📌 1