Pranam Chatterjee (@pranam)

Could we accelerate the discovery of the next GLP-1R agonist? 🚀 Here, we introduce PepTune, a multi-objective guided discrete diffusion model that generates target-specific peptides, while optimizing their therapeutic properties! 🪐

📜: arxiv.org/abs/2412.17780
💻: huggingface.co/ChatterjeeLa...

24.12.2024 14:35 👍 14 🔁 1 💬 0 📌 0

So excited to host the 2nd GEM Workshop at ICLR 2025! 🎉 We have amazing speakers/panelists 🧑‍🔬, money for new AI+Experiment collabs 🤑, and we're partnering with @naturebiotech.bsky.social to get the best papers into review! 📜 Definitely submit your new work and see you in Singapore!! 🇸🇬

23.12.2024 19:47 👍 7 🔁 2 💬 0 📌 0

So excited to have Christian (@machine.learning.bio) join us at Duke!! 💙 We're building such an amazing AIxBio community with @rohitsingh8080.bsky.social, @alextong.bsky.social, Phil Romero, and others. ESPECIALLY in all things bio-based language models! 💻 🧬 Come join us in Durham! 😈

23.12.2024 19:05 👍 12 🔁 1 💬 0 📌 0

🚨 Current graduate students! If you're interested in developing and leveraging generative language models for therapeutics design, please apply to FutureHouse's postdoctoral fellowship and indicate my lab as an option! 😃 $125k salary and access to all of their amazing resources! 🌟

19.12.2024 16:01 👍 4 🔁 1 💬 0 📌 0

This Next Generation IVF Startup Facilitated The Birth Of A Baby For The First Time Doctors say IVF technology developed by Gameto, cofounded by Under 30 alumna Dina Radenkovic, has serious potential. Now it’s finally coming to market.

Surreal! 🤩 With co-founders Martin and Dina, we started Gameto in 2020 with just a silly graph theory algorithm I developed to predict TFs that could differentiate ovarian cells. 💻➡️🧫 Now, little Mia is here with the tech that has grown out of that work. 🐣 So proud!! 🥰
www.forbes.com/sites/alexyo...

16.12.2024 21:22 👍 19 🔁 3 💬 3 📌 1

Any AIxBio folks at NeurIPS and want to meet up with me and the lab? So many of our best collaborations have come from meetings at NeurIPS, ICML, and ICLR!! 🌟

12.12.2024 01:14 👍 7 🔁 1 💬 0 📌 0

We are so grateful to #EndAxD for funding our research leveraging generative language models to design peptide-guided degraders of dysregulated GFAP! 🙏 Please share and consider giving to this wonderful, grassroots organization. 💫 endaxd.org

#EndAxD Instagram Post: www.instagram.com/p/DC7sV2GPst...

03.12.2024 17:51 👍 4 🔁 0 💬 0 📌 0

Yes, definitely. A learned tokenizer is always more complex. The nice thing about ESM-2 is that it's a per-residue tokenization, and doesn't use BPE, SentencePiece, or some other irrelevant tokenizer. It allows us to get good residue-level embeddings. :)

02.12.2024 03:33 👍 1 🔁 0 💬 0 📌 0
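
The per-residue tokenization mentioned in the reply above can be sketched roughly as follows. This is a minimal illustration, not ESM-2's actual tokenizer: the vocabulary order, the token IDs, and the function name `tokenize_per_residue` are assumptions made up for this example.

```python
# Hypothetical vocabulary: a few special tokens plus the 20 standard amino acids.
# The indices are illustrative and are NOT ESM-2's real vocabulary IDs.
SPECIAL_TOKENS = ["<cls>", "<pad>", "<eos>", "<unk>", "<mask>"]
AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")
VOCAB = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + AMINO_ACIDS)}


def tokenize_per_residue(sequence: str) -> list[int]:
    """Map each residue to exactly one token ID, framed by <cls> and <eos>."""
    ids = [VOCAB["<cls>"]]
    # Unknown letters fall back to <unk> instead of being merged or split.
    ids += [VOCAB.get(res, VOCAB["<unk>"]) for res in sequence.upper()]
    ids.append(VOCAB["<eos>"])
    return ids


peptide = "MKTAYIAK"
token_ids = tokenize_per_residue(peptide)
# One token per residue plus the two framing tokens, so output position i + 1
# lines up with residue i of the input.
assert len(token_ids) == len(peptide) + 2
```

Because no subword merging happens, every embedding the model returns maps back to a single residue, which is what makes residue-level embeddings straightforward to read off.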

I worry that during pre-training, the token embeddings ended up having quite expressive representations themselves. Using a special token would work, but you would need to really contextualize its representation, just as <mask> was during pre-training. Otherwise, I could imagine a dropoff in performance.

02.12.2024 03:23 👍 0 🔁 0 💬 1 📌 0

GitHub - pengzhangzhi/faesm: FAESM: A Drop-in Efficient Pytorch Implementation of ESM

Try out Fred's (my PhD student) reimplementation of ESM2 with FlashAttention, achieving up to 60% memory savings and 70% faster inference! 🚀 No need to change your ESM code: it’s API-compatible! github.com/pengzhangzhi...

01.12.2024 20:04 👍 42 🔁 9 💬 1 📌 2

Yes we run most of the inference pipelines on A100s and H100s. Haven’t had a problem — A6000s have been fine as well.

24.11.2024 23:04 👍 1 🔁 0 💬 0 📌 0

Ooh such a good idea!! I’ll try it! :)

23.11.2024 21:40 👍 1 🔁 0 💬 0 📌 0

Great points! I actually never liked it either and most of the time, it’s hard to effectively debug with everyone watching. 😅

23.11.2024 19:39 👍 1 🔁 0 💬 1 📌 0

Alright new BlueSky friends, need some advice! 💡 I’m teaching my Generative Models (pLMs, graph models, diffusion, etc.) class at Duke next semester, and want to mix it up! Question: should I do theory on the board ✏️+ live coding 🧑🏾‍💻, or pre-prepared slides 🖥️ with annotated code snippets?

23.11.2024 18:43 👍 9 🔁 1 💬 3 📌 0

Of course!! Will do! The biggest test will be when we down-select generated molecules based on Boltz-1 metrics and see if they work in the wet lab. 🧫

21.11.2024 01:03 👍 1 🔁 0 💬 1 📌 0

Accurate de novo design of high-affinity protein binding macrocycles using deep learning The development of macrocyclic binders to therapeutic proteins has typically relied on large-scale screening methods that are resource-intensive and provide little control over binding mode. Despite c...

New RFDiffusion-for-peptide (RFpeptide) paper from @gauravbhardwaj.bsky.social and team at @uwproteindesign.bsky.social! 🌟 Beautiful binding data on 4 highly-structured targets (pLDDT > 90)! 🙌🏾 Not too confident this would work on highly disordered targets, though. 🤔

www.biorxiv.org/content/10.1...

20.11.2024 13:03 👍 10 🔁 2 💬 1 📌 0

Yeah same. The ByteDance one, Protenix, is quite good and the engineering from them is always clean!

19.11.2024 13:05 👍 1 🔁 0 💬 1 📌 0

Yeah nothing easy about it! And the throughput is so low that it’s hard to get a good look at the hit rate of the algorithms without doing a mini display assay. Ahh such is life! 😅

19.11.2024 11:25 👍 0 🔁 0 💬 1 📌 0

We usually do some hacky ELISAs via biotinylation of the analyte and then SPR the best ones. A horridly cumbersome set of experiments. 😣

19.11.2024 11:19 👍 0 🔁 0 💬 1 📌 0

Ugh so true!! And as a lab that does peptides, how slow and expensive it is to synthesize an 18mer is insanity. 🤦🏾‍♂️ The only alternative is to His-tag purify, which also sucks. And don’t get me started with Kd analysis…still no reliable high-throughput binding affinity measurement. 😣

19.11.2024 11:13 👍 0 🔁 0 💬 2 📌 0

Agreed!! We’re using the AF3 models to validate our language model-based binder designs to structured targets (and metals, DNA, etc) prior to experimental testing, as a sort of hint on performance. But of course, the true test is in the lab for us!! 🧫

19.11.2024 11:04 👍 1 🔁 0 💬 1 📌 0

I’m curious to see how all of the new AF3 mimics perform. 🧐 My lab’s been installing them on our servers, and faster inference and ease-of-use are key for us. Boltz-1 has an early lead, but nothing beats a good frozen pLM with a structure trunk! 😅 Bc accuracy to the PDB isn’t the best metric. 🤷🏾‍♂️

19.11.2024 10:58 👍 48 🔁 4 💬 3 📌 2

Programmable protein degraders enable selective knockdown of pathogenic β-catenin subpopulations in vitro and in vivo Aberrant activation of Wnt signaling results in unregulated accumulation of cytosolic β-catenin, which subsequently enters the nucleus and promotes transcription of genes that contribute to cellular p...

Hi new followers! 🥰 You may know me from Twitter as the sequence-first, pLM guy; hope you will continue to follow my lab’s work! 🥹 While you’re here, check out my lab’s new preprint on delivering pLM-generated degraders via LNPs to degrade cytosolic β-catenin in vivo! www.biorxiv.org/content/10.1...

12.11.2024 12:33 👍 11 🔁 2 💬 0 📌 0

A strategy that seems to be useful is using heterodimeric PDBs of single proteins and cutting interfaces — there’s a bit more conformational flexibility captured, and our LMs have done better with this noisier data.

31.12.2023 12:29 👍 0 🔁 0 💬 0 📌 0

We’ve worked to create a similar dataset with minimal leakage, but to do interface prediction from pLM residue embeddings. It’s super tough and we’ve yet to find a good train/test cluster-based split that would achieve this.

31.12.2023 12:27 👍 1 🔁 0 💬 0 📌 0
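
For readers unfamiliar with the idea, a cluster-based train/test split like the one discussed above might look like the sketch below. It is only an illustration under assumptions: the `cluster_of` mapping (sequence ID to cluster ID) is hypothetical and would come from whatever sequence-clustering step is used, and this is not the lab's actual pipeline.

```python
import random
from collections import defaultdict


def cluster_split(cluster_of: dict[str, str], test_fraction: float = 0.2, seed: int = 0):
    """Assign whole clusters to train or test so no cluster spans both sets."""
    members = defaultdict(list)
    for seq_id, cluster_id in cluster_of.items():
        members[cluster_id].append(seq_id)

    # Shuffle cluster IDs deterministically, then fill the test set first.
    cluster_ids = sorted(members)
    random.Random(seed).shuffle(cluster_ids)

    target_test_size = test_fraction * len(cluster_of)
    train, test = [], []
    for cluster_id in cluster_ids:
        bucket = test if len(test) < target_test_size else train
        bucket.extend(members[cluster_id])
    return train, test


# Hypothetical cluster assignments, for illustration only.
example = {"a": "c1", "b": "c1", "c": "c2", "d": "c3", "e": "c3", "f": "c4"}
train_ids, test_ids = cluster_split(example, test_fraction=0.3)
```

Splitting by cluster rather than by individual sequence keeps near-duplicates on the same side of the split. It does not by itself remove leakage between distinct but related clusters, which is the hard part described in the post.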

Which paper is this from? I'm not certain the latent spaces are compatible here to create useful protein representations.

28.11.2023 19:28 👍 0 🔁 0 💬 1 📌 0

SaLT&PepPr is an interface-predicting language model for designing peptide-guided protein degraders ... SaLT&PepPr is a protein language model that isolates peptidic motifs from the binding interfaces of target-interacting partner sequences. These peptides are fused to an E3 ligase domain to generat...

SaLT&PepPr is published in Communications Biology! Here, we fine-tune the ESM-2 pLM to identify peptidic binding sites on target-interacting partner sequences. We fuse these "guide" peptides to E3 ubiquitin ligases to degrade disease-causing proteins! Take a read! :) www.nature.com/articles/s42...

24.10.2023 17:34 👍 4 🔁 0 💬 0 📌 0

PepMLM: Target Sequence-Conditioned Generation of Peptide Binders... Target proteins that lack accessible binding pockets and conformational stability have posed increasing challenges for drug development. Induced proximity strategies, such as PROTACs and molecular...

Happy to share our early work on generating binding peptides conditioned ONLY on the target sequence! 🌟 PepMLM masks cognate peptides at the end of target protein sequences, and tasks ESM-2 to fully reconstruct the binder region. 😷 arxiv.org/abs/2310.03842

09.10.2023 11:30 👍 5 🔁 0 💬 0 📌 1
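
The masking setup described in the post above (target sequence followed by a fully masked binder region) can be sketched as below. This is a rough illustration of the input layout only, assuming a `<mask>` token string and hypothetical helper names; it does not call ESM-2 and is not the PepMLM codebase.

```python
MASK = "<mask>"


def make_training_example(target: str, peptide: str):
    """Build tokens (target + masked binder) and labels (binder residues only)."""
    tokens = list(target) + [MASK] * len(peptide)
    # Loss would be computed only where a label is present, i.e. on the binder region.
    labels = [None] * len(target) + list(peptide)
    return tokens, labels


def make_generation_input(target: str, peptide_length: int) -> list[str]:
    """At generation time there is no known peptide, only a desired length."""
    if peptide_length <= 0:
        raise ValueError("peptide_length must be positive")
    return list(target) + [MASK] * peptide_length


# Toy sequences, for illustration only.
tokens, labels = make_training_example("MKT", "AY")
```

A masked language model trained on such examples is asked to reconstruct every masked position from the target context alone, which is why generation needs nothing but the target sequence and a chosen peptide length.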

“PepMLM: Target Sequence-Conditioned Generation of Peptide Binders via Masked Language Modeling” 🧶🧬

Fine-tunes ESM-2 network to achieve “target-conditioned de novo binder design from sequence alone”

arxiv.org/abs/2310.03842
huggingface.co/TianlaiChen/...

09.10.2023 07:42 👍 6 🔁 1 💬 0 📌 0
