Could we accelerate the discovery of the next GLP-1R agonist? Here, we introduce PepTune, a multi-objective guided discrete diffusion model that generates target-specific peptides while optimizing their therapeutic properties!
Paper: arxiv.org/abs/2412.17780
Hugging Face: huggingface.co/ChatterjeeLa...
24.12.2024 14:35
So excited to host the 2nd GEM Workshop at ICLR 2025! We have amazing speakers/panelists, money for new AI+Experiment collabs, and we're partnering with @naturebiotech.bsky.social to get the best papers into review! Definitely submit your new work and see you in Singapore!!
23.12.2024 19:47
So excited to have Christian (@machine.learning.bio) join us at Duke!! We're building such an amazing AIxBio community with @rohitsingh8080.bsky.social, @alextong.bsky.social, Phil Romero, and others. ESPECIALLY in all things bio-based language models! Come join us in Durham!
23.12.2024 19:05
Current graduate students! If you're interested in developing and leveraging generative language models for therapeutics design, please apply to FutureHouse's postdoctoral fellowship and indicate my lab as an option! $125k salary and access to all of their amazing resources!
19.12.2024 16:01
This Next Generation IVF Startup Facilitated The Birth Of A Baby For The First Time
Doctors say IVF technology developed by Gameto, cofounded by Under 30 alumna Dina Radenkovic, has serious potential. Now it's finally coming to market.
Surreal! With co-founders Martin and Dina, we started Gameto in 2020 with just a silly graph theory algorithm I developed to predict TFs that could differentiate ovarian cells. Now, little Mia is here with the tech that has grown out of that work. So proud!!
www.forbes.com/sites/alexyo...
16.12.2024 21:22
Any AIxBio folks at NeurIPS and want to meet up with me and the lab? So many of our best collaborations have come from meetings at NeurIPS, ICML, and ICLR!!
12.12.2024 01:14
We are so grateful to #EndAxD for funding our research leveraging generative language models to design peptide-guided degraders of dysregulated GFAP! Please share and consider giving to this wonderful, grassroots organization. endaxd.org
#EndAxD Instagram Post: www.instagram.com/p/DC7sV2GPst...
03.12.2024 17:51
Yes, definitely. A learned tokenizer is always more complex. The nice thing about ESM-2 is that it's a per-residue tokenization, and doesn't use BPE, SentencePiece, or some other irrelevant tokenizer. It allows us to get good residue-level embeddings. :)
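To make the per-residue point concrete, here is a minimal sketch in plain Python. The vocabulary and IDs are hypothetical and are not ESM-2's actual token table; the only thing it is meant to show is that each amino acid maps to exactly one token, so token positions line up with residue positions.

```python
# Toy per-residue tokenizer. The vocabulary and IDs are illustrative
# assumptions, not ESM-2's real token table.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SPECIAL = ["<cls>", "<pad>", "<eos>", "<unk>", "<mask>"]
VOCAB = {tok: i for i, tok in enumerate(SPECIAL + list(AMINO_ACIDS))}


def tokenize(sequence: str) -> list[int]:
    """Map each residue to one token ID, wrapped in <cls> ... <eos>."""
    body = [VOCAB.get(aa, VOCAB["<unk>"]) for aa in sequence.upper()]
    return [VOCAB["<cls>"]] + body + [VOCAB["<eos>"]]


peptide = "HAEGTFTSDV"
ids = tokenize(peptide)
# One token per residue plus the two boundary tokens, so the embedding
# at index i + 1 belongs to residue i.
assert len(ids) == len(peptide) + 2
```

With a subword tokenizer such as BPE, one token could span several residues, and that one-to-one alignment would be lost.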
02.12.2024 03:33
I worry that during pre-training, the token embeddings ended up having quite expressive representations themselves. Using a special token would work, but its representation would need to be properly contextualized, just as <mask>'s was. Otherwise, I could imagine a drop-off in performance.
02.12.2024 03:23
GitHub - pengzhangzhi/faesm: FAESM: A Drop-in Efficient Pytorch Implementation of ESM
Try out the ESM2 reimplementation with FlashAttention by Fred (my PhD student), achieving up to 60% memory savings and 70% faster inference! No need to change your ESM code: it's API-compatible! github.com/pengzhangzhi...
01.12.2024 20:04
Yes, we run most of the inference pipelines on A100s and H100s. Haven't had a problem; A6000s have been fine as well.
24.11.2024 23:04
Ooh such a good idea!! I'll try it! :)
23.11.2024 21:40
Great points! I actually never liked it either and most of the time, it's hard to effectively debug with everyone watching.
23.11.2024 19:39
Alright new BlueSky friends, need some advice! I'm teaching my Generative Models (pLMs, graph models, diffusion, etc.) class at Duke next semester, and want to mix it up! Question: should I do theory on the board + live coding, or pre-prepared slides with annotated code snippets?
23.11.2024 18:43
Of course!! Will do! The biggest test will be when we down-select generated molecules based on Boltz-1 metrics and see if they work in the wet lab.
21.11.2024 01:03
Accurate de novo design of high-affinity protein binding macrocycles using deep learning
The development of macrocyclic binders to therapeutic proteins has typically relied on large-scale screening methods that are resource-intensive and provide little control over binding mode. Despite c...
New RFDiffusion-for-peptide (RFpeptide) paper from @gauravbhardwaj.bsky.social and team at @uwproteindesign.bsky.social! Beautiful binding data on 4 highly-structured targets (pLDDT > 90)! Not too confident this would work on highly disordered targets, though.
www.biorxiv.org/content/10.1...
20.11.2024 13:03
Yeah same. The ByteDance one, Protenix, is quite good and the engineering from them is always clean!
19.11.2024 13:05
Yeah nothing easy about it! And the throughput is so low that it's hard to get a good look at the hit rate of the algorithms without doing a mini display assay. Ahh such is life!
19.11.2024 11:25
We usually do some hacky ELISAs via biotinylation of the analyte and then SPR the best ones. A horridly cumbersome set of experiments.
19.11.2024 11:19
Ugh so true!! And as a lab that does peptides, it's insanity how slow and expensive it is to synthesize an 18mer. Only alternative is to His-tag purify, which also sucks. And don't get me started on Kd analysis... still no reliable high-throughput binding affinity measurement.
19.11.2024 11:13
Agreed!! We're using the AF3 models to validate our language model-based binder designs to structured targets (and metals, DNA, etc.) prior to experimental testing, as a sort of hint on performance. But of course, the true test is in the lab for us!!
19.11.2024 11:04
I'm curious to see how all of the new AF3 mimics perform. My lab's been installing them on our servers, and faster inference and ease-of-use are key for us. Boltz-1 has an early lead, but nothing beats a good frozen pLM with a structure trunk!
Bc accuracy to the PDB isn't the best metric.
19.11.2024 10:58
A strategy that seems to be useful is using heterodimeric PDBs of single proteins and cutting interfaces: there's a bit more conformational flexibility captured, and our LMs have done better with this noisier data.
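One plausible reading of "cutting interfaces" is selecting the residues of each chain that sit close to the partner chain. The sketch below illustrates that reading only; the input layout (one representative atom per residue) and the 8 Å cutoff are assumptions, not details from this post.

```python
import math

# Hypothetical input: one representative atom (e.g. C-alpha) per residue,
# stored as (residue_index, (x, y, z)) for each of the two chains.


def interface_residues(chain_a, chain_b, cutoff=8.0):
    """Return residue indices in each chain within `cutoff` angstroms of the other chain."""
    near_a, near_b = set(), set()
    for idx_a, xyz_a in chain_a:
        for idx_b, xyz_b in chain_b:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                near_a.add(idx_a)
                near_b.add(idx_b)
    return sorted(near_a), sorted(near_b)


chain_a = [(1, (0.0, 0.0, 0.0)), (2, (3.8, 0.0, 0.0)), (3, (30.0, 0.0, 0.0))]
chain_b = [(1, (0.0, 6.0, 0.0)), (2, (50.0, 0.0, 0.0))]
print(interface_residues(chain_a, chain_b))
```

The all-pairs loop is quadratic in chain length, which is fine for a sketch; a spatial index would be the usual choice at dataset scale.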
31.12.2023 12:29
We've worked to create a similar dataset with minimal leakage, but to do interface prediction from pLM residue embeddings. It's super tough and we've yet to find a good train/test cluster-based split that would achieve this.
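For readers unfamiliar with the term, a cluster-based split assigns whole clusters to train or test so that similar sequences never straddle the two sets. The sketch below shows only those mechanics; the cluster IDs are assumed to come from some external sequence-clustering step, and it does not address the harder interface-level leakage described here.

```python
import random
from collections import defaultdict


def cluster_split(cluster_of, test_fraction=0.2, seed=0):
    """Assign whole clusters to train or test so no cluster spans both."""
    members = defaultdict(list)
    for seq_id, cluster_id in cluster_of.items():
        members[cluster_id].append(seq_id)

    clusters = sorted(members)
    random.Random(seed).shuffle(clusters)

    total = len(cluster_of)
    train, test = [], []
    for cluster_id in clusters:
        # Fill the test set first until it reaches the target size.
        bucket = test if len(test) < test_fraction * total else train
        bucket.extend(members[cluster_id])
    return train, test


# Hypothetical sequence-to-cluster assignments.
cluster_of = {"p1": "c1", "p2": "c1", "p3": "c2", "p4": "c3", "p5": "c3"}
train, test = cluster_split(cluster_of)
```

Because clusters are assigned whole, the test set can overshoot `test_fraction` when a single cluster is large.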
31.12.2023 12:27
Which paper is this from? I'm not certain the latent spaces are compatible here to create useful protein representations.
28.11.2023 19:28
"PepMLM: Target Sequence-Conditioned Generation of Peptide Binders via Masked Language Modeling"
Fine-tunes ESM-2 network to achieve "target-conditioned de novo binder design from sequence alone"
arxiv.org/abs/2310.03842
huggingface.co/TianlaiChen/...
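One way to picture target-conditioned generation with a masked language model is to append mask tokens to the target sequence and let the model fill them in. The sketch below shows only that input construction; the layout, the mask symbol, and the helper names are assumptions for illustration and are not taken from the PepMLM code.

```python
MASK = "<mask>"  # placeholder mask symbol; assumed, not taken from the PepMLM code


def build_masked_input(target: str, peptide_length: int) -> list[str]:
    """Target residues followed by `peptide_length` mask tokens to be filled in."""
    return list(target) + [MASK] * peptide_length


def fill_masks(tokens, predictions):
    """Replace mask tokens, left to right, with predicted residues."""
    filled = iter(predictions)
    return [next(filled) if tok == MASK else tok for tok in tokens]


tokens = build_masked_input("MKT", peptide_length=3)
# A real model would supply the predictions; "AGH" is a stand-in.
completed = fill_masks(tokens, "AGH")
```

The binder would then be read off from the positions that were originally masked.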
09.10.2023 07:42