Leo Zang

@leozang

Protein Designer | Share Reading Notes (AI+Protein/RNA/DNA) www.leozang.com

790 Followers · 22 Following · 54 Posts · Joined 14.11.2024

Latest posts by Leo Zang @leozang

Computational protein design
- "This Primer provides an introduction to the main approaches in computational protein design, covering both physics-based and machine-learning-based tools. It aims to be accessible to biological, physical and computer scientists alike."
www.nature.com/articles/s43...

05.03.2025 21:14 👍 6 🔁 0 💬 0 📌 0

Protein-Based Degraders: From Chemical Biology Tools to Neo-Therapeutics The nascent field of targeted protein degradation (TPD) could revolutionize biomedicine due to the ability of degrader molecules to selectively modulate disease-relevant proteins. A key limitation to ...

We describe existing platforms for protein/peptide-based ligand identification and the drug delivery systems that might be exploited for the delivery of biologic-based degraders."
Link: pubs.acs.org/doi/10.1021/...

30.01.2025 17:33 👍 0 🔁 0 💬 0 📌 0

Protein-Based Degraders: From Chemical Biology Tools to Neo-Therapeutics
- "we provide a comprehensive and critical review of studies that have used proteins and peptides to mediate the degradation and hence the functional control of otherwise challenging disease-relevant protein targets.

30.01.2025 17:33 👍 2 🔁 0 💬 1 📌 0

-- aim to approximate soft optimal denoising processes (a.k.a. policies in RL) that combine pre-trained denoising processes with value functions serving as look-ahead functions that predict from intermediate states to terminal rewards."

23.01.2025 03:41 👍 0 🔁 0 💬 0 📌 0

- "We review these methods from a unified perspective, demonstrating that current techniques -- such as Sequential Monte Carlo (SMC)-based guidance, value-based sampling, and classifier guidance

23.01.2025 03:41 👍 1 🔁 0 💬 1 📌 0

Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review
arxiv.org/abs/2501.09685

23.01.2025 03:41 👍 5 🔁 0 💬 1 📌 0
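
A toy sketch of the "pre-trained denoiser plus look-ahead value function" idea quoted in the thread above: draw several candidates from the pre-trained step, then resample one with weight exp(value / alpha). Everything here is hypothetical and chosen for illustration (the 1D `pretrained_step`, the quadratic `value`, `alpha`, `num_candidates`); it is not the tutorial's code, and real methods learn or estimate the value function.

```python
import math
import random


def pretrained_step(x, t, rng):
    """Toy stand-in for one pre-trained denoising step (hypothetical)."""
    return 0.9 * x + rng.gauss(0.0, 0.3)


def value(x, t):
    """Toy look-ahead value: predicted terminal reward from state x (hypothetical)."""
    return -(x - 2.0) ** 2


def guided_step(x, t, rng, num_candidates=16, alpha=0.5):
    """Sample candidates from the pre-trained step, then resample one
    with probability proportional to exp(value / alpha)."""
    candidates = [pretrained_step(x, t, rng) for _ in range(num_candidates)]
    scores = [value(c, t - 1) / alpha for c in candidates]
    top = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def sample(num_steps, rng, guided=True):
    """Run the toy reverse process from a standard normal start."""
    x = rng.gauss(0.0, 1.0)
    for t in range(num_steps, 0, -1):
        x = guided_step(x, t, rng) if guided else pretrained_step(x, t, rng)
    return x
```

In this sketch a smaller `alpha` follows the value function more greedily, while a larger `alpha` stays closer to the pre-trained sampler.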

- Construct full-length proteins with binding motifs and refine structures using the Rosetta FastDesign protocol and grafting (with a potential round of LigandMPNN optimization)
- Engineer and validate binders for Bcl2–venetoclax, DB3–progesterone, and PDF1–actinonin through experimental testing

18.01.2025 06:53 👍 2 🔁 0 💬 0 📌 0

- Benchmark MaSIF-neosurf against RFAA on 14 ligand-induced PPI complexes with 8,907 decoys from PDBBind
- Use MaSIF-search to predict buried surfaces and identify complementary surface fingerprints from a database of protein fragments (~640,000)

18.01.2025 06:53 👍 2 🔁 0 💬 1 📌 0

Targeting protein–ligand neosurfaces with a generalizable deep learning tool - Nature A computational deep learning approach is used to design synthetic proteins that target the neosurfaces formed by protein–ligand interactions, with applications in the development of new therapeutic m...

Targeting protein–ligand neosurfaces with a generalizable deep learning tool | @Nature
- MaSIF-neosurf can design binders for protein-ligand complexes, targeting neosurfaces (i.e., ligand-induced structural changes on the protein surface)
Link: www.nature.com/articles/s41...

18.01.2025 06:53 👍 13 🔁 4 💬 1 📌 0

- Train sequence-based models to predict the activity of regulatory elements (MPRALegNet, MPRAnn, EnformerMPRA, and SeiMPRA)
- Use MPRALegNet for TFBS combination analysis, fine-mapping, and variant effect prediction

17.01.2025 07:12 👍 0 🔁 0 💬 0 📌 0

Massively parallel characterization of transcriptional regulatory elements
- Develop an optimized lentiMPRA (lentiviral massively parallel reporter assay) method to test regulatory activity of >680,000 sequences across three cell types (HepG2, K562, WTC11)
Link: www.nature.com/articles/s41...

17.01.2025 07:12 👍 1 🔁 0 💬 1 📌 0

Integrating genetic algorithms and language models for enhanced enzyme design
academic.oup.com/bib/article/...

09.01.2025 21:58 👍 1 🔁 0 💬 0 📌 0

DNALONGBENCH: A Benchmark Suite for Long-Range DNA Prediction Tasks
www.biorxiv.org/content/10.1...
Engineering of CRISPR-Cas PAM recognition using deep learning of vast evolutionary data
www.biorxiv.org/content/10.1...

09.01.2025 21:58 👍 4 🔁 0 💬 1 📌 0

- "This review systematically summarizes recent advances in chromatin interaction matrix prediction models...This article details various models, focusing on how one-dimensional (1D) information transforms into the 3D structure chromatin interactions"

27.12.2024 05:05 👍 0 🔁 0 💬 0 📌 0

A review of deep learning models for the prediction of chromatin interactions with DNA and epigenomic profiles | @BriefingBioinfo
Link: academic.oup.com/bib/article/...

27.12.2024 05:05 👍 2 🔁 0 💬 1 📌 0

EnzymeCAGE: A Geometric Foundation Model for Enzyme Retrieval with Evolutionary Insights
www.biorxiv.org/content/10.1...
Semantic mining of functional de novo genes from a genomic language model
www.biorxiv.org/content/10.1...

19.12.2024 01:42 👍 1 🔁 0 💬 0 📌 0

Bridging Sequence-Structure Alignment in RNA Foundation Models
arxiv.org/abs/2407.11242
Mapping targetable sites on the human surfaceome for the design of novel binders
www.biorxiv.org/content/10.1...

19.12.2024 01:42 👍 2 🔁 0 💬 1 📌 0

NeuralPLexer3: Physio-Realistic Biomolecular Complex Structure Prediction with Flow Models
arxiv.org/abs/2412.10743
FlowDock: Geometric Flow Matching for Generative Protein-Ligand Docking and Affinity Prediction
arxiv.org/abs/2412.10966

19.12.2024 01:42 👍 0 🔁 0 💬 1 📌 0

Leveraging ancestral sequence reconstruction for protein representation learning
www.nature.com/articles/s42...
Guiding Generative Protein Language Models with Reinforcement Learning
arxiv.org/abs/2412.12979

19.12.2024 01:42 👍 2 🔁 0 💬 1 📌 0

Harnessing the biology of regulatory T cells to treat disease - Nature Reviews Drug Discovery Regulatory T cells keep the immune system in check to maintain homeostasis and restrain inflammation. This Review discusses strategies to harness these cells therapeutically for autoimmunity, transpla...

Harnessing the biology of regulatory T cells to treat disease
- "This Review will discuss recent advances in our understanding of human Treg cell biology, with a focus on mechanisms of action and strategies to assess outcomes of Treg cell-targeted therapies."
www.nature.com/articles/s41...

16.12.2024 19:32 👍 1 🔁 0 💬 0 📌 0

IgDesign: In vitro validated antibody design against multiple therapeutic antigens using inverse folding
www.biorxiv.org/content/10.1...

16.12.2024 03:04 👍 0 🔁 0 💬 0 📌 0

Annotation-guided Protein Design with Multi-Level Domain Alignment
arxiv.org/abs/2404.16866
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
arxiv.org/abs/2406.10391

16.12.2024 03:04 👍 0 🔁 0 💬 1 📌 0

mRNA m6A detection - Nature Reviews Methods Primers N6-methyladenosine (m6A) is an mRNA modification influencing gene expression. Advanced methodologies for mapping m6A enhance understanding of its dynamic roles and interactions. In this Primer, Moshit...

mRNA m6A detection | @MethodsPrimers
- "This Primer outlines the available tools for detecting and mapping m6A, discusses the strengths and limitations of each method and offers guidance on selecting the most suitable approach."
www.nature.com/articles/s43...

15.12.2024 19:54 👍 1 🔁 0 💬 0 📌 0

- Use gradient-based approximation to modify protein sequences so that specific concept values increase or decrease (e.g., choosing which amino acids to change to increase aromaticity).

14.12.2024 22:29 👍 0 🔁 0 💬 0 📌 0

- Train model with MLM Loss, Concept Loss (mean square error on concept embedding), and Orthogonality Loss (cosine similarity between known/unknown embeddings).

14.12.2024 22:29 👍 0 🔁 0 💬 1 📌 0
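
A minimal sketch of how the three losses in the post above could be combined into one training objective, in plain Python. The equal weights, the absolute value on the cosine term, and all function names are assumptions for illustration, not details taken from the paper.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the true masked token."""
    return -math.log(probs[target_index])


def mean_squared_error(pred, target):
    """MSE between predicted and annotated concept values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def total_loss(masked_token_probs, masked_targets,
               concept_pred, concept_target,
               known_emb, unknown_emb,
               w_concept=1.0, w_orth=1.0):
    """MLM loss + concept loss + orthogonality loss (weights are assumed)."""
    mlm = sum(
        cross_entropy(probs, target)
        for probs, target in zip(masked_token_probs, masked_targets)
    ) / len(masked_targets)
    concept = mean_squared_error(concept_pred, concept_target)
    # Assumption: penalize |cos| so the known and unknown embeddings are
    # pushed toward orthogonality (a cosine of zero).
    orth = abs(cosine_similarity(known_emb, unknown_emb))
    return mlm + w_concept * concept + w_orth * orth
```

With orthogonal embeddings and a perfect concept prediction, only the MLM term contributes to the total.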

- Add Concept Bottleneck Module (using <cls> token) and Orthogonality Network to standard BERT-like architecture.

14.12.2024 22:29 👍 0 🔁 0 💬 1 📌 0

Concept Bottleneck Language Models For protein design
- Introduce CB-pLM (Concept Bottleneck Protein Language Models), ranging from 24M to 3B parameters, trained on UniRef50 and SwissProt with 718 concepts (including cluster name, biological process, and Biopython-derived features)
arxiv.org/abs/2411.06090

14.12.2024 22:29 👍 5 🔁 1 💬 1 📌 0

Benchmarking recent computational tools for DNA-binding protein identification Abstract. Identification of DNA-binding proteins (DBPs) is a crucial task in genome annotation, as it aids in understanding gene regulation, DNA replicatio

Benchmarking recent computational tools for DNA-binding protein identification
- "we conduct an unbiased benchmarking of 11 state-of-the-art computational tools as well as traditional tools such as ScanProsite, BLAST, and HMMER for identifying DBPs."
Link: academic.oup.com/bib/article/...

12.12.2024 04:14 👍 2 🔁 0 💬 0 📌 0

Title correction:

A general temperature-guided language model to design proteins of enhanced stability and activity

27.11.2024 22:59 👍 0 🔁 0 💬 0 📌 0

- Mouse level: Human-homologous protein data sourced from the OGEE database
- Cell line level: Protein essentiality data from the Project Score database, providing insights across 323 different human cell lines

27.11.2024 22:54 👍 0 🔁 0 💬 0 📌 0