Ever wonder why our HLA-specific cancer therapies are only for HLA-A*02:01 thus far? @possuhuanglab.bsky.social presents the scope of the problem at the inaugural @stanford-cancer.bsky.social AI and Cancer Research Symposium 🧬
Read our preprint here: www.biorxiv.org/content/10.1... (8/8)
Work done by Yilin Chen, @tianyu.bsky.social , Cizhang Zhao and @hkws.bsky.social . Thank you all! (7/8)
SLAE projects all-atom structures onto a smooth manifold! Unguided linear interpolation between conformations in SLAE latent space decodes to coherent intermediate structures. (6/8)
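For readers who want the interpolation step spelled out, here is a minimal sketch. It assumes each conformation has already been encoded as a per-residue latent array of shape (n_residues, latent_dim). The function name `interpolate_latents` and the array shapes are hypothetical; this is not SLAE's actual API, and decoding each intermediate back to coordinates is left to the model's decoder.

```python
import numpy as np

def interpolate_latents(z_start, z_end, n_steps):
    """Linearly interpolate between two latent arrays of identical shape.

    Returns an array of shape (n_steps, *z_start.shape) whose first and last
    entries are z_start and z_end. Hypothetical helper, not part of SLAE.
    """
    z_start = np.asarray(z_start, dtype=float)
    z_end = np.asarray(z_end, dtype=float)
    if z_start.shape != z_end.shape:
        raise ValueError("both conformations need latents of the same shape")
    # Interpolation weights t run from 0 (start) to 1 (end), inclusive.
    ts = np.linspace(0.0, 1.0, n_steps)
    return np.stack([(1.0 - t) * z_start + t * z_end for t in ts])

# Toy usage: 3 residues with 2-dimensional latents, 5 interpolation steps.
z_a = np.zeros((3, 2))
z_b = np.ones((3, 2))
path = interpolate_latents(z_a, z_b, n_steps=5)
```

Each entry of `path` would then be passed to the decoder to obtain an intermediate structure.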
SLAE extends our generative coverage assessment SHAPES to all-atom, per-residue-type granularity. Now we can compare de novo all-atom protein design models and spot residue-level environment biases. (5/8)
Rich in atomic-environment signal, SLAE features outperform PLMs and task-specific models across diverse, challenging downstream tasks, including binding affinity, thermostability and chemical shift prediction. All-atom structure pretraining is all you need! (4/8)
The SLAE latent landscape is organized in meaningful ways beyond amino acid identity. It separates residue embeddings along features including solvent accessibility, secondary structure and structural nativeness. (3/8)
We design a deliberately hard two-part task to learn compact, expressive features: a local graph encoder projects each residue's atomic interactions into a feature vector, while a global decoder learns to compose these local environment tokens into coherent macromolecules. (2/8)
Introducing SLAE, our new framework to represent all-atom protein structures with residue local chemical environment tokens!
SLAE reasons over atomic interactions to recover structures and residue pairwise energetics, yielding a generalizable, physics-informed latent space. (1/8)
💻 Sampling and training code for Protpardelle-1c is now available: github.com/ProteinDesig...
Feedback and requests are welcome!
Code will be released soon on our GitHub: github.com/ProteinDesig...
Preprint: www.biorxiv.org/content/10.1...
Have fun sampling and training!
Our new set of all-atom models can sample plausible sidechains without stage-2 sampling. Sequence-dependent partial diffusion behavior occurs when we mask the dummy atoms.
We achieve competitive results on MotifBench and the RFdiffusion/La-Proteina motif scaffolding benchmarks with both backbone-only and all-atom models, proposing scaffolds for previously unsolved problems.
We have a new collection of protein structure generative models which we call Protpardelle-1c. It builds on the original Protpardelle and is tailored for conditional generation: motif scaffolding and binder generation.
Paper: authors.elsevier.com/a/1lWEe8YyDf...
We include some additional analysis in the supplement, including secondary structure distributions.
SHAPES now published in Cell Systems!
FAMPNN architecture
All-atom fixed backbone protein sequence design with FAMPNN
@richardshuai.bsky.social Talal Widatalla @possuhuanglab.bsky.social @brianhie.bsky.social
www.biorxiv.org/content/10.1...
I'm organizing a Keystone symposium, along with Liz Kellogg and @possuhuanglab.bsky.social, on machine learning and macromolecules. Mar 23-26 in Keystone, Colorado. We have a great lineup and deadlines are coming up soon!
Generative models capture a biased set of protein structure space
Generative models do not capture the full expressivity of PDB structures
Protein structure embeddings reveal undersampled and de novo structure space
A framework for evaluating how well generative models of protein structure match the distribution of natural structures.
@possuhuanglab.bsky.social
www.biorxiv.org/content/10.1...
Preprint: www.biorxiv.org/content/10.1...
Code: github.com/ProteinDesig...
Dataset: zenodo.org/records/1458...
Our supplement has many additional figures of the rasterized protein structure space, stratified by designable and not designable and spatially organized by ESM3 and ProtDomainSegmentor embeddings.
One consequence of unbiased sampling of protein structure space is a higher likelihood of finding TERtiary Motifs (TERMs) which involve complex loops, with implications for functional protein design (see Figure 5 legend for group labels).
Inspired by the FPD metric in EvoDiff for protein sequence distributions, we compute Fréchet distance using protein structure embeddings, also subsetted to designable and non-designable samples (FPD-D and FPD-ND).
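As a rough illustration of an embedding-based Fréchet distance, the sketch below fits a Gaussian to each set of embeddings and evaluates the closed-form distance between the two Gaussians, in the squared form used by FID-style metrics. Whether SHAPES computes it exactly this way is an assumption; the function name `gaussian_frechet` and the toy data are hypothetical.

```python
import numpy as np

def gaussian_frechet(x, y):
    """Squared Frechet distance between Gaussians fitted to two embedding sets.

    x, y: arrays of shape (n_samples, dim).
    d^2 = ||mu_x - mu_y||^2 + Tr(C_x) + Tr(C_y) - 2 Tr((C_x C_y)^(1/2))
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    # For positive semi-definite C_x and C_y, the eigenvalues of C_x C_y are
    # real and non-negative, so Tr((C_x C_y)^(1/2)) is the sum of their square
    # roots. Clip small negative values caused by floating-point error.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(cov_x) + np.trace(cov_y) - 2.0 * trace_sqrt)

# Toy usage: stand-in "reference" and "generated" embeddings (random data).
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 4))
generated = reference + np.array([3.0, 0.0, 0.0, 0.0])  # same spread, shifted mean
score = gaussian_frechet(reference, generated)
```

Computing the same score separately on the designable and non-designable subsets of the generated samples would give analogues of FPD-D and FPD-ND.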
New preprint from our group! We propose SHAPES, a set of metrics to quantify the distributional coverage of generative models of protein structures with embeddings at different structural hierarchies and quantify undersampling / extrapolation behaviors.
This is a clever way to use synthetic biology: taking a toxin that overactivates the immune system (superantigen), rationally modifying its core components and transforming it into a platform of immunotherapy agents.
Congratulations to @haotiandu.bsky.social and @possuhuanglab.bsky.social on the 2 papers!
Check out these two bombshell papers from @possuhuanglab.bsky.social @stanfordmedicine.bsky.social, computational design of antigen-specific binders to MHC-I or -II, with applications to next gen targeted therapeutics 🤯
Science in 60 Seconds: Haotian Du, PhD student with Possu Huang's lab, explains her research on creating novel proteins that expand the possibilities for detecting more cancer types.
@possuhuanglab.bsky.social @haotiandu.bsky.social
Incredible work from BioE Professor Possu Huang's lab @possuhuanglab.bsky.social @haotiandu.bsky.social - pioneering novel proteins with new structures and functions, unlocking new possibilities for detecting more cancer types. Congratulations!