Neil Thomas's posts

If you're otherwise at the conference and would like to meet up, shoot me a DM!

3 months ago 1 0 0 0

Excited about AI for Biology? At NeurIPS? Come say hi to the Biohub (née EvolutionaryScale) folks in the exhibit hall!

Keep me company:
Tue 12-3
Wed 3-6

Or stop by any time for a chance to meet my incredible colleagues: roshanrao.bsky.social, ebetica.bsky.social, rsmolina.bsky.social et al.

3 months ago 3 1 1 0

Rarely does a play make you apply to grad school. RIP

3 months ago 1 0 0 0

This October I’m drawing one molecule a day inspired by proteins @rcsb.bsky.social

Day 1/31
Prompt MUSTACHE
Pdb 2QZI

Let’s start with something fun:
Mr. Potato Head’s ‘stache is made of Androgen Receptor, which binds testosterone and helps maintain his male phenotype

Next prompt: WEAVE
suggestions?

5 months ago 28 8 2 0

We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.
www.biorxiv.org/content/10.1...
(1/n)

5 months ago 174 91 4 5

slides remain the hardest modality

6 months ago 5 0 0 0
https://authors.elsevier.com/a/1lbX08YyDfuZWX

Antibodies are highly diverse, but most possible sequences are unstable or polyreactive. In this work, just published in Cell Syst., we propose a new source of data for modeling constraints from these properties. Our models show clear improvements in predicting Ab dysfunction. (1/n)
t.co/qCZERPUMPF

6 months ago 16 6 1 0

Sign up for The Tournament!

🌍 Design a PETase - real-world impact on bioremediation!
🧬 Sponsored DNA synthesis and functional screening - no need for a lab!
🤖 Sponsored ESM inference through @evolutionaryscale.bsky.social Forge - if GPUs are a barrier!
🏆 Winners get published and win up to $15K!

7 months ago 2 0 0 0

Why PETase for our tournament? In 2024, the world made about 30 million tonnes of PET plastic, most from fossil fuels.

PETase can degrade PET, but isn’t ready for industrial-scale waste. The challenge: design an improved variant that can change that.

Register by Oct 17 alignbio.org/protein-engi...

7 months ago 7 2 0 1

A benchmark dataset of 614 experimentally characterized de novo designed monomers from 11 different design studies shows that:
- deep learning structural metrics only weakly predict success
- the score distribution differs across types of structures

@grocklin.bsky.social

7 months ago 39 10 1 0

With Tom Lehrer's passing, I suppose this is a moment to share the story of the prank he played on the National Security Agency, and how it went undiscovered for nearly 60 years.

7 months ago 8,659 3,614 143 717

Stats friends... what would your estimator be if you were interested in a similar question as this study that is lighting Bluesky on fire tonight? 1/x

8 months ago 21 1 5 1

1/4
🚀 Announcing the 2025 Protein Engineering Tournament.

This year’s challenge: design PETase enzymes, which degrade the type of plastic in bottles. Can AI-guided protein design help solve the climate crisis? Let’s find out! ⬇️

#AIforBiology #ClimateTech #ProteinEngineering #OpenScience

8 months ago 23 20 1 4

We're sponsoring the use of ESM3 and ESMC to help researchers engineer improved PETase enzymes in the @AlignBio 2025 Protein Engineering Tournament.

Get started using ESMC to predict protein function and ESM3 to generate new enzymes here: github.com/evolutionary...

8 months ago 9 3 0 1

Today I remembered my first QM parameterization of a small molecule failed miserably (turn volume ON for a full experience)

11 months ago 29 4 3 1

NIH funding supporting the HMMER and Infernal software projects has been terminated. NIH states that our work, as well as all other federally funded research at Harvard, is of no benefit to the US.

9 months ago 286 231 37 47
Learning millisecond protein dynamics from what is missing in NMR spectra
Many proteins’ biological functions rely on interconversions between multiple conformations occurring at micro- to millisecond (µs-ms) timescales. A lack of standardized, large-scale experimental data ...

Next Tues (4/29) at **4:30PM** ET, we will have @ginaelnesr.bsky.social @hkws.bsky.social present "Learning millisecond protein dynamics from what is missing in NMR spectra"

Paper: biorxiv.org/content/10.1...

Sign up on our website for zoom links!

10 months ago 19 11 0 2

Thrilled to see my digital art on the cover of Trends Genet. The two binary strings represent reverse-complementary DNA sequences (00=A, 01=C, 10=G, 11=T) and the connecting rectangles represent “embeddings” learned by DNA language models. Pls check out our article as well: doi.org/10.1016/j.ti...
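
A handy property of the encoding in this post: because A=00 pairs with T=11 and C=01 pairs with G=10, the complement of a base is just its bits flipped. Here is a minimal Python sketch of that idea. The helper names (`to_bits`, `from_bits`, `reverse_complement_bits`) are made up for illustration and are not taken from the article or the cover art.

```python
# Encoding stated in the post: 00=A, 01=C, 10=G, 11=T.
ENCODE = {"A": "00", "C": "01", "G": "10", "T": "11"}
DECODE = {bits: base for base, bits in ENCODE.items()}


def to_bits(seq: str) -> str:
    """Turn a DNA string into a binary string, two bits per base."""
    return "".join(ENCODE[base] for base in seq.upper())


def from_bits(bits: str) -> str:
    """Turn a binary string (two bits per base) back into DNA."""
    return "".join(DECODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def reverse_complement_bits(bits: str) -> str:
    """Reverse-complement directly on the binary string.

    Flipping every bit complements each base (A<->T, C<->G);
    reversing the order of the 2-bit codes reverses the sequence.
    """
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    codes = [flipped[i:i + 2] for i in range(0, len(flipped), 2)]
    return "".join(reversed(codes))


if __name__ == "__main__":
    strand = "AACGT"
    bits = to_bits(strand)
    rc_bits = reverse_complement_bits(bits)
    print(strand, bits)
    print(from_bits(rc_bits), rc_bits)
```

Note that the 2-bit codes are reversed as units, not the individual bits, so each base keeps its own bit order.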

11 months ago 69 13 0 1

Small proteins can be more complex than they look!

We know proteins fluctuate between different conformations, but by how much? How does it vary from protein to protein? Can highly stable domains have low-stability segments? @ajrferrari.bsky.social experimentally tested >5,000 domains to find out!

11 months ago 86 36 4 0

Gene synthesis is often the most expensive part of protein engineering with generative models.

Happy to have played a small part in this work, where Chase developed a method for precision library construction at scale, with per-gene costs as low as $1.50.

@philromero.bsky.social

11 months ago 63 23 1 0
Scalable and cost-efficient custom gene library assembly from oligopools
Advances in metagenomics, deep learning, and generative protein design have enabled broad in silico exploration of sequence space, but experimental characterization is still constrained by the cost an...

🎉 Congrats to Chase on her new preprint! She developed OMEGA, a simple method for assembling custom gene panels for as little as $1.50 per gene. Big step forward for protein engineering and design! 🧬
www.biorxiv.org/content/10.1...

11 months ago 57 14 2 3

So exciting to think what we will be able to do as we pair scaled library assembly techniques like these with ML-designed libraries and high throughput screening!

11 months ago 4 0 0 0

Protein dynamics was the first research to enchant me >10yrs ago, but I left it during my PhD bc I couldn't find big experimental data to evaluate models.

Today w @ginaelnesr.bsky.social, I'm thrilled to share the big dynamics data I've been dreaming of, and the mdl we trained w them: Dyna-1.
📝: rb.gy/de5axp

11 months ago 85 25 2 2

Protein function often depends on protein dynamics. To design proteins that function like natural ones, how do we predict their dynamics?

@hkws.bsky.social and I are thrilled to share the first big, experimental datasets on protein dynamics and our new model: Dyna-1!

🧵

11 months ago 104 38 6 5
GitHub - google-deepmind/nuclease_design: ML-guided enzyme engineering
ML-guided enzyme engineering. Contribute to google-deepmind/nuclease_design development by creating an account on GitHub.

All of our data is available! We released a deeply sampled, 55k variant library of NucB’s enzymatic function.

Get started here: github.com/google-deepm...

1 year ago 1 0 0 0
Engineering highly active and diverse nuclease enzymes by ML and high-throughput screening
YouTube video by ML for protein engineering seminar series

If you’re interested in learning more, check out our @ml4proteins.bsky.social seminar talk on this work

www.youtube.com/watch?v=eGNE...

1 year ago 1 0 1 0

This was a collaborative effort between myself, David Belanger, Lucy Colwell and our whole team: Chenling Xu, Hanson Lee, Kathleen Hirano, Kosuke Iwai, Vanja Polic, Kendra Nyberg, Kevin Hoff, Lucas Frenz, Charlie Emrich, Jun Kim, Mariya Chavarha, Abi Ramanan, Jeremy Agresti

1 year ago 0 0 1 0

This campaign was completed in 2021! Since then, the field has evolved tremendously. We’re excited about work that pushes forward:
1) Multi-objective optimization
2) Generative models (e.g. ESM3, ProGen, RFDiffusion)
3) Synergy with randomized library design
… to name a few

1 year ago 1 0 1 0

Multiple Sequence Alignments (MSAs) were also powerful for zero-shot design! Without any assay data, and even without structure or large-scale pretraining, we were able to design improved NucB variants with as many as 9 mutations from the wildtype.

1 year ago 1 0 1 0

We found that in a head-to-head comparison of ML-guided design versus high-throughput directed evolution, our ML system could design higher activity variants, with lots more diversity!

1 year ago 1 0 1 0
Neil Thomas
@countablyfinite
887 Followers 476 Following 18 Posts