Milot Mirdita

@milot

Open source #bioinformatics at Sungkyunkwan University 🇰🇷 | former Steinegger Lab @ SNU, Söding Lab @ MPI-NAT | http://mstdn.science/@milotmirdita

Followers: 2,433
Following: 811
Posts: 62
Joined: 11.07.2023

Latest posts by Milot Mirdita @milot

ProteinTTT is now easy to run on Hugging Face Spaces and Google Colab. We’ll also be presenting the paper at ICLR 2026 🇧🇷
🤗 Hugging Face Space: huggingface.co/spaces/pimen...
⚙️ Google Colab: colab.research.google.com/drive/1l_h7c...
🧵👇

05.03.2026 12:08 👍 39 🔁 9 💬 3 📌 0
Two-panel calibration plot (two benchmark dimer datasets) comparing predicted interchain contact-probability bins (x-axis) with the observed fraction of native interfacial contacts (y-axis). Points follow the diagonal, indicating close agreement between predicted probabilities and true interface-contact fractions.

My first manuscript in MPI colours! With @tothpetroczylab.bsky.social, we show that AlphaFold PAE-derived contact probabilities are well calibrated to the fraction of true interface contacts across experimentally determined protein dimers.

www.biorxiv.org/content/10.6...

04.03.2026 08:45 👍 17 🔁 7 💬 1 📌 1
Release SeqKit v2.13.0 (10-year-old birthday version) · shenwei356/seqkit Changelog SeqKit is 10 years old! SeqKit v2.13.0 - 2026-02-28 seqkit: add support for reading and writing LZ4 compression format. new command: seqkit sample2: improved seqkit sample by @stahiga....

Can't wait to release a 10-year-old birthday version for SeqKit!

- 10 years
- 2 papers, 3500 citations
- 20 contributors
- 40 subcommands
- 880 commits
- 500 issues
- 685.5K Bioconda total downloads

Thank you all, dear contributors and users!
I'll keep maintaining it.

github.com/shenwei356/s...

27.02.2026 13:25 👍 122 🔁 35 💬 6 📌 1

At the 132nd Internat. Titisee Conference on Biology 2.0: The AI Revolution in Biology & Medicine

From sequence→function models 🧬
to protein & generative structure models 🧪
to AI of cell states & perturbations 🧫

Great science, great friends, beautiful lake. Thanks @BIFonds!

27.02.2026 10:17 👍 16 🔁 2 💬 0 📌 0

New version of our preprint on bioRxiv about bioRxiv up. Now that’s what I call a revision – 6 years after the first version!
It has new data about our progress and highlights from a massive user survey. 1/n
www.biorxiv.org/content/10.1...

26.02.2026 16:05 👍 78 🔁 43 💬 1 📌 4

Can we simulate realistic evolutionary trajectories and “replay the tape of life”? In this work, we propose a flexible, generalizable deep learning framework for modeling how the entire protein sequence evolves over time while capturing complex interactions across sites. 1/n
doi.org/10.64898/202...

21.02.2026 17:13 👍 83 🔁 35 💬 3 📌 1
Annotating genomes at increased scale and resolution Nature Reviews Genetics - In this Review, Ji et al. overview how rapidly advancing experimental and computational methods are enabling improved and automated annotation of gene structure and...

Our new review on genome annotation just appeared in @naturerevgenet.bsky.social, with a particular focus on the human genome, with Hayden Ji and Mihaela Pertea: rdcu.be/e4mI1

17.02.2026 12:46 👍 24 🔁 12 💬 0 📌 0

Introducing The Structural History of Eukarya (SHE): The first proteome-scale phylogeny constructed entirely from 3D structure.
We computed 300 trillion alignments across 1,542 species to map the tree of life. 🧵👇 (1/5)

07.02.2026 08:50 👍 84 🔁 40 💬 2 📌 0
Compbio Asia

Please spread the word:

We invite applications to a two-week Computational Biology workshop in Singapore, June 14-27.

This NSF-funded workshop brings together 16-20 US grad students with international peers.
Apply by March 21: compbioasia.net
🧵 Details below:

05.02.2026 17:22 👍 3 🔁 9 💬 2 📌 1

Distance-Restraint-Guided Diffusion Models for Sampling Protein Conformational Changes and Ligand Dissociation Pathways
Tatsuki Hori, Yoshitaka Moriwaki, Ryuichiro Ishitani
www.biorxiv.org/content/10.6...
Our new preprint is out.

02.02.2026 07:52 👍 6 🔁 2 💬 0 📌 0
Multiple protein structure alignment at scale with FoldMason Protein structure is conserved beyond sequence, making multiple structural alignment (MSTA) essential for analyzing distantly related proteins. Computational prediction methods have vastly extended ou...

FoldMason is out now in @science.org. It generates accurate multiple structure alignments for thousands of protein structures in seconds. Great work by Cameron L. M. Gilchrist and @milot.bsky.social.
📄 www.science.org/doi/10.1126/...
🌐 search.foldseek.com/foldmason
💾 github.com/steineggerla...

30.01.2026 06:11 👍 300 🔁 147 💬 4 📌 3
AmpliPhy improves gene trees by adding homologs without affecting alignments
In phylogenomics, gene tree reconstruction depends on multiple sequence alignment (MSA) and tree inference, and ongoing work continues to improve inference quality. Denser taxon sampling has been associated with improved gene tree inference, suggesting that adding homologs could be a practical route to higher accuracy as sequence databases continue to expand. However, adding sequences can influence multiple steps of typical inference pipelines, and little is known on its specific effect on the multiple sequence alignment, tree reconstruction, and rooting steps. We performed a large-scale empirical benchmark to quantify how homolog enrichment affects alignment and phylogenetic inference. Using an enrichment-impoverishment design and a measure of tree accuracy based on taxonomic congruence, we found that enrichment consistently improves tree inference quality, while effects on alignment quality are marginal. We show that this improvement is associated with accurate root placement on enriched trees when sensitive homolog search is accompanied. Notably, much of the benefit can be retained with relatively compact alignments produced by sequence addition. Building on these observations, we provide a tool, AmpliPhy, which efficiently improves phylogenetic reconstruction of protein families through homolog enrichment. The AmpliPhy open-source pipeline software is available at https://github.com/DessimozLab/ampliphy.
Competing Interest Statement: The authors have declared no competing interest.
Swiss National Science Foundation, https://ror.org/00yjd3n13, 216623, 10005715

Can ever-increasing sequence databases improve phylogenetic reconstruction of a gene family? Our new preprint introduces AmpliPhy, a pipeline that automates homolog enrichment to improve gene tree inference, built on a robust phylogenomic benchmark scheme. 🧵1/n
📃 doi.org/10.64898/2026.01.26.701724

28.01.2026 06:10 👍 25 🔁 14 💬 1 📌 0

Milot’s venture into establishing his own lab is incredibly exciting. I highly recommend joining Milot on his mission to advance molecular biology through open-source bioinformatics.

21.01.2026 03:37 👍 36 🔁 3 💬 0 📌 0
Mirdita Lab - Laboratory for Computational Biology & Molecular Machine Learning Mirdita Lab builds scalable bioinformatics methods.

My time in @martinsteinegger.bsky.social's group is ending, but I’m staying in Korea to build a lab at Sungkyunkwan University School of Medicine. If you or someone you know is interested in molecular machine learning and open-source bioinformatics, please reach out. I am hiring!
mirdita.org

20.01.2026 11:07 👍 104 🔁 55 💬 7 📌 1
In remembrance of Peer Bork | EMBL EMBL and its community are deeply saddened by the death of Peer Bork, the organisation’s Interim Director General.

This is very sad news

'It is with great sadness that EMBL announces that Interim Director General Professor Peer Bork passed away from natural causes on 16 January 2026.'

www.embl.org/news/embl-an...

16.01.2026 18:06 👍 30 🔁 10 💬 3 📌 2

Phold's manuscript is now available @narjournal.bsky.social thanks to @susiegriggo.bsky.social @npbhavya.bsky.social @vijinim.bsky.social @linsalrob.bsky.social @martinsteinegger.bsky.social @milot.bsky.social @eunbelivable.bsky.social & others not on bsky #phagesky academic.oup.com/nar/article/...

14.01.2026 05:10 👍 82 🔁 44 💬 1 📌 1

Happy to share that our work on HLp, a bacterial histone from Leptospira perolatii, is now published in Nature Communications 🎉

In this study, we show that HLp forms stable tetramers that wrap ~60 bp of DNA, revealing a distinct histone–DNA organization in bacteria.

www.nature.com/articles/s41...

13.12.2025 08:09 👍 55 🔁 16 💬 2 📌 1
PDBe: enhanced structural data exploration to facilitate discovery Abstract. Protein Data Bank in Europe (PDBe) is a founding member of the worldwide Protein Data Bank (wwPDB), delivering open access to experimentally dete

From Sameer Velankar & colleagues in @narjournal.bsky.social #NARDatabaseIssue | PDBe: enhanced structural data exploration to facilitate discovery | #Bioinformatics #Database #OpenScience #Proteomics #PDB 🧬 🖥️🧪🔓
⬇️
academic.oup.com/nar/advance-...

11.12.2025 15:13 👍 7 🔁 3 💬 0 📌 0

Today marks one year since the Dec. 3, 2024 martial law declaration that rocked South Korea and still reverberates today. What’s on my mind today is the grit of South Koreans who rushed to the National Assembly that night, in freezing weather, to demand a return to democratic government.

03.12.2025 03:09 👍 2165 🔁 548 💬 24 📌 21

We are deeply saddened to learn of the passing of Amos Bairoch. His vision and leadership helped build the foundations of today’s bioinformatics community. From the creation of essential biological databases to decades of mentorship, his influence can be felt across research groups worldwide.

02.12.2025 17:00 👍 4 🔁 2 💬 1 📌 1

LoL-align: sensitive and fast probabilistic protein structure alignment https://www.biorxiv.org/content/10.1101/2025.11.24.690091v1

26.11.2025 02:46 👍 12 🔁 7 💬 0 📌 0
AlphaFold Protein Structure Database 2025: a redesigned interface and updated structural coverage Abstract. The AlphaFold Protein Structure Database (AFDB; https://alphafold.ebi.ac.uk), developed by EMBL–EBI and Google DeepMind, provides open access to

From Sameer Velankar & colleagues in @narjournal.bsky.social #NARDatabaseIssue | #AlphaFold #Protein #Structure #Database 2025: a redesigned interface and updated structural coverage | #Bioinformatics #Proteomics #OpenScience #AFDB 🧪🔓 CC/ @ebi.embl.org
⬇️
academic.oup.com/nar/advance-...

24.11.2025 00:56 👍 30 🔁 14 💬 0 📌 0

A few py2Dmol updates 🧬

py2dmol.solab.org
Integration with AlphaFoldDB (will auto fetch results). Drag and drop results from AF3-server or ColabFold for interactive experience! (1/4)

19.11.2025 08:15 👍 104 🔁 31 💬 1 📌 0

Congrats Spyro!

15.11.2025 07:10 👍 1 🔁 0 💬 1 📌 0

Guess the news is officially out! Extremely excited to announce that I will be starting my own laboratory at Institut Pasteur @pasteur.fr this coming spring!

Slight change to my office window view from Tokyo Tower🗼 to the Tour Eiffel. 🇫🇷

15.11.2025 06:42 👍 112 🔁 11 💬 28 📌 0

I want to spell this out in case the implications aren't clear:

This means all public tools/webapps of GISAID data (all the ones you've been used to seeing thru the pandemic, as far as we can tell) are prohibited.

The file allowed this. Cut that - cut off all tools the public & others were using.

07.11.2025 14:41 👍 258 🔁 136 💬 2 📌 8

OpenFold3-preview (OF3p) is out: a sneak peek of our AF3-based structure prediction model. Our aim for OF3 is full AF3-parity for every modality. We now believe we have a clear path towards this goal and are releasing OF3p to enable building in the OF3 ecosystem. More👇

28.10.2025 18:30 👍 126 🔁 42 💬 1 📌 3
GitHub - bbuchfink/diamond: Accelerated BLAST compatible local sequence aligner. Accelerated BLAST compatible local sequence aligner. - bbuchfink/diamond

DIAMOND v2.1.15 now supports all taxonomy features for BLAST databases, and support for using BLAST databases has also been added to the Bioconda version github.com/bbuchfink/di...

28.10.2025 16:45 👍 16 🔁 5 💬 0 📌 0
Predicting protein complexes in biosynthetic gene clusters Biosynthetic gene clusters (BGCs) are contiguous genomic regions that encode diverse, non-homologous proteins required for the production of specific natural products. Their genetic diversity underlie...

Our new preprint is out. Our group performed a comprehensive protein–protein complex prediction within 2,437 biosynthetic gene clusters. We predicted a total of 487,828 complexes for known BGCs, identifying 15,438 heteromeric interactions with an ipTM ≥ 0.6. (2/3)
www.biorxiv.org/content/10.1...

28.10.2025 05:58 👍 25 🔁 5 💬 1 📌 2

Working on the protein-hunter-chai google colab notebook. 😈

@yehlincho.bsky.social

28.10.2025 03:34 👍 33 🔁 5 💬 0 📌 0