New post, on whether I could get Claude Code to complete a data task that had taken me AGES a decade ago…
kucharski.substack.com/p/how-much-t...
AI has huge promise for genomics -- but it has consistently failed at microbiome-based prediction.
My new post on why simple models keep winning, where deep learning actually earns its place, and where the field is headed
blekhman.substack.com/p/ai-keeps-f...
Courtesy of @martibartfast.bsky.social, we have a new release of AllTheBacteria which adds another 322,920 assemblies, covering all ENA (Illumina, isolate) prokaryotes to May 2025.
allthebacteria.readthedocs.io/en/latest/ov...
1/ LLMs are great at text extraction, but sometimes they hallucinate. A simple way to catch hallucinations is to check if the extracted text actually exists in the source. Turns out this is harder than it sounds. (new paper with Aaron Streets)
www.biorxiv.org/content/10.6...
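A minimal sketch of the basic check described in the post above, assuming the simplest possible approach: normalise whitespace and case, then test for a substring. The function names and the sample text are hypothetical and are not taken from the paper, which argues the real problem is harder than this.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_grounded(extracted: str, source: str) -> bool:
    """Return True if the extracted span occurs in the source after normalisation.

    Hypothetical helper for illustration; an empty extraction is never grounded.
    """
    needle = normalize(extracted)
    return bool(needle) and needle in normalize(source)


# Made-up source text with a line break inside the phrase of interest.
source = "Samples were sequenced on an Illumina\nNovaSeq 6000 at 2 x 150 bp."

print(is_grounded("Illumina NovaSeq 6000", source))  # found once the line break is normalised
print(is_grounded("Illumina MiSeq", source))         # not in the source, so flagged
```

Even this toy case shows one reason a plain `in` test is not enough: the line break in the source defeats an exact substring match until whitespace is normalised.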
This article is now published! academic.oup.com/nargab/artic...
We’ve added a few new analyses. First off, we show that, while gene presence absence variation (PAV) scales with evolutionary distance in both plants and animals, the base level and rate of accrual are both twice as high in plants.
🦠🧬🖥️ Bakta v1.12.0 is out
with tons of tiny improvements and bug fixes, too many to list all:
- partial genes on linear seqs
- improved error handling & runtimes
- support Python 3.12 & 3.13
- ...
A huge shout out and thank you to all bug reporters and contributors!
github.com/oschwengers/...
OOOOH
Researchfish
Is going extinct 🐡
help.researchfish.com/en_US/resear...
Deadline for registration is Jan 31. Don’t miss it! See you all in beautiful Venice
Too many meta-analyses have findings equivalent to: “If you average the cost of a loaf of bread, car insurance for a year and a movie ticket, you get $752.36”
View from the hotel room
Poster session 2024, with Valentina Boeva, Constantin Ahlmann-Eltze and others
Wednesday afternoon hike incl. swim in the mountain river
Another view from the hotel room
Apply for the Ascona workshop "Statistical and AI methods for multi-modal multi-scale modeling of biological systems", 28 Jun-3 Jul 2026 on Monte Verità, Lago Maggiore at the foot of the Swiss Alps.
ascona2026.sciencesconf.org
🧬 Free @microbesng.bsky.social PopUp ECR symposium 🧬
UK-based MRes or PhD student, research assistant or postdoc working on microbial genomics or metagenomics?
Present your work and/or chair a session, plus excellent keynote speakers: @halllab.bsky.social and @alexmsalmeida.bsky.social.
#MNGPopUp
I don't know how United Nations, World Bank or G20 events go ahead properly in the US now
It's not safe for global participants to travel to the US, and in any case, State Dept just froze visa processing for 75 countries
www.reuters.com/world/us/us-...
I proudly identify as a fish now.
Snowy Penglais January 2026
Snowy Llandinam building January 2026
The Aberystwyth University festive reindeer with snowy coats, January 2026
Blwyddyn Newydd Dda! Happy New Year!
Blogpost: I took an oral probiotic for a month and did microbiome sequencing at @Plasmidsaurus.
blog.booleanbiotech.com/oral-microbi...
A picture is worth 1000 words...
This appeared on the BBC News today, showing the increase in solar electric generation in the UK.
Not sure who produced it, but genuinely think this is a genius piece of scientific communication - the construct and choice of colour scale is near-perfect.
Chapeau!
These are pedagogical problems in addition to the commonly known ones (bias, hallucinations, etc). The one I dislike the most is the distrust it causes and the adversarial environment that is created, causing us to limit and invigilate assessment instead of setting more expansive challenges.
Hannah Dee and I did a talk about 6 of the pedagogical problems when using generative AI in the classroom, at our university's teaching and learning conference at the end of December. She's written an excellent summary of the talk on her blog www.hannahdee.wales/blog/?p=2039
Releasing alignism, a small tool that I have found useful for doing multiple sequence alignment in browser.
hgbrian.github.io/alignism/
- The hard work was done by the awesome biowasm team!
- Does tree building too
- V fast compared to e.g., muscle on EBI
- Not tested that much!
💾 any2fasta 0.8.1 is released!
The FASTA format is now 40 years old (Pearson & Lipman) and any2fasta makes it easy for your scripts and pipelines that accept FASTA to also accept other formats, even if compressed! eg. .gbk.gz
#bioinformatics #microbiology #genomics
github.com/tseemann/any...
And she would probably have been in her late twenties when the photos were taken. The hair style makes her look much older.
Hey folks, am looking for examples of circularised/full plasmid sequences from "unusual" bacterial species, sequenced since 2020 (as independent validation for a plasmid identification tool that was trained on refseq2020+plsdb). Any tips? #microsky
Closing out my year with a journal editor shocker 🧵
Checking new manuscripts today I reviewed a paper attributing 2 papers to me I did not write. A daft thing for an author to do of course. But intrigued I web searched up one of the titles and that's when it got real weird...
Reindeer by the Penglais reception
Christmas tree by the Penglais reception
Nadolig Llawen! Happy Christmas!
An excellent day yesterday at the @aberdlsagb.bsky.social #FestivalOfResearch. Some great presentations and posters. Super proud to supervise @coreyasteele.bsky.social (Oral), @angelrumney.bsky.social (Poster) and Michela (Poster). Even time for a #FlukeMap group photo (some are missing)
The (classical) starting point of the story is the size evolution of the SRA, which has grown so large that it is barely usable, and may become unmaintainable in the near future...
Plot from www.ncbi.nlm.nih.gov/sra/docs/sra..., which has not been updated since 2024, for unknown reasons...
💾 Prokka 1.15.6 is released!
This is the last major release of Prokka. But don't be sad, because @oschwengers.bsky.social already has an excellent replacement called Bakta you can migrate to.
#bioinformatics #microbiology #genomics
github.com/tseemann/pro...
course schedule as a table. Available at the link in the post.
I'm teaching Statistical Rethinking again starting Jan 2026. This time with live lectures, divided into Beginner and Experienced sections. Will be a lot more work for me, but I hope much better for students.
I will record lectures & all will be found at this link: github.com/rmcelreath/s...
The scikit-bio paper is online in Nature Methods! Many thanks to our collaborators, community contributors and reviewers! We couldn’t have done it without you. www.nature.com/articles/s41... #Bioinformatics #OpenSource
The 12th edition of the 2-day workshop “Data Structures in Bioinformatics” (DSB) will take place in Venice (Italy) on February 18-19th, 2026: dsb-meeting.github.io/DSB2026/