Excited to share this preprint that describes my latest work on using GPUs to accelerate processing of RNA-seq data.
The title says it all: "RNA-seq analysis in seconds using GPUs", now on bioRxiv www.biorxiv.org/content/10.6... and GitHub github.com/pachterlab/k...
Figure 1 shows the key result.
06.03.2026 19:32
Presentation of scientific work on De Bruijn graphs applied to the processing of sequencing data in biology. The picture was taken in the conference room of the University of Venice, where a screen displays a slide introducing De Bruijn graphs, with the speaker standing in front of it. Behind the screen is a large Renaissance painting that spans from floor to ceiling.
I had the opportunity to present nice results on detecting biological events in De Bruijn graphs at #DSB2026, as part of my PhD work on #Vizitig!
Thanks to the organizers and colleagues for this amazing and super-inspiring event (and @camillemrcht.bsky.social for the picture).
20.02.2026 18:34
Beautiful caveat section!
15.02.2026 16:01
For the importance of pesticides in cancer incidence, see this instead. Occupational exposures (asbestos, benzene) are in the blue bar on the right, and pesticides appear nowhere for lack of sufficient data.
www.nature.com/articles/s41...
11.02.2026 08:58
PREPRINT ALERT
I heard you were craving more combinatorics, so here is some more for y'all!
04.02.2026 17:22
For the importance of cancer risk factors, see this instead. The small light-blue area covers all occupational causes: asbestos, arsenic, etc. Pesticides appear nowhere for lack of sufficient data.
Source: Fink et al. Nature Medicine, 2026
04.02.2026 14:20
More minimizer papers!
04.02.2026 10:43
Stay tuned: We are now running Metapuccino on SRA's 1 million human transcriptomes.
02.11.2025 10:14
This ms. covers the full methodology and discusses the limits of NLP and LLMs for NGS metadata completion.
02.11.2025 10:14
Usability was a top priority: Metapuccino runs on regular computers with open-source LLMs, but can also scale up on GPUs for large datasets. All it needs is a list of SRA IDs; no pre-processed tables are required.
02.11.2025 10:14
Fiona Hak developed a clever LLM training strategy using the hardest SRA cases; the fine-tuned model is available on Hugging Face.
02.11.2025 10:14
Metapuccino fills and standardizes 19 key SRA metadata fields in human transcriptomics, using rule-based NLP and a large language model (LLM).
02.11.2025 10:14
Even simple tasks, like selecting tumor vs. normal samples for a cancer type, require expert curation across multiple tables, protocols, and abstracts.
02.11.2025 10:14
NCBI's SRA is a fantastic resource for studying the human transcriptome. But its metadata is messy: over 70% of fields are empty, and information is often inconsistent.
02.11.2025 10:14
PostDoc position in bioinformatics and artificial intelligence. PDF available upon request.
Interested in #lncRNA and #ArtificialIntelligence?
As part of our recently founded French-Korean bilateral project DHARP, we are recruiting a postdoc in bioinformatics and artificial intelligence to join our team at
@ips2parissaclay.bsky.social
Application deadline: 01/12/2025
22.10.2025 15:27
PubMed is running on autopilot during shutdown, but key independent committee has been abolished www.bmj.com/content/391/...
22.10.2025 17:55
Illustration of Burrows-Wheeler Transform and many auxiliary structures from the input string how$now$brown$cow$#
New tool "bwt-svg" for making illustrations of the BWT and the many auxiliary arrays and other structures related to it. Pyodide-based no-installation-necessary interface here: benlangmead.github.io/bwt-svg/. (H/t to @robert.bio for pointing me to pyodide!) Full repo: github.com/benlangmead/....
14.10.2025 20:48
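For readers unfamiliar with the transform that bwt-svg illustrates, here is a minimal sketch of the textbook sorted-rotations construction and its naive inverse. It is not taken from the bwt-svg repository: the function names `bwt` and `inverse_bwt` are hypothetical, and the code assumes the final character of the input (here `#`) occurs exactly once.

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler Transform via sorted rotations.

    Quadratic in time and memory, so suitable for illustration only.
    Assumes `text` ends with a terminator that appears nowhere else.
    """
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    # The BWT is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, terminator: str = "#") -> str:
    """Naive inversion: rebuild the sorted rotation matrix column by column."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepend the BWT characters to the current rows, then re-sort.
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original text is the only rotation that ends with the terminator.
    return next(row for row in table if row.endswith(terminator))


if __name__ == "__main__":
    text = "how$now$brown$cow$#"  # input string from the post's illustration
    transformed = bwt(text)
    print(transformed)
    print(inverse_bwt(transformed) == text)
```

Practical tools build the BWT from a suffix array rather than from explicit rotations; the version above only shows what the transform computes.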
The MSc. Bioinformatics students of U. Paris-Saclay are organizing the Junior Conference on Computational Biology (JC2B) 2025: AI and predictive models in bioinformatics
November 13, 2025 - I2BC, CNRS, Gif-sur-Yvette, France
Register for free: bioi2.i2bc.paris-saclay.fr/jc2b/#regist...
01.10.2025 12:38
From bacterial to human immunity.
We report in @science.org the discovery of a human homolog of SIR2 antiphage proteins that participates in the TLR pathway of animal innate immunity.
Co-led with @enzopoirier.bsky.social, by D. Bonhomme and @hugovaysset.bsky.social
www.science.org/doi/10.1126/...
24.07.2025 18:22
Congratulations to Rayan Chiki (Institut Pasteur), head of the "Sequence Bioinformatics" unit, for securing the ERC Proof of Concept 2025 for his project ENZYMINER!
@rayan.chiki.bsky.social
#Bioinformatics
24.07.2025 15:10
New ENCODE4 long-read RNA-seq transcripts track for hg38 and mm10. Triplets (e.g. [1,1,3]) indicate start site, exon combination, and stop site for each transcript. Enrichment scores show how these change across tissue and cell line samples.
Read more: genome.ucsc.edu/gold...
16.07.2025 18:27
#JOBIM2025 Mathilde Girard ends the session with a simple but effective idea: reorder the reads before using an off-the-shelf compressor to improve compression.
10.07.2025 09:34
#JOBIM2025 @bdegardins.bsky.social presents his PhD work on Vizitig, a multi-sample graph exploration tool, with a focus on RNA. This afternoon we'll do a demo on pangenomes with the same tool.
10.07.2025 08:42
OReO: optimizing read order for practical compression
Abstract. Motivation: Recent advances in high-throughput and third-generation sequencing technologies have created significant challenges in storing and mana...
Paper alert!
We present OReO, a tool that reorders long-read datasets so that they compress efficiently with ANY universal compressor, such as gz, zstd, or xz.
TLDR: You can get state-of-the-art compression WITHOUT a dedicated compressor/decompressor!
academic.oup.com/bioinformati...
A thread!
03.07.2025 10:52
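To make the idea in this post concrete, here is a toy sketch of why read order matters to a general-purpose compressor. It does not reproduce OReO's method, which is described in the paper. Everything here is an assumption for illustration: the reads are simulated from a random genome, the helper names `simulate_reads` and `compressed_size` are hypothetical, and the "reordering" simply sorts reads by their true start position, which a real tool would not know in advance. Python's `zlib` (DEFLATE, the algorithm behind gzip) stands in for the universal compressor.

```python
import random
import zlib


def simulate_reads(genome_len=200_000, read_len=500, n_reads=2_000, seed=0):
    """Sample error-free reads at random positions of a random genome (toy data)."""
    rng = random.Random(seed)
    genome = "".join(rng.choices("ACGT", k=genome_len))
    starts = [rng.randrange(genome_len - read_len) for _ in range(n_reads)]
    return [(start, genome[start:start + read_len]) for start in starts]


def compressed_size(sequences):
    """Size in bytes of the newline-joined sequences after zlib compression."""
    payload = "\n".join(sequences).encode("ascii")
    return len(zlib.compress(payload, 9))


reads = simulate_reads()

# Order in which the reads were sampled, i.e. random with respect to position.
arrival_order = [seq for _, seq in reads]

# Oracle reordering: sort by true start so that overlapping reads are adjacent.
reordered = [seq for _, seq in sorted(reads)]

size_arrival = compressed_size(arrival_order)
size_reordered = compressed_size(reordered)
print(f"arrival order: {size_arrival} bytes")
print(f"reordered:     {size_reordered} bytes")
```

DEFLATE only looks for repeats within a 32 KiB window, so overlapping reads help compression only when they sit close together in the file. That is why placing similar reads next to each other can pay off without changing the compressor.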
Preprint alert from the group: super-fast grep-like sequence selection
02.07.2025 13:38