Johan Nystrom-Persson (@jtnystrom)

Slacken and Discount are now available in Bioconda
We are happy to announce that Slacken and Discount (details) are now finally available in the Bioconda package system. This should make them much easier to install and use for many people.

Slacken (metagenomic profiler) and Discount (k-mer counter) are now available in Bioconda. Both are Spark-based and designed for extreme scalability. jnpsolutions.io/2026/03/05/s...

05.03.2026 06:24 👍 0 🔁 0 💬 0 📌 0

How would you design a *multithreaded*, *concurrent* & *dynamic* hash table if you are focused specifically on common k-mer workloads, where streaming query & insertion are common? Jamshed, Prashant and I explore this in kache-hash, a cache-friendly k-mer hash table!
www.biorxiv.org/content/10.6...

17.02.2026 18:49 👍 20 🔁 13 💬 0 📌 0

Every reputable expert I know considers mRNA vaccine technology to be one of the most revolutionary advances in medicine in our lifetimes. Its inventors won the Nobel Prize in 2023. Shutting it down now is pointless self-harm to humanity.

05.08.2025 22:54 👍 18078 🔁 6741 💬 547 📌 283

Slacken paper published in NAR Genomics and Bioinformatics
Our paper on Slacken, a metagenomic classifier based on the Kraken 2 method, was recently published in NAR Genomics and Bioinformatics. In this post I (Johan) would like to break down the main results...

Explaining in simple terms the two main results achieved by our #metagenomic classifier #Slacken - scaling independently of RAM, and sample-tailored libraries. jnpsolutions.io/2025/07/03/s...

03.07.2025 04:55 👍 0 🔁 0 💬 0 📌 0

GitHub - JNP-Solutions/Slacken: Highly scalable implementation of the Kraken 2 genomic sequence classification method. Based on Apache Spark.

Slacken is available on GitHub (github.com/JNP-Solution...) and reference libraries are available on S3 thanks to AWS Open Data sponsorship.

Feedback very welcome. I'd be happy to answer any questions or assist people in getting started. We want Slacken to be as accessible as possible.

18.06.2025 02:38 👍 0 🔁 0 💬 0 📌 0

2) We show that dynamically tailoring a genomic reference library to the samples being classified greatly increases the fraction of species and strain level classifications (making them more specific) as well as improving Bracken quantification.

18.06.2025 02:38 👍 0 🔁 0 💬 1 📌 0

1) We introduce a new implementation of the Kraken 2 method on Apache Spark, which has comparable cost-performance when classifying multiple samples.

18.06.2025 02:38 👍 0 🔁 0 💬 1 📌 0

Precise and scalable metagenomic profiling with sample-tailored minimizer libraries
Abstract. Reference-based metagenomic profiling requires large genome libraries to maximize detection and minimize false positives. However, as libraries g

Excited to announce our paper "Precise and scalable metagenomic profiling with sample-tailored minimizer libraries". academic.oup.com/nargab/artic...

#metagenomics #kraken2 #slacken

18.06.2025 02:37 👍 0 🔁 0 💬 1 📌 0

Particularly with a focus on making the software accessible for people with no Spark experience.

14.04.2025 06:52 👍 0 🔁 0 💬 0 📌 0

Is there any preferred solution for packaging and shipping software based on Apache #Spark, other than Docker images? I found a site called spark-packages.org but that doesn't look like it's been updated for a long time. #bigdata #jvm

14.04.2025 06:48 👍 0 🔁 0 💬 1 📌 0

I can imagine that people who are entering into software development now might get the false impression that there's only accidental complexity and AI is our only hope to temper it. But you only get to understand simplicity by developing your own taste for it (by fighting complexity for long enough)

04.04.2025 06:54 👍 1 🔁 0 💬 0 📌 0

One challenge I think young people are facing is that you have to wade through so much accidental complexity before you start seeing the light. It was only in my mid-30s that I think I understood how to value simplicity and elegance. Before that, I often could not see the forest for the trees.

04.04.2025 06:52 👍 1 🔁 0 💬 1 📌 0

Why I stopped using AI code editors · Luciano Nooijen
In the past I used AI code editors for all of my programming, but I stopped using it and recommend others to consider this as well

“Why I stopped using AI code editors”

The article is spot on. I've gained my intuition & expertise, aka good taste in software engineering, by suffering through learning and taking care of the nitty-gritty while thinking of abstraction and reuse.

lucianonooijen.com/blog/why-i-s...

04.04.2025 06:11 👍 8 🔁 3 💬 2 📌 0

CDC datasets uploaded before January 28th, 2025 : Centers for Disease Control and Prevention : Free Download, Borrow, and Streaming : Internet Archive
An archive of all CDC datasets uploaded to https://data.cdc.gov/browse before January 28th, 2025. Excludes corrupt datasets and data not publicly accessible.

CDC datasets have been saved. But you can still help by seeding.

02.02.2025 04:38 👍 1251 🔁 632 💬 21 📌 30

Think of a number. My feed was recently clogged up with news articles reporting that Sam Altman thinks that AGI is here, or will be here next year, or whatever. I will refrain from giving even more air to this nonsen…

Number theorists: please get in touch. xenaproject.wordpress.com/2025/01/20/t...

20.01.2025 14:21 👍 28 🔁 11 💬 3 📌 6

This resolves an inherent conflict between scalability and precision in Kraken 2.

21.01.2025 05:45 👍 0 🔁 0 💬 0 📌 0

Precise and scalable metagenomic profiling with sample-tailored minimizer libraries
Reference-based metagenomic profiling requires large genome libraries to maximize detection and minimize false positives. However, as libraries grow, classification accuracy suffers, particularly in k...

Our new #metagenomics paper. What if the Kraken 2 library was specifically built for the samples being classified, every time? We show that this improves precision significantly while also being surprisingly cheap, using a new clone of Kraken 2 based on Apache Spark. www.biorxiv.org/content/10.1...

21.01.2025 02:17 👍 2 🔁 0 💬 1 📌 0

Precise and scalable metagenomic profiling with sample-tailored minimizer libraries https://www.biorxiv.org/content/10.1101/2024.12.22.629657v1

25.12.2024 16:34 👍 1 🔁 1 💬 0 📌 0
