Ivar Grytten

@ivargrytten

Bioinformatics, Python

16 Followers · 13 Following · 7 Posts · Joined 27.09.2023

Latest posts by Ivar Grytten @ivargrytten

Preview
GitHub - bionumpy/bionumpy: Python library for array programming on biological datasets. Documentation available at: https://bionumpy.github.io/bionumpy/

Some shameless BioNumPy (github.com/bionumpy/bio...) advertisement at the end: This project would not have been possible without it. I love how simple it is now to just read VCFs, process genotype matrices, read FASTQ files, compute kmers, etc., which has enabled fast prototyping and experimentation.

25.12.2023 13:48 👍 0 🔁 0 💬 0 📌 0
Post image

We can maybe forget about high read coverage. There is almost no accuracy gain in going from 5x to 30x coverage. This might be because imputation is such a big part of the prediction model, meaning that 5x is more than enough to guide the model in the right direction.

25.12.2023 13:48 👍 0 🔁 0 💬 1 📌 0
Post image

Pangenome size matters -> we should as a community invest in making larger pangenomes. This is maybe somewhat obvious, but nice to get it confirmed. X-axis is number of individuals in pangenome.

25.12.2023 13:47 👍 0 🔁 0 💬 1 📌 0
Post image

SNPs/indels are important when genotyping SVs. Our experiments show that SV genotyping accuracy drastically increases when we add more SNPs/indels to the pangenome. The x-axis in the plot below is allele frequency - SNPs/indels with freq lower than x-axis value are filtered away.

25.12.2023 13:46 👍 1 🔁 0 💬 1 📌 0

We were surprised by how good GLIMPSE is at imputing SVs! We ended up simply relying on GLIMPSE in KAGE2, rather than using our own imputation model. Really appreciate those rare moments when existing bioinformatics tools actually work seamlessly together to produce good results.

25.12.2023 13:45 👍 0 🔁 0 💬 1 📌 0
Post image

Genotyping SVs from reads alone yields much lower accuracy than when combined with imputation. Even KAGE/PanGenie with very few reads (0.5x) perform much better than e.g. BayesTyper (30x coverage), which does not do imputation.

25.12.2023 13:45 👍 0 🔁 0 💬 1 📌 0
Preview
KAGE 2: Fast and accurate genotyping of structural variation using pangenomes bioRxiv - the preprint server for biology, operated by Cold Spring Harbor Laboratory, a research and educational institution

KAGE2 is out! Enables very fast and accurate genotyping of structural variants using pangenomes: www.biorxiv.org/content/10.1.... I've spent the last 6+ months going deep into the SV rabbit hole, and ran into some surprises I thought were worth sharing (1/6)

25.12.2023 13:44 👍 1 🔁 0 💬 1 📌 1
Post image

🎉 Happy to share my new preprint in which we present LIgO, a powerful tool to simulate adaptive immune receptor (AIR) and repertoire (AIRR) data for the development and benchmarking of AIRR-based ML

www.biorxiv.org/content/10.1...

Try LIgO now! 🚀
github.com/uio-bmi/ligo

24.10.2023 11:59 👍 9 🔁 3 💬 0 📌 1