
Alan Murphy

@al-murphy

Postdoctoral Research Scientist, Koo lab at Cold Spring Harbor Laboratory | Deep Learning for genomics

739
Followers
610
Following
7
Posts
06.02.2024
Joined

Latest posts by Alan Murphy @al-murphy

Great work Žiga! Bit of a technical Q - "Predictions are adapted to the specific organism (human or mouse) by incorporating learned, organism-specific embeddings within these functions" - how were these embeddings learnt? During AlphaGenome training? Also, how are they incorporated?

17.07.2025 14:26 👍 0 🔁 0 💬 1 📌 0

Excited to launch our AlphaGenome API goo.gle/3ZPUeFX along with the preprint goo.gle/45AkUyc describing and evaluating our latest DNA sequence model powering the API. Looking forward to seeing how scientists use it! @googledeepmind

25.06.2025 14:29 👍 219 🔁 81 💬 5 📌 9

Just released tangermeme v0.5.0!

tangermeme implements "everything-but-the-model" for genomic ML. Essentially, train your model your way using your code-base (or load someone else's model), and tangermeme handles the discovery + design with it.

Try it out with `pip install tangermeme`.

11.06.2025 16:15 👍 22 🔁 2 💬 1 📌 1

We're thrilled to introduce PromoterAI - a tool for accurately identifying promoter variants that impact gene expression. 🧵 (1/)

29.05.2025 18:29 👍 60 🔁 28 💬 1 📌 2
Programmatic design and editing of cis-regulatory elements The development of modern genome editing tools has enabled researchers to make such edits with high precision but has left unsolved the problem of designing these edits. As a solution, we propose Ledi...

Our preprint on designing and editing cis-regulatory elements using Ledidi is out! Ledidi turns *any* ML model (or set of models) into a designer of edits to DNA sequences that induce desired characteristics.

Preprint: www.biorxiv.org/content/10.1...
GitHub: github.com/jmschrei/led...

24.04.2025 12:59 👍 115 🔁 37 💬 2 📌 3
Predicting cell type-specific epigenomic profiles accounting for distal genetic effects - Nature Communications Enformer Celltyping is a genomic deep learning model that predicts epigenetic signals in unseen cell types using distal DNA interactions and chromatin accessibility data. Here, authors show it general...

Completely agree these benchmarks are necessary! This is something we benchmarked against, and in some settings we only *just* beat it, even when using Enformer as a pretrained model to predict epigenetic signals across cell types www.nature.com/articles/s41...

12.02.2025 11:16 👍 1 🔁 0 💬 0 📌 0

[SAVE THE DATE] MLCB 2025 is happening Sept 10-11 at the NY Genome Center in NYC!

Attend the premier conference at the intersection of ML & Bio, share your research and make lasting connections!

Submission deadline: June 1
More details: mlcb.github.io

Help spread the word - please RT! #MLCB2025

05.02.2025 02:50 👍 41 🔁 27 💬 1 📌 4

Great work! Did you look into how well hashFrag scales with large input windows (approaching Enformer/Borzoi receptive fields)? I'm guessing the MPRA data used in the paper must be ~200 bps?

29.01.2025 11:41 👍 1 🔁 0 💬 1 📌 0
Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation - Nature Genetics Borzoi adapts the Enformer sequence-to-expression model to directly predict RNA-seq coverage, enabling the in-silico analysis of variant effects across multiple layers of gene regulation.

Super excited to announce our latest flagship model Borzoi: major props to Johannes & David Kelley et al for advancing it. It's been a long journey from our prior Enformer model into this one. A few innovations: i) longer DNA context, ii) adaptation to predict RNA-seq abundance and splice isoforms,

09.01.2025 03:08 👍 71 🔁 27 💬 2 📌 0

I'm hiring:

1. Research associate (wet-lab w/ phd) to generate mpra perturbation data

2. ML postdoc to build multimodal generative AI for DNA (eg diffusion and LLMs)

3. Bioinformatician (any level) to process and harmonize functional genomics data to train foundation models

DM me if interested!

23.12.2024 20:04 👍 24 🔁 16 💬 1 📌 2
Figure 1 from the preprint, showing a schematic of the rat rotenone exposure experiments and the assays H3K27ac ChIP-seq and RNA-seq. It also shows the top altered genes from the ChIP-seq analysis in the substantia nigra and cortex in volcano plots.


Just in time for the holidays, we are thrilled to give you the latest preprint from our lab:

Unique nigral and cortical pathways implicated by epigenomic and transcriptional analyses in a rotenone rat model of Parkinson's disease

doi.org/10.1101/2024...

21.12.2024 10:09 👍 21 🔁 5 💬 1 📌 0

In a big life update, I successfully defended my PhD thesis - massive thanks to my PI Nathan and assessors @steinaerts.bsky.social @proftomellis.bsky.social . Thrilled to share that I will be joining @pkoo562.bsky.social at CSHL in the new year for a post-doc improving genomic deep learning models!

20.12.2024 17:01 👍 13 🔁 2 💬 0 📌 2
Research Technician - Cold Spring Harbor Laboratory Post a job in 3min, or find thousands of job offers like this one at jobRxiv!

🚨 We’re hiring! 🚨

The Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory is looking for a Research Technician to join our team.

If you’re passionate about genomics, AI, and experimental science, we want to hear from you.

Help us spread the word!

jobrxiv.org/job/cold-spr...

17.12.2024 01:44 👍 8 🔁 8 💬 1 📌 0
Predicting gene expression from histone marks using chromatin deep learning models depends on histone mark function, regulatory distance and cellular states Abstract. To understand the complex relationship between histone mark activity and gene expression, recent advances have used in silico predictions based o

So excited that our work on predicting gene expression from histone modifications using deep learning is out in NAR today. Brilliant to work with lead author @al-murphy.bsky.social and collaborators Aydan Askarova, @borislenhard.bsky.social and Nathan Skene 🧬⭐️🙏
academic.oup.com/nar/advance-...

11.12.2024 17:02 👍 70 🔁 29 💬 3 📌 0
DART-Eval: A Comprehensive DNA Language Model Evaluation Benchmark on Regulatory DNA Recent advances in self-supervised models for natural language, vision, and protein sequences have inspired the development of large genomic DNA language models (DNALMs). These models aim to learn gen...

(1/10) Excited to announce our latest work! @arpita-s.bsky.social, @amanpatel100.bsky.social , and I will be presenting DART-Eval, a rigorous suite of evals for DNA Language Models on transcriptional regulatory DNA at #NeurIPS2024. Check it out! arxiv.org/abs/2412.05430

11.12.2024 02:30 👍 70 🔁 27 💬 1 📌 3

My goal is to understand the regulatory role of every nucleotide in the genome, and how this changes across every cell in the human body.

If you are interested in doing a Ph.D. with me at UMass Chan Medical (Genomics and Comp Bio Department), see the links below. Deadline is Dec 1st.

18.11.2024 18:22 👍 122 🔁 43 💬 6 📌 5

Massive thanks to all co-authors for their work on this: William Beardall, @marekrei.bsky.social, Mike Phuycharoen and Nathan Skene.

18.11.2024 08:57 👍 2 🔁 0 💬 0 📌 0
Enformer Celltyping’s predictions capture cell type-specific genetic enrichment for complex traits - (a) Heatmap of stratified LD score regression (s-LDSC)73 analysis for genetic variants associated with brain and immune diseases/traits and behavioural traits (sourced from associated GWAS) displayed as false discovery rate (FDR) value for significance of enrichment for ATAC-Seq chromatin accessibility signal, H3K27ac signal and Enformer Celltyping’s (EC) predictions of H3K27ac for microglia, neurons and oligodendrocytes (oligoden.). (b) -log10(FDR) genetic enrichment for the complex traits from s-LDSC. (c) Proportion of peaks in derived peak files used for s-LDSC analysis. The median, minima and maxima for the violin plots were Monocyte; 0.768, 0.380, 0.934, Neutrophil; 0.720, 0.475, 0.925 and T-Cell; 0.737, 0.159, 0.924.


A key finding was the current limitations of such models at genetic variant effect prediction - the same as others have found, like Ioannidis & Mostafavi labs. Despite this, Enformer Celltyping can also be used to study cell type-specific genetic enrichment of complex traits.

18.11.2024 08:57 👍 1 🔁 0 💬 1 📌 0
Predicting cell type-specific epigenomic profiles accounting for distal genetic effects - Nature Communications Enformer Celltyping is a genomic deep learning model that predicts epigenetic signals in unseen cell types using distal DNA interactions and chromatin accessibility data. Here, authors show it general...

Delighted to share that our work developing a genomic DNN, Enformer Celltyping, to accurately predict epigenetic signals in previously unseen cell types has now been published doi.org/10.1038/s414...

18.11.2024 08:57 👍 14 🔁 4 💬 1 📌 1
Predicting cell type-specific epigenomic profiles accounting for distal genetic effects - Nature Communications Enformer Celltyping is a genomic deep learning model that predicts epigenetic signals in unseen cell types using distal DNA interactions and chromatin accessibility data. Here, authors show it general...

Extending pretrained LM-inspired architectures for genome modeling and releasing a tool for predicting epigenetic signals while being cell type-agnostic. Happy to be a co-author on this excellent paper by @Al_Murphy_ , now in Nature Communications. www.nature.com/articles/s41...

18.11.2024 08:54 👍 3 🔁 1 💬 0 📌 0

Jessica Zhou (@zrcjessica) is a talented postdoc now on the job market looking for ML/data science industry positions in the NY area! If you have an open position reach out!

Please Repost to spread the word! 🙏🏻

16.11.2024 22:07 👍 17 🔁 6 💬 0 📌 0

🧬 Genomic DNNs can be trained to learn a lot of different aspects of gene regulation, but they're not perfect and we don't know which predictions are reliable and which ones aren't.

We introduce DEGU: Uncertainty-aware Genomic Deep Learning with Knowledge Distillation. 1/n

16.11.2024 16:14 👍 34 🔁 9 💬 1 📌 1
Perspective on recent developments and challenges in regulatory and systems genomics Predicting how genetic variation affects phenotypic outcomes at the organismal, cellular, and molecular levels requires deciphering the cis-regulatory code, the sequence rules by which non-coding regi...

Perspective on recent developments and challenges in regulatory and systems genomics
arxiv.org/abs/2411.04363

Great perspective piece from an all-star author list on the current state of regulatory genomics and the challenges ahead.

08.11.2024 18:22 👍 21 🔁 10 💬 1 📌 0