Every Valentine’s Day I ponder the ontological, ontogenic, and phylogenetic basis of the metazoan primary pulsatile organ. Enjoy this old thread (originally posted on twitter many years ago, the import didn’t preserve the dates)
I agree, but it’s a big jump outside the comfort zone for some!
You can also find training material, how-to guides, links to tools, tips for making agentic curation of your knowledge base hallucination-resistant, etc. here: ai4curation.io/aidocs/
For staying up to date, I recommend joining the Monarch/OBO Academy (free) and following along with the excellent training material on all things semantics and AI. Find a link to our Slack here: obofoundry.org
The first part was 4 hours, and a mix of foundational basics and hands on activities (thanks to Jonah Cool of @anthropic.com for complimentary Pro accounts!). Slides + recordings here:
doi.org/10.5281/zeno...
If you have more time and are looking for more of a foundational introduction to genAI (with lots of bio examples, and no maths) we are running a training series for members of the @geneontology.bsky.social consortium.
By the way, codex is also great, and we'd love to try and incorporate more material on opencode and other tools, but we have limited time and resources, and more of us were familiar with CC, so we went with that.
For these kinds of workshops it can be a challenge getting everyone set up with both coding agent installation AND coordinating subscription access. We found GitHub Codespaces works great for this: everyone gets VS Code + Claude Code + skills directly in their browser! github.com/ai4curation/...
We also ran a workshop at ICBO last year, "Accelerating Ontology Curation with Agentic AI and GitHub", aimed at ontology developers, where we had a hands-on session using CC for live ontology editing. www.youtube.com/watch?v=_9Re...
A key message here is that coding agents are not just for code! Yes, most run in the terminal or VS Code, and they get a lot of their power from running command line tools. But you don't need to know anything about the command line! Coding agents can edit any kind of file (and run any kind of verifier)
Part 2 will be posted here shortly: oboacademy.github.io/obook/course...
As part of the @monarchinitiative.bsky.social /OBO Academy series, we had @christabone.bsky.social give us a two part introduction to "Efficient Biocuration and Bioinformatics with Claude Code". Part 1 (video and hands-on material) is here: oboacademy.github.io/obook/tutori...
What are those tools? I have been waiting for the agent harness that marries the power of a coding agent with a less intimidating UI. There are some great candidates: Goose (has CLI + UI), Claude Desktop, now Claude Co-work. But increasingly I'm recommending: go straight for a coding agent tool!
This week I participated in the excellent @biocurator.bsky.social virtual AI workshop. I presented some general tips for learning about agents. zenodo.org/records/1861... A lot of the advice comes down to: find time to learn+don't wait for the perfect curation tool, start using existing agent tools!
Over the last few months I've been helping organize various tutorials and workshops on agentic AI, aimed mostly at biocurators, ontology developers, and PIs of knowledge bases / data resources. Some of this might be generally useful to folks who don't identify as a 'technical' or an 'AI' person.🧵
To be fair, the main finding was the delta between LLMs alone and LLMs in the hands of users: “We identify user interactions as a challenge to the deployment of LLMs for medical advice”. Current models blow away 4o, and are likely more forgiving of inexperienced users, but I suspect the delta remains
Claude code in a loop plus some markdown files for skills and agents is all you need!
Ralph Wiggum from The Simpsons looking dopey, with text underneath him saying "I'm helping".
Last year we made a CLI wrapper for different deep research APIs. As a baseline implementation we do a simple Claude Code in a loop. It works rather well!
Well, I discovered there is a name for this pattern: Ralph. We made a Ralph Wiggum deep researcher. monarch-initiative.github.io/deep-researc...
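For anyone wondering what "an agent in a loop" looks like, here is a minimal sketch of the general pattern, not the code from our deep research wrapper. Everything named here (`ralph_loop`, `run_agent`, `is_done`, `fake_agent`) is hypothetical and made up for illustration. The idea is simply: hand the agent the same prompt over and over, let it leave its progress somewhere persistent, and stop once a completion check passes or an iteration budget runs out.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str], None],
    prompt: str,
    is_done: Callable[[], bool],
    max_iterations: int = 10,
) -> int:
    """Call the agent with the same prompt until is_done() or the budget is hit.

    Returns the number of iterations that were actually run.
    """
    for iteration in range(1, max_iterations + 1):
        run_agent(prompt)   # the agent does one round of work
        if is_done():       # progress lives outside the loop (files, notes, ...)
            return iteration
    return max_iterations


# Stand-in for a real coding agent: it just appends a note on each call.
# A real run_agent would invoke an agent tool and let it edit files on disk.
notes: list[str] = []


def fake_agent(prompt: str) -> None:
    notes.append(f"finding {len(notes) + 1} for: {prompt}")


iterations = ralph_loop(fake_agent, "Summarise the evidence", lambda: len(notes) >= 3)
print(iterations, len(notes))
```

The only design choice worth noting is that the loop itself keeps no memory: whatever the agent learns has to be written somewhere the next iteration can read it, which is where the markdown files come in.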
Ontologically, ontogenetically, and phylogenetically, yes
📣 New preprint from us at phagefoundry.org 📣
A solid machine learning framework to predict strain-level phage-host interactions across diverse bacterial genera from genome sequences alone. Avery Noonan from the Arkin Lab led this massive effort
www.biorxiv.org/content/10.1...
See the thread (from the original arXiv preprint) over on Mastodon: genomic.social/@Cmungall/11...
We developed and evaluated a method to learn python chemical structure classifiers using LLMs. These can give classifications+explanations at runtime. With @jannahastings.bsky.social @justaddcoffee.bsky.social Noel O'Boyle, Daniel Korn, Adnan Malik jcheminf.biomedcentral.com/articles/10....
A busy tool wall in a shed. At the bottom there are instructions saying "Find the 10 hidden enhancers!" Across the wall between the tools are 10 enhancers, represented as DNA helices, but they are difficult to find in the style of a "hidden object" puzzle. Original photo by Lachlan Donald, https://www.flickr.com/photos/lox/9408028555
Hiding in plain sight - how close are we to mapping ALL 🧬enhancers🧬 in the genome?
Our new paper by Mannion et al. takes a systematic look at "hidden enhancers" and why they remain so hard to find. With @mosterwalder.bsky.social, @jlopezrios.bsky.social & many more
www.nature.com/articles/s41...
Check out the pre-print here www.biorxiv.org/content/10.1.... Not sure if the other authors beyond @tkaraletsos.bsky.social are on bsky #CellBiology #AI #GeneSky #genomics #VirtualCell
One super pedantic minor ontological pet peeve is the use of the term "simulation", since that leads me to expect an agent-based or physics-style simulation of cell perturbations. But in fact this pattern could be used for those too! And I guess the terminological horse has long bolted here..
But of course rBio is very cool independent of my nerdy obsession with FMs using ontologies/KGs! This general distillation pattern is likely to be very useful for integrating knowledge with the weights in massive omics FMs..
For another use of ontologies in genomic foundation models, see the recent AlphaGenome paper bsky.app/profile/cmun...
Aside: I find that too many "defenses" of ontologies/KGs in the face of genAI fall back on a kind of GraphRAG use case, where the ontology/KG is used as some kind of bulwark against hallucination. Valid... but they can do so much more! Using one as a teacher in an RL loop on reasoner traces is v cool!
Table 1. Verifiers used during RL training and their descriptions, as well as example prompts. Verifiers: "Exp" is experimental; "MLP" is multi-layer perceptron; "TF" is Transcriptformer; "GO" is Gene Ontology.
To fine-tune the reasoner model, the authors used three kinds of soft verifiers in the RL loop: experimental (e.g. CRISPRi knockdown), "simulation" (e.g. Transcriptformer), and knowledge-based. For knowledge-based, they used GO @geneontology.bsky.social!
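To make "soft verifier" a bit more concrete, here is a toy sketch of how a knowledge-based one could work. This is my own illustration under assumptions, not the rBio implementation: I am assuming the verifier turns the overlap between two genes' ontology annotations into a score between 0 and 1 (Jaccard similarity here), and rewards the model by how well its yes/no answer agrees with that score. The function names and the placeholder term labels are hypothetical, not real GO identifiers.

```python
def jaccard(terms_a: set[str], terms_b: set[str]) -> float:
    """Overlap between two annotation sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not terms_a and not terms_b:
        return 0.0  # no annotations at all: treat as no evidence of relatedness
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def soft_reward(model_says_yes: bool, evidence: float) -> float:
    """Graded reward instead of a hard 0/1 check.

    A "yes" earns the evidence score itself; a "no" earns its complement.
    """
    return evidence if model_says_yes else 1.0 - evidence


# Placeholder annotation sets (made-up labels, NOT real GO terms).
gene_x_terms = {"term_1", "term_2", "term_3"}
gene_y_terms = {"term_2", "term_3", "term_4"}

evidence = jaccard(gene_x_terms, gene_y_terms)  # 2 shared / 4 total
print(evidence, soft_reward(True, evidence), soft_reward(False, 0.25))
```

The point of the "soft" part is that the reward is graded rather than pass/fail, so a noisy or incomplete knowledge source can still nudge the model without being treated as ground truth.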
Fig 7 from paper - The figure shows three example responses to the query “Is a knockdown of ISCA2 in RPE1 cells likely to result in differential expression of CEP295?”, each demonstrating different reasoning strategies. Basic answer: ISCA2 is linked to cell cycle progression and DNA repair, so its knockdown could affect CEP295 expression, though experimental data would be needed to confirm directionality. Chain-of-Thought: Provides background—ISCA2 is involved in cell cycle regulation; CEP295 in cilia formation. Knockdown of ISCA2 may influence cell cycle–related genes but there’s no direct evidence connecting it to CEP295 regulation. Self-aware Chain-of-Thought: Notes ISCA2’s role in autophagy and related processes, but emphasizes that its relationship to CEP295 is indirect. Suggests that literature review would be required for confirmation, and stresses the absence of direct experimental evidence, while acknowledging possible indirect effects. Overall, all answers converge on the idea that ISCA2 knockdown could plausibly influence CEP295 but highlight the uncertainty and need for direct experimental validation.
The applications of this are very interesting, allowing for interrogation in natural language, as well as background reasoning over the wealth of biology in the literature. So you can ask what happens to other genes if you knock down a gene in a cell type, and get a biological explanation