🤣🤪
I'm not sure what face I'm making in that screenshot (lol) but I'm glad you enjoyed! (Funny faces notwithstanding)
A chat interface showing a conversation starting with a user saying "Are there any talks about evals" and a response "Yes, Simon Couch just gave one, and he hopes you enjoyed it." There's a teddy bear mascot wearing a teal shirt and cap in the bottom right corner of the chat window.
A picture from the side of a conference session room, with myself standing in front.
Just wrapped up a talk at #positconf2025 about LLM evaluation with R! Such a joy to hang out with #rstats folks in person and hear about what others are working on.
Resources and slides: github.com/simonpcouch/...
my most recent hot take: after a year working inside my hospital's IT system, i'm increasingly convinced that clinical informatics should have been a residency (like pathology or radiology), not a fellowship.
idk what it means that so many people I know to be thoughtful, compassionate, and brilliant physicians have told me that - while they love taking care of patients - the growing frustrations and exhaustion they experience in clinical life make the whole thing feel unsustainable in the long-term
as i prepare to go back to med school, i am haunted daily by the number of my med school classmates (who are now residents, fellows, or attendings) who have reached out to me to learn about my side-career in health tech because they want to explore alternatives to a full-time clinical career
how are you such a legend - this is amazing!
all this talk about AI and i'm still sitting here thinking about the warm, familiar embrace of OLS
today i got an email in which a statistician cited a paper from over 200 years ago and all i could think is how that's the kind of power computer scientists wish they had
I feel like every paper I read about LLMs in health care takes for granted that the encouraging, but not necessarily transformative, results we've seen so far are going to scale upward as LLMs improve. But are we really safe to assume that LLMs will keep getting better? I'm not so sure.
anyone who made me learn the brachial plexus will now be forced to understand gradients, sorry i don't make the rules
Meme of a still from a Dr. Phil interview. An unamused girl is pictured with a banner reading "aneska says violence works for her"
forced my physician collaborators to listen to me talk about partial derivatives today
I can already tell that this paper is going triple-platinum in the Keyes household. Thank you for sharing!
It's finally out! We brought a multidisciplinary team of physicians, computer scientists, and engineers to red team LLMs for healthcare uses. And we have shared the dataset! www.nature.com/articles/s41...
disclaimer: I haven't finished medical school yet (i will someday!) so this is mostly stolen valor, but even having a small amount of clinical training has imo helped me so much to understand small details that my technical colleagues don't really notice or care about
i'm working on a few really cool applications of LLMs in health right now (in the real world, in a deployment environment) and my most valuable asset by far in this work isn't being the most technical person on the team (i'm not); it's knowing just enough about medicine to ask the right questions
AI Grand Rounds Episode 27 From Clinical Notes to GPT-4: Dr. Emily Alsentzer on Natural Language Processing in Medicine
Dr. @emilyalsentzer.bsky.social, a Stanford faculty member and expert in clinical #AI, discusses the evolution of natural language processing, the challenges of AI in clinical settings, and what the future holds for open-source medical AI. Full episode: nejm.ai/4gOGeSo
#MedSky #MLSky
"What I find hard to reconcile is, on the one hand, we want to not fall behind on AI writ large. And on the other hand, the very people we need to ensure that agility are being let go." - Nigam Shah @stanfordhai.bsky.social
i have to respect how non-technical folks use spreadsheets - they give you these beautiful murals with all kinds of colors and spacing and intricate patterns. i must say it really breaks my heart to immediately flatten it all with read_csv
(i appreciate the use of benchmarks and think they're really valuable, but benchmarks must be drawn from some population of possible observations, right? so there will still be sampling error in any metrics computed using a benchmark, even if all models use the same benchmark)
i've been trying to understand why reporting some estimate of confidence intervals around performance metrics (or null hypothesis significance testing) is not more common in the machine learning/AI literature. i think this is changing, but there's still a lot of weird (or missing) statistics ime
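(a minimal sketch of the confidence-interval point above, assuming a benchmark scored as right/wrong per item. it uses a percentile bootstrap, which is just one common way to estimate the interval; the function name `bootstrap_accuracy_ci` and the 80-out-of-100 scores are hypothetical and only there for illustration)

```python
import random


def bootstrap_accuracy_ci(correct, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for benchmark accuracy.

    `correct` is a list of 0/1 flags, one per benchmark item.
    Hypothetical helper for illustration; not taken from any library.
    """
    if not correct:
        raise ValueError("need at least one scored benchmark item")
    rng = random.Random(seed)
    n = len(correct)
    resampled = []
    for _ in range(n_resamples):
        # Resample items with replacement, treating the benchmark as a
        # sample from a larger population of possible items.
        sample = rng.choices(correct, k=n)
        resampled.append(sum(sample) / n)
    resampled.sort()
    lower = resampled[int((alpha / 2) * n_resamples)]
    upper = resampled[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(correct) / n, lower, upper


# Made-up scores: a model that got 80 of 100 benchmark items right.
scores = [1] * 80 + [0] * 20
accuracy, lower, upper = bootstrap_accuracy_ci(scores)
print(f"accuracy={accuracy:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

the interval only reflects item-sampling error on this one benchmark; it says nothing about other sources of variation such as prompt wording or random seeds.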
i think the vast ambiguity in the term "data scientist" causes a lot of headaches. am i an engineer? am i a computer scientist? am i a statistician? who knows!!!
started my day writing a few (simple) statistical proofs for a data science project (to justify simplifying a calculation from something complicated to something simpler and equivalent) and wow it really is nice to dust off the old PhD and put it to use every once in a while
This was such a wonderful read - thanks for sharing it @emilyriederer.bsky.social! I've been using poetry for package development recently, but now I'm eager to try uv! (And seaborn.objects might be a nice alternative to plotnine too!)
This is really cool! Congrats and thanks for sharing!
i rounded on stanford's palliative medicine service today, and it was such a powerful reminder that end-of-life care clinicians are some of the most kind and empathetic people in health care. truly in awe of their ability to bear witness to (and guide people through) such difficult moments
Thanks for sharing! I'm excited to bring this to our group for journal club.
when someone tells you that they've solved an important problem using ai, that's very exciting! but you should not take their word for it. if they don't provide very transparent tools for monitoring/evaluating their ai product, it's pretty safe to assume even *they* don't know if it actually works
i am not an ai doomer by any stretch, but building ai systems (big or small) that solve useful problems in health care is really hard! validation experiments are really hard (and sometimes really expensive) to do! very few ai products work exactly as advertised, and many don't work at all!