Data Science Weekly - Issue 642, by @DataSciNews open.substack.com/pub/datascie...
I’ve been using Claude Code to take care of scrappy data cleaning tasks for a while. These days though, I’m using Codex as my coding agent. Similar to what I did with Claude, I’ve been “fine-tuning” Codex CLI to work on a few different vaguely defined tasks like classification, voting, filtering, or ranking. The pattern in this post works surprisingly well when you have the following conditions: Loosely defined open-ended tasks. e.g., tagging tweets with a set of predefined labels, extracting structured information from a GitHub issue, … Powerful agentic capabilities. Doing the task requires something more than a simple llm call or PydanticAI script. e.g., using gh api CLI to get the number of stars of a repository. Structured outputs. You need a response in a certain shape! This is something codex exec can do that claude couldn’t and is really powerful. e.g., return exactly True or False and nothing else. Save money! Unlike llm or other tools/libraries that require an OPENAI_API_KEY, Codex can use your ChatGPT subscription, making things “free”.
I've been using this pattern to "specialize" Codex for vaguely defined tasks like classification, filtering, soft sorting, ...
davidgasquez.com/specializing...
Made more than 10,000 invocations so far (reusing my ChatGPT subscription) and am really happy with the pattern!
I tracked every keyword in 22 years of Cosyne abstracts to map how computational neuroscience evolved — from Bayesian brains to neural manifolds to LLMs — and where it's heading next.
so hard to *really* comprehend
In one week I'll be talking about tips for reproducible R code and why science would love you to try these tips on your own code too 🧪😍🌏
It's an online talk, so feel free to watch comfortably from your couch. Hope to see you there!
@sortee.bsky.social #rstats
events.humanitix.com/sortee-webin...
I knew it. This confirms what I knew all my life. I may have Aphantasia (I do ...) but I see colors exceptionally well.
www.keithcirkel.co.uk/whats-my-jnd...
I've never built anything for a decade professionally, but here we are! blog.marcua.net/2026/03/12/b...
But is it far enough away to start running in the opposite direction? (or at least try to get behind some heavy-duty stuff?)
98 million videos for a grapefruit video. (watched it twice).
Takeaway - the secret to life is caring more about something than anybody else.
In this month's blog post, we’ll explore how to add vector layers and a legend to a map with QGIS. Step by step here: www.miriam-lerma.com/blog/2026-03...
A new blog post! In which I discover that even Claude Code has its limits, certainly when it comes to replacing data engineers
👉🏻 rmoff.net/2026/03/11/c...
(There's also a companion post if you like poking around Claude session logs to see what it's up to: rmoff.net/2026/03/11/c...)
Figure shows the proportion of successful putts by distance (with the missing distances integrated out) and the putting probability by distance from geometrical model 2 (as presented in https://users.aalto.fi/~ave/casestudies/disc_putting/disc_putting.html), based on data for the top 33 PDGA MPO players.
I've made a geometrical model for disc golf putting with uncertainty in 2D angle and distance control.
Based on the model, the putting angle accuracies of top PDGA MPO and FPO players are about 1° and 1.4°, respectively. See more at users.aalto.fi/~ave/casestu...
Data Science Weekly - Issue 641, by @DataSciNews open.substack.com/pub/datascie...
Congratulations, Bruno!
I rounded up a few Claude Skills for #RStats users.
Huge thanks to the creators who developed them. They share Skills for everything from tidyverse code to brand.yml files to learning while using AI.
Hope the list is useful, and please let me know what I missed! 🧡
rworks.dev/posts/claude...
You can now visualize how a color palette distributes across OKHsv, OKHsl, OKLCh and CIELab, and compare 6 distance metrics side by side. Zero dependencies, raw WebGL2.
Took me 3 years to make something I wasn't too embarrassed to share 🙃
meodai.github.io/color-palett...
Data Science Weekly - Issue 640, by @DataSciNews open.substack.com/pub/datascie...
Data Science Weekly - Issue 639, by @DataSciNews open.substack.com/pub/datascie...
Data Science Weekly - Issue 638, by @DataSciNews open.substack.com/pub/datascie...
Simulated null distribution for data with a sample size of 100, difference in group means of 5, and a p-value of 0.142
Simulated null distribution of a slope of 0.8 and p-value of 0.002
Finally, we have to decide if the p-value meets an evidentiary standard or threshold that would provide us with enough evidence that we aren’t in the null world (or, in more statsy terms, enough evidence to reject the null hypothesis). There are lots of possible thresholds. By convention, most people use a threshold (often shortened to α) of 0.05, or 5%. But that’s not required! You could have a lower standard with an α of 0.1 (10%), or a higher standard with an α of 0.01 (1%). Statistically significant The p-value is < 0.001 and our threshold for α is 0.05 In a world where there is no relationship between x and y, the probability of seeing a slope of at least 0.901 is < 0.1% Since < 0.001 is less than 0.05, we have enough evidence to say that the slope is statistically significant.
Evidentiary standards When thinking about p-values and thresholds, I like to imagine myself as a judge or a member of a jury. Many legal systems around the world have formal evidentiary thresholds or standards of proof. If prosecutors provide evidence that meets a threshold (i.e. goes beyond a reasonable doubt, or shows evidence on a balance of probabilities), the judge or jury can rule guilty. If there’s not enough evidence to clear the standard or threshold, the judge or jury has to rule not guilty. With p-values: If the probability of seeing an effect or difference (or δ) in a null world is less than 5% (or whatever the threshold is), we rule it statistically significant and say that the difference does not fit in that world. We’re pretty confident that it’s not zero. If the p-value is larger than the threshold, we do not have enough evidence to claim that δ doesn’t come from a world where there’s no difference. We don’t know if it’s not zero. Importantly, if the difference is not significant, that does not mean that there is no difference. It just means that we can’t detect one if there is. If a prosecutor doesn’t provide sufficient evidence to clear a standard or threshold, it does not mean that the defendant didn’t do whatever they’re charged with†—it means that the judge or jury can’t detect guilt.
I just whipped up this little #QuartoPub site last week that demonstrates how I teach p-values/hyp-testing through simulation both with live OJS and with #rstats, and I think it's super neat! It has examples for diff-in-means, diff-in-props, and regression slopes nullworlds.andrewheiss.com #statsky
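For readers who want to try the "null world" idea from this post, here is a minimal Python sketch of a simulated null distribution for a difference in group means. It is not the code behind nullworlds.andrewheiss.com (that site uses OJS and R); the function names, the shuffling approach, the seed, and the made-up data are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_null_diffs(group_a, group_b, n_sims=5000, seed=1234):
    """Build a null world by shuffling group labels (hypothetical helper).

    If there is truly no difference between groups, the labels are
    interchangeable, so each shuffle gives one difference in means that
    could plausibly arise in a world with no effect.
    """
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(n_sims):
        rng.shuffle(pooled)
        diffs.append(
            statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        )
    return diffs


def p_value(observed, null_diffs):
    """Share of null-world differences at least as extreme as the observed one."""
    extreme = sum(1 for d in null_diffs if abs(d) >= abs(observed))
    return extreme / len(null_diffs)


def decide(p, alpha=0.05):
    """Compare the p-value against the evidentiary threshold alpha."""
    if p < alpha:
        return "statistically significant"
    return "not enough evidence to reject the null"


if __name__ == "__main__":
    # Made-up data for illustration only: two groups of 50 observations.
    rng = random.Random(42)
    group_a = [rng.gauss(50, 10) for _ in range(50)]
    group_b = [rng.gauss(55, 10) for _ in range(50)]

    observed = statistics.fmean(group_a) - statistics.fmean(group_b)
    null_diffs = simulate_null_diffs(group_a, group_b)
    p = p_value(observed, null_diffs)
    print(f"observed difference: {observed:.2f}, p-value: {p:.3f}, {decide(p)}")
```

With the default α of 0.05, `decide(0.142)` reports that there is not enough evidence to reject the null, while `decide(0.002)` reports a statistically significant result, which matches the two figure captions above.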
Malaysia’s R community is growing! 🇲🇾 From a small network into a platform that actively connects students, researchers, and industry practitioners
r-consortium.org/posts/bringi...
#rstats #opensource #datascience #Malaysia #Shiny #tidyverse #community #analytics
A screenshot of an interactive map of all medals won so far at the 2026 Winter Olympics by place of birth of the winners
Winter Olympics 2026 medalists by place of birth: an interactive map I built (again) thanks to @wikipedia, @wikidata and #rstats.
Check out the interactive version of the map: https://giocomai.github.io/olympics2026nuts/medalists_map.html
📈 A while back I did promise to put a post together on how I generate publication-ready figures. Before I am off to #CNY2026, I finally found the time. May this be useful to some...
Also I am curious to hear what other tricks are out there.
jaquent.github.io/2026/02/crea...
#rstats #ggplot #dataviz
A hexagon R package logo, with the package name “whistledown” in a calligraphy-style font. A silhouette of a quill representing Lady Whistledown’s letters, and three bees for the Bridgerton family crest.
Dearest Gentle Reader,
I’m happy to announce the release of my new R package, “whistledown”, with color palettes from the hit show #Bridgerton!
#RStats #ggplot
I’m thrilled to introduce flownet (sebkrantz.github.io/flownet/), a new R package for transport modeling, supporting stochastic or deterministic traffic assignment to large networks, and powerful tools for (multimodal) network processing/simplification: sebkrantz.github.io/Rblog/2026/0... #Rstats
Data Science Weekly - Issue 637, by @DataSciNews open.substack.com/pub/datascie...
On a positive note, here's a new blog post highlighting some polyglot data science tools in R and Python that I've enjoyed lately
#rstats #pydata
www.practicalsignificance.com/posts/favori...
The first data science book that has a chapter on monads reproducible-data-science.dev
Learn how to build robust #DataScience pipelines with #RStats, #Python, #Julia and #Nix!
Are you a #Stata user? Maybe you work with one?
Have you ever found yourself copy-pasting from the results window?
It's annoying as hell! And terrible practice. So I wrote a blog post on using #rstats to extract results from Stata log files
benharrap.com/post/2026-02...
On this page What’s the difference between statistical significance and substantial significance? Can we measure substantial significance with statistics? What are all the different ways we can look at model coefficients? Print the object name Use summary() Use tidy() from the {broom} package Use model_parameters() and model_details() from the {parameters} and {performance} packages Make nice polished side-by-side regression tables with {modelsummary} Make automatic coefficient plots with modelplot() from {modelsummary} Plot model predictions and marginal effects Automatic interpretation with {report}
Posted a helpful little set of FAQs about regression for my causal inference class, including illustrations of statistical vs. substantive significance and all the different things you can do with #rstats model objects
evalsp26.classes.andrewheiss.com/news/2026-02...