Dylan Pieper (@dylanpieper)

I forgot to mention that you can now import your REDCap audit logs. 🤓 It defaults to the past 7 days.

01.12.2025 21:07 👍 0 🔁 0 💬 0 📌 0

Transfer REDCap Data to Database Transfer REDCap (Research Electronic Data Capture) data to a database, specifically optimized for DuckDB. Processes data in chunks to handle large datasets without exceeding available memory. Features...

🦆 📦 redquack 0.3.0 now includes a suite of convenience functions, and more importantly, full control over labeling variables and coded values without separate imports.

This makes redquack a great tool for large or small REDCap projects.

➡️ dylanpieper.github.io/redquack/

#rstats #redcap

01.12.2025 20:58 👍 5 🔁 0 💬 1 📌 0

**course material for adults who swear**

15.08.2025 19:05 👍 0 🔁 0 💬 0 📌 0

Stretching DuckDB w/ Common Crawl, ~1.7B rows, ~300 parquet files. ~2-3s for single-column aggregations, ~2-3 mins to SUMMARIZE the data, peaking at ~12-14GB memory usage. Not exactly real-time, but the fact you can do this on a laptop with no server setups or Spark pipelines is still amazing.

15.08.2025 03:10 👍 44 🔁 9 💬 1 📌 1

ALT: a colorful background with the words "the more you know" and a star

A little known fact is that RStudio rendering is powered by users’ electromagnetic fields (i.e., “good vibes”) and the exodus to Positron has severely limited its ability to compile code. #rstats

05.08.2025 11:48 👍 9 🔁 1 💬 1 📌 0

Remember this #rstats post? I wasn't the only one talking about it & the tidyverse team was listening 😎 #databs

New #dplyr functions? They're looking for feedback!!
🤔 replace_when, recode_values, replace_values

👀 Read this:
github.com/tidyverse/ti...

🗣️ Comment on PR:
github.com/tidyverse/ti...

04.08.2025 17:30 👍 62 🔁 15 💬 2 📌 2

We Need to Talk About Pedocon Theory The connection between Donald Trump and Jeffrey Epstein is no accident, but reveals a deep logic at the heart of reactionary politics.

I think pedocon theory is right. It’s empirically adequate, parsimonious, fits within a broader theoretical framework, and has immense explanatory breadth and depth www.liberalcurrents.com/we-need-to-t...

29.07.2025 11:10 👍 73 🔁 27 💬 2 📌 1

I am such a sucker for frivolous uses of AI. Here's an anthem for the tidyverse: suno.com/s/iVMVs4IoyA...

11.07.2025 20:35 👍 8 🔁 1 💬 2 📌 1

Modules + Claude Code for simple but labor-intensive edits across files

12.07.2025 11:28 👍 0 🔁 0 💬 0 📌 0

Very cool to see authors of this article mentioning the importance of sharing project-, data-, AND variable-level documentation alongside data in a repository, and linking to the templates I've provided on OSF as an example! 🌟

doi.org/10.1515/ling...

08.07.2025 19:05 👍 28 🔁 8 💬 2 📌 0

Tbh I relate to that big yellow spike with mad uncertainty around age 30. 🤣

08.07.2025 12:27 👍 1 🔁 0 💬 0 📌 0

As a data manager, good documentation not only helps me do my job better, but also helps me annoy you less! 😅

Good documentation about inclusion criteria, READMEs about oddities in the data, consort diagrams and tracking to explain missing data, and so on, are all ways to ensure I bug you less! 🐛🐜🐝

27.06.2025 17:46 👍 22 🔁 2 💬 0 📌 1

I think if you’re curious and truly care about problem solving you might have a ~temporary~ feeling of closure or a premature commit. But you will keep iterating (opening/closing) as you explore the problem space and how it works, validate the throughput, and improve the methods. Stay curious!

19.06.2025 13:57 👍 2 🔁 0 💬 0 📌 0

Pitfalls of premature closure with LLM assisted coding When LLMs generate clean, professional-looking code, it's tempting to stop exploring alternatives. But therein lie the risks that come with premature closure. So what is premature closure?

New to me is the term "premature closure", where you too quickly latch on to the first solution you see. Always a danger in coding, but particularly so today when LLMs can give you a plausible fix so so quickly.

www.shayon.dev/post/2025/16...

18.06.2025 14:17 👍 98 🔁 14 💬 7 📌 4

I would use cosine in stringdist. If you have lists of job descriptions (from two sources with each idx being a similar job), you can use my package samesies. dylanpieper.github.io/samesies/

19.06.2025 11:11 👍 3 🔁 0 💬 1 📌 0
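For readers wondering what "cosine" means for strings in the reply above: the cosine method compares the q-gram count profiles of two strings and returns one minus the cosine of the angle between them. Below is a minimal base R sketch of that idea, assuming q = 2. The helper names `qgrams()` and `cosine_qgram_dist()` are hypothetical and written for illustration only; they are not the stringdist or samesies implementation.

```r
# Split a string into overlapping q-grams (hypothetical helper, not from stringdist)
qgrams <- function(s, q = 2) {
  n <- nchar(s)
  if (n < q) return(character(0))
  substring(s, 1:(n - q + 1), q:n)
}

# Cosine distance between the q-gram count vectors of two strings:
# 0 means identical q-gram profiles, 1 means no q-grams in common
cosine_qgram_dist <- function(a, b, q = 2) {
  ga <- qgrams(a, q)
  gb <- qgrams(b, q)
  keys <- union(ga, gb)
  # Count how often each q-gram occurs in each string
  va <- vapply(keys, function(k) sum(ga == k), numeric(1))
  vb <- vapply(keys, function(k) sum(gb == k), numeric(1))
  1 - sum(va * vb) / sqrt(sum(va^2) * sum(vb^2))
}

# Compare two job titles (made-up example strings)
cosine_qgram_dist("data manager", "data engineer")
```

In practice the package function is likely the better choice, since it is vectorized and handles edge cases such as strings shorter than q, which this sketch does not.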

So cool! Any intros or docs planned for helping people familiar with the futureverse make the leap to mirai?

14.06.2025 10:56 👍 0 🔁 0 💬 0 📌 0

Functional Programming Tools A complete and consistent functional programming toolkit for R.

Bleeding edge update for the #tidyverse purrr package with even more seamless #rstats parallel maps.

Introducing our shiniest new adverb: `in_parallel()`. Just wrap your function to take advantage of blazing fast parallel processing via mirai.

pak::pak("tidyverse/purrr")

purrr.tidyverse.org/dev/

13.06.2025 15:32 👍 103 🔁 32 💬 6 📌 1

Prior Predictive Checks with marginaleffects and brms – Vincent Arel-Bundock

One cool thing you can/should do is sample from priors only, and plot the distribution of the actual quantity of interest (ex: risk ratio). I find this very useful. This is actually super easy with brms. arelbundock.com/posts/margin...

12.06.2025 21:52 👍 19 🔁 1 💬 1 📌 0

Engineers Shouldn’t Write ETL: A Guide to Building a High Functioning Data Science Department | Stitch Fix Technology – Multithreaded “What is the relationship like between your team and the data scientists?” This is, without a doubt, the question I’m most frequently asked when conducting i...

This blog post about engineering not doing ETL is nine years old… it’s worth reviewing

multithreaded.stitchfix.com/blog/2016/03...

11.06.2025 01:57 👍 19 🔁 4 💬 3 📌 0

The worst is when you write in active voice and then someone tries to edit all of it back into passive. Old habits die hard and the good fight continues.

04.06.2025 12:41 👍 1 🔁 0 💬 1 📌 0

You could use surveydown and provide the LLM with the package docs

31.05.2025 13:21 👍 1 🔁 0 💬 1 📌 0

screenshot of a code editor showing the following R code:

    library(quantmod)
    library(ggplot2)
    library(lubridate)

    startYear <- 2015
    startDate <- paste0(startYear, '-01-01')
    getSymbols(c('spy', 'btc-usd'), from = startDate)

    # function factory that creates a scale function that only shows valid years.
    # try to keep code that could change in here!
    make_valid_year_scale_function <- function(start_year){
      function(){
        list(
          scale_x_continuous(breaks = seq(start_year, Sys.Date() |> year(), 1)),
          theme(panel.grid.minor.x = element_blank()) # use function after other theme funcs
        )
      }
    }

    # this makes it so I can add scale_x_valid_years() to any plot
    scale_x_valid_years <- make_valid_year_scale_function(startYear)

Here's a functional programming trick for #rstats that I wish I started using sooner:

if you need a #ggplot2 scale to be reusable across multiple plots and dynamically configurable without relying on global state, consider using a function factory (a function that returns a function) to build it

29.05.2025 23:36 👍 35 🔁 5 💬 5 📌 0
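The function-factory idea in the post above can be seen in isolation with a base R sketch that needs no plotting packages. It only shows the closure mechanics: the returned function remembers `start_year` without reading any global variable. The names `make_year_breaks`, `year_breaks_2015`, and `year_breaks_2020` are hypothetical, and the `end_year` argument is an assumption added so the result does not depend on today's date.

```r
# Function factory: returns a function that remembers `start_year`
# (all names here are hypothetical, chosen for illustration)
make_year_breaks <- function(start_year) {
  force(start_year)  # evaluate now, so the closure holds the value passed in
  function(end_year = as.integer(format(Sys.Date(), "%Y"))) {
    seq(start_year, end_year, by = 1)
  }
}

# Two independent break generators, each with its own captured start year
year_breaks_2015 <- make_year_breaks(2015)
year_breaks_2020 <- make_year_breaks(2020)

year_breaks_2015(2018)  # years 2015 through 2018
year_breaks_2020(2022)  # years 2020 through 2022
```

Because each generated function carries its own `start_year`, changing or removing a variable in the calling script later does not affect functions that were already created.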

shikokuchuo{net}: mirai 2.3.0 Advancing Async Computing in R

mirai - minimalist async framework for #RStats - released as an 'r-lib' package.

Blog post: Advancing Async Computing in R.
shikokuchuo.net/posts/26-mir...

mirai provides event-driven async for #RShiny and parallel processing for purrr #tidyverse.

Really excited to be working on this at Posit!

23.05.2025 14:11 👍 63 🔁 18 💬 0 📌 0

Restoring Gold Standard Science By the authority vested in me as President by the Constitution and the laws of the United States of America, including section 7301 of title 5, United

tl;dr — this EO co-opts the language of open science to implement a system of political control wherein presidential appointees are given broad latitude to designate any number of reasonable scientific activities and inferences as scientific misconduct, and to penalize those involved accordingly.

24.05.2025 21:27 👍 2469 🔁 1059 💬 101 📌 106

There's so much polarization around LLMs. They are way overhyped, I agree. But I also use them semi-regularly now.

Here's a thread of genuine use cases where I find them helpful. Please add your own!

20.05.2025 19:51 👍 91 🔁 23 💬 7 📌 13

Introducing {shinyfa}: Analyze Large Shiny App Codebases Faster with This R Package | Daly Analytics Discover {shinyfa}, a new R package designed to improve developer experience by analyzing and summarizing the structure of large Shiny applications. Perfect for consultants, teams, and contributors wo...

📦 I’m excited to share a new #rstats package I’ve been working on: {shinyfa} built to help folks working on large or unfamiliar #rshiny apps ✨

The package scans your app folders and extracts details on render*(), reactive(), and input$ into a data frame!

📖 www.dalyanalytics.com/blog/shinyfa...

19.05.2025 13:47 👍 12 🔁 2 💬 1 📌 0

Playing around with satellite imagery of #madison to make some office art. #Rstats

18.05.2025 13:24 👍 5 🔁 1 💬 0 📌 0

✨Use LLMs from #rstats with ellmer ✨Version 0.2.0 is on CRAN now. No blog post yet because I'm about to go on vacation, but in the meantime you can check out the release notes: github.com/tidyverse/el....

18.05.2025 14:13 👍 67 🔁 13 💬 3 📌 0

The kind of Friday morning content I needed to see. ❤️

16.05.2025 11:35 👍 14 🔁 1 💬 0 📌 0

Text: posit conf 2025 Virtual Tickets Available, Atlanta, September 16-18. A drawing outline of the Atlanta skyline and abstract cubes.

Registration for the posit::conf(2025) virtual experience is now open!

Join us virtually, Sept 16–18, and access live-streamed keynotes and 100+ talks, on-demand recordings, Q&A sessions, and our virtual networking platform.

Learn more in the blog post: posit.co/blog/posit-c...

#RStats #Python

15.05.2025 14:59 👍 19 🔁 15 💬 1 📌 2
