Martin Modrák (@modrakm)

To my health researcher colleagues: when you're looking at routinely collected data (over many years) and you're interested in the relationship between two kinds of events (say, diagnosis for diseases A and B), how do you best deal with time gaps (e.g. people moving)? Obviously these introduce time bias >

06.03.2026 09:24 👍 8 🔁 4 💬 2 📌 0

I am quite suspicious of the popularity of post-data multiverse analysis. If we can have all those specifications identified, why not run corresponding simulations ahead of data collection to pressure-test the experimental design & analytical procedures so we can perform the most informative study?

05.03.2026 16:00 👍 14 🔁 3 💬 2 📌 0

Alt: "That's bait" meme GIF

04.03.2026 08:48 👍 1 🔁 0 💬 0 📌 0

1906: "Books have become the modern narcotic." Novels were compared to opium, alcohol and cigarettes and parents wanted restrictions on children being able to read them.

03.03.2026 17:20 👍 283 🔁 38 💬 1 📌 8

Books about jazz were blamed for making children depressed and causing suicide and brain rot ("lop-sided brains"); they wanted to ban children under 16 from reading them.

03.03.2026 17:16 👍 318 🔁 44 💬 1 📌 3

1908: the Lancet, one of the most respected scientific journals, calls for an age limit of 18 on reading in bed amidst a moral panic surrounding children becoming "addicted" to novels, which were "designed to keep kids hooked" and destroy their attention/mental health

03.03.2026 17:13 👍 2394 🔁 863 💬 3 📌 148

Ha! The original Lancet article on the dangers of reading in bed is here: doi.org/10.1016/S014...

03.03.2026 18:18 👍 127 🔁 67 💬 6 📌 11

I'd argue that this is a different case: the Clopper-Pearson interval does not fail to meet its coverage guarantees, it is "just" conservative. In my example, there is a strong failure: coverage of 95% freq CIs of glm.nb is sometimes lower than 80%.

03.03.2026 12:02 👍 2 🔁 0 💬 0 📌 0
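
The "conservative, but never below nominal" behaviour of Clopper-Pearson mentioned in the post above can be checked exactly, without simulation. A minimal sketch in base R, assuming an illustrative sample size (n = 20) and an arbitrary grid of true proportions; the helper name `cp_coverage` is made up for this example and is not from the linked blog post:

```r
# Exact coverage of the 95% Clopper-Pearson interval for a binomial proportion.
# n = 20 and the grid of true proportions are illustrative assumptions.
cp_coverage <- function(p, n, level = 0.95) {
  x <- 0:n
  # binom.test() returns the Clopper-Pearson interval for each possible count
  covers <- vapply(x, function(k) {
    ci <- binom.test(k, n, conf.level = level)$conf.int
    ci[1] <= p && p <= ci[2]
  }, logical(1))
  # Weight each possible outcome by its probability under the true p
  sum(dbinom(x, n, p)[covers])
}

p_grid <- c(0.05, 0.1, 0.3, 0.5)
coverage <- vapply(p_grid, cp_coverage, numeric(1), n = 20)
print(round(coverage, 3))
```

Every entry should come out at or above 0.95, which is the sense in which the interval is conservative rather than failing. The glm.nb case discussed in the post is different in kind and is not reproduced here.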

The fallacy of placing confidence in confidence intervals - Psychonomic Bulletin & Review
Interval estimates – estimates of parameters that include an allowance for sampling uncertainty – have long been touted as a key component of statistical analyses. There are several kinds of interval ...

A completely different and highly illuminating example is the submarine problem - I like the take on it by @richarddmorey.bsky.social in link.springer.com/article/10.3...

03.03.2026 11:57 👍 7 🔁 2 💬 2 📌 0

Using Bayesian tools to be a better frequentist

I have an example where Bayesian computation gives better frequentist properties than the freq. solution: www.martinmodrak.cz/2025/07/09/u...
(This is a simple example of a more general thing: common methods of Bayesian computation tend to also work in small samples, but freq may not)

03.03.2026 11:44 👍 12 🔁 1 💬 2 📌 0

Should everyone be taking statins?

New episode of HARD DRUGS!

Statins have revolutionised heart disease and they're one of many reasons for the long-term decline in cardiovascular mortality.

27.02.2026 20:20 👍 53 🔁 8 💬 4 📌 6

All of the longitudinal techniques apply (e.g. AR/MA stuff). Using correlated random effects seems close to a random walk prior on the process (though it's late here and I am not sure I have it 100% straight) and so makes some sense, but being explicit about time sounds preferable to me. 2/2

01.03.2026 20:05 👍 1 🔁 0 💬 0 📌 0

Sorry, misunderstood. Modelling separately could still make sense (if your question is roughly "is there a longer term change that is not explained just by the short-term change", I'd bet causal people have some **ATE shortcut name for this). Otherwise, you have short longitudinal data. 1/2

01.03.2026 20:05 👍 0 🔁 0 💬 1 📌 0

#rstats #statssky #brms
To analyse ordinal data collected in two conditions (retrospective baseline vs current) and two groups (control v treatment) can I use a multivariate latent model?

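library(brms)

# Joint (multivariate) cumulative-logit model; the shared "p" ID correlates
# the per-participant intercepts of the pre and post responses
fit_joint <- brm(
  mvbind(pre, post) ~ group + (1 | p | id),
  data = dat,
  family = cumulative("logit")
)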

01.03.2026 11:44 👍 9 🔁 3 💬 3 📌 0

Controlling for baseline in logistic regression (binomial outcome)
I have a binomial outcome measured pre and post intervention (number of successes in task, number of trials is constant). I’d like to control for the baseline in my model. It seems to me that using ...

I've had similar thoughts when analyzing binomial data, some relevant discussion: discourse.datamethods.org/t/controllin...
I'd think that rather than a multivariate model, a measurement error model makes a bit more sense (though brms won't support ordinal measurement error out of the box).

01.03.2026 14:56 👍 1 🔁 0 💬 1 📌 0

I've had a pretty good experience with Plausible as a replacement for GA. I don't track their development too closely, but I'd expect them to be decent at not counting bots while less likely to be blocked by users, as they track without cookies and are very privacy friendly... But YMMV.

28.02.2026 06:34 👍 1 🔁 0 💬 1 📌 0

Thinking of moving from MCMC to INLA, apparently I have nothing to lose but my chains.

23.02.2026 23:01 👍 7 🔁 1 💬 1 📌 2

Can anyone fix science? Science has always been in crisis. This is fine.

Part one of a new blog series: using the discovery of vitamins as a parable for why replication crises in science are actually good.

24.02.2026 15:29 👍 33 🔁 7 💬 3 📌 5

I will totally eat a grapefruit. Though I admit that I prefer to be a grapefruit vampire most of the time. Though I just ate a bit of cantaloupe and it seemed quite OK to me, so maybe it's me, I'm the problem?

23.02.2026 13:39 👍 1 🔁 0 💬 0 📌 0

Using Bayesian tools to be a better frequentist

Also good to remember that confidence intervals are almost always approximations that hold only in the infinite-data limit, and you can get better frequentist behavior from Bayesian MCMC in finite samples. www.martinmodrak.cz/2025/07/09/u...

20.02.2026 08:04 👍 3 🔁 0 💬 0 📌 0

Do y'all know of any RCTs (or other randomized experiments) with an interesting nominal* outcome?

*In this case, I'm not including binary or ordinal outcomes as "nominal." Rather, here I'm looking for 3+ unordered categories.

19.02.2026 19:22 👍 7 🔁 4 💬 5 📌 0

I just did the dumbest thing of my entire career to prove a much more serious point.

I tricked ChatGPT and Google, and made them tell other users I’m a competitive hot-dog-eating world champion

People are using this trick on a massive scale to make AI tell you lies. I’ll explain how I did it

18.02.2026 16:37 👍 4817 🔁 2131 💬 86 📌 298

Devezer's Urn LLMs make metascience easier, but that doesn't increase metascientific validity.

LLMs make statistical metascience easier. LLMs don't increase the validity of statistical metascience. www.argmin.net/p/devezers-urn

18.02.2026 15:28 👍 51 🔁 10 💬 1 📌 5

StanCon 2026 Oral presentation deadline (25th of February) Hi all, Just a quick reminder for StanCon 2026 (Uppsala, 17–21 August): if you’re planning to give an oral presentation, the submission deadline is 25 February (AoE). If you have a talk in mind, plea...

🔥🔥 StanCon 2026 Oral Presentation Submission Deadline on the 25th of February 🔥🔥

discourse.mc-stan.org/t/stancon-20...

18.02.2026 18:23 👍 6 🔁 4 💬 1 📌 0

Totes agree, I think the point is that clutter (and other things) will tend to make the benefits of transparency as a science-wide policy lower than what you would expect if you just look at the benefits science gained from early adopters of the practice.

18.02.2026 08:06 👍 1 🔁 0 💬 0 📌 0

Risk compensation - Wikipedia

To me the new angle is the reminder that the average effect of any improvement in methods will likely diminish over time as its use spreads out from thoughtful/careful core users to the whole population of scientists. Kind of like en.wikipedia.org/wiki/Risk_co... . Maybe it's not new to others.

18.02.2026 08:04 👍 2 🔁 0 💬 0 📌 0

Making research data and other intermediate research outputs more freely available should accelerate the production of knowledge and improve quality through error detection (both of which are demonstrably at least partly true). But now we are seeing readily available data assets, such as GWAS summary statistics and observational cohort study data from the UK Biobank, NHANES, and many other databases, being weaponized to generate manuscripts with minimal scholarly value and utility (or even negative utility given the burden they place on the system). In many ways, the story of MR recapitulated what had happened previously with meta-analysis. The aggregation of existing data from combinable randomized trials led to many important clinical and public health advances, and to less research waste and better science going forward. However, the combination of using data that could be extracted from existing publications with relatively simple statistical analyses led to an explosion of publications of decreasing and then negative scientific value. Thus, despite best intentions, efforts to improve research rigor, such as by making data available, synthesizing evidence, and improving causal inference methods, have become commodified by authors and paper mills to increase the marketability of a manuscript, rather than to improve the quality of scholarship.

Shots fired. Probably relevant to the discussion we had with
@devezer.bsky.social and @joachim.cidlab.com about transparency vs. clutter.
journals.plos.org/plosbiology/...

17.02.2026 10:53 👍 11 🔁 1 💬 2 📌 0

Problem about the loneliness epidemic is, it's everywhere except in representative survey data. Let's look at where the claim comes from. 1/

17.02.2026 07:13 👍 596 🔁 226 💬 21 📌 35

I'd say you demand that you get the correct coverage of posterior intervals averaged over the prior... (and typically you also get good frequentist coverage for all values with decent prior prob, but that’s not guaranteed).

16.02.2026 19:43 👍 0 🔁 0 💬 0 📌 0
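
The distinction in the post above (coverage averaged over the prior vs. frequentist coverage at a fixed true value) can be illustrated with a conjugate Beta-binomial model. A rough sketch in base R; the Beta(2, 2) prior, n = 20, the number of simulations and the helper name `freq_coverage` are all assumptions chosen for illustration:

```r
set.seed(1)
n <- 20; a <- 2; b <- 2   # illustrative sample size and Beta(a, b) prior
n_sims <- 20000

# Coverage averaged over the prior: draw the truth from the prior,
# simulate data, and check the central 95% posterior interval
p_true <- rbeta(n_sims, a, b)
x <- rbinom(n_sims, n, p_true)
lower <- qbeta(0.025, a + x, b + n - x)
upper <- qbeta(0.975, a + x, b + n - x)
avg_coverage <- mean(lower <= p_true & p_true <= upper)

# Frequentist coverage of the same intervals at one fixed true value,
# computed exactly by summing over all possible counts
freq_coverage <- function(p) {
  k <- 0:n
  lo <- qbeta(0.025, a + k, b + n - k)
  hi <- qbeta(0.975, a + k, b + n - k)
  sum(dbinom(k, n, p)[lo <= p & p <= hi])
}

print(avg_coverage)
print(vapply(c(0.02, 0.5), freq_coverage, numeric(1)))
```

The prior-averaged figure should land close to 0.95 up to simulation noise, while the fixed-value figures are free to differ from it, especially for values such as 0.02 that the prior considers unlikely.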

Railways were a big deal, but there was a railway bubble. Partly fueled by low interest rates even! en.wikipedia.org/wiki/Railway...

16.02.2026 14:17 👍 1 🔁 0 💬 0 📌 0
