The issue emerged because journals and funders increasingly require researchers to publish the code they have used to analyse large datasets. When intending to upload code, some researchers have also accidentally published partial or entire Biobank datasets to GitHub, a popular online code-sharing platform. UK Biobank prohibits researchers from sharing data outside their systems and says it has introduced further training for all researchers. In the past year, the data leaks appear to have become a more urgent concern to UK Biobank. Between July and December 2025, it issued 80 legal notices to GitHub, which has complied with requests to remove data from the internet. Yet much still remains available.
Confidential health records from UK Biobank project exposed online www.theguardian.com/science/2026...
@ukbiobank.bsky.social statement www.ukbiobank.ac.uk/news/a-messa...
Article highlights potential for re-identification in datasets shared by researchers
#patientdata #healthdata #dataprotection