Secure record linkage of large health data sets: Evaluation of a hybrid cloud model
This 2020 paper, published in JMIR Medical Informatics, examines both the usefulness of linked data (records connected across multiple data sets) for investigating health and social issues and a cloud-based approach to performing that linkage. Brown and Randall briefly review the state of cloud computing in general and in relation to record linkage in particular, noting a dearth of research on the practical use of secure, privacy-respecting cloud computing technologies for record linkage. The authors then describe their own attempt to demonstrate a cloud-based model for record linkage that respects data privacy and integrity requirements, using three synthetically generated data sets of varying sizes and complexities as test data. After discussing their findings, they conclude that with “privacy-preserving record linkage” methods in the cloud, data “privacy is maintained while taking advantage of the considerable scalability offered by cloud solutions,” all while retaining “the ability to process increasingly larger data sets without impacting data release protocols and individual patient privacy” policies.