Challenges to Creating a Statewide Medical Database
A new article describes the opportunities and barriers to sharing medical data in real time with a statewide central database.
In 2014, the California Department of Public Health began work on a pilot project with a local health system to send pathology cancer data directly to a statewide central database, the California Cancer Registry (CCR). The first-of-its-kind initiative enabled real-time monitoring of cancer data from 10 hospitals within St. Joseph Health. Previously, the CCR received this kind of information with a two-year lag.
The California initiative highlights the potential of big data for understanding cancer trends, but it also raises questions about protecting patient privacy. A new article in The American Journal of Managed Care discusses both the opportunities and barriers to sharing medical data in real time. The paper was published August 14.
“Privacy and security concerns are a significant barrier for the utilization of big data and sharable data,” said first author Y. Tony Yang, a professor and health services and policy researcher at the George Washington University School of Nursing. “In addition to legal privacy concerns, some consumers may feel uncomfortable about their data being stored and accessed by multiple users, even if it is de-identifiable.”
Registries like the CCR offer physicians and researchers the ability to monitor the number of new cancer cases and deaths over time, measure the success of cancer screening programs, and conduct studies to find the causes and cures of cancer. Providers can use the standardized data to help determine treatment plans for their patients, including recommending clinical trials.
However, Yang emphasizes that technical, legal and institutional barriers must be addressed before launching a real-time registry. With regard to data privacy, the ability to collect, disclose and share public health data varies by state. Some states have stronger privacy protections than those required under the Health Insurance Portability and Accountability Act (HIPAA).
“Data sharing is regulated by several federal and state laws and how the law applies depends on what is in the data, if it is identifiable, and the context in which the data is to be used,” said Yang.
Questions also remain about whether de-identified data are sufficiently protected against re-identification. Yang cited the example of Latanya Sweeney, director of the Data Privacy Lab at Harvard University, who successfully linked names and contact information to publicly available DNA data profiles in the Personal Genome Project. By linking demographics to public records such as voter lists, and mining for names hidden in attached documents, Sweeney and her colleagues correctly identified 84 to 97 percent of the profiles.
Atul Butte, the Priscilla Chan and Mark Zuckerberg Distinguished Professor and inaugural director of the Institute for Computational Health Sciences at the University of California, San Francisco, thought the article on the California Cancer Registry lacked sufficient information about successful approaches to addressing privacy concerns around medical data.
“It doesn’t seem to be very high impact, and a lot of it seems opinion-based,” said Butte, who was not involved in writing the article.
He noted that, despite HIPAA, there are many legitimate ways for physicians and researchers to share data. For instance, the OneFlorida Clinical Research Consortium has created the OneFlorida Data Trust, a repository of statewide healthcare data that is regularly updated. The legal agreements for data use have already been negotiated with all partners, so researchers can freely dive in and work with the de-identified patient-level health data.
The CCR has its own stringent policies with regard to patient privacy. It releases patient contact information only to qualified researchers under tightly controlled circumstances, and all research proposals must first be reviewed by the Committee for the Protection of Human Subjects.
Previously, only about 5 percent of diagnostic data were sent in real time to the CCR, but California health officials hope that number will increase to 65 percent by 2022 as a result of the new initiative.