Not necessarily about writing or publishing, but an interesting 21st-century issue.
From Legal Tech News
As more algorithm-coded technology comes to market, the debate over how individuals’ de-identified data is being used continues to grow.
A class action lawsuit filed in a Chicago federal court last month highlights the use of sensitive de-identified data for commercial means. Plaintiffs represented by law firm Edelson allege the University of Chicago Medical Center gave Google the electronic health records (EHR) of nearly all of its patients from 2009 to 2016, with which Google would create products. The EHR, which is a digital version of a patient’s paper chart, includes a patient’s height, weight, vital signs and medical procedure and illness history.
While the hospital asserted it did de-identify data, Edelson claims the hospital included date and time stamps and “copious” free-text medical notes that, combined with Google’s other massive troves of data, could easily identify patients, in noncompliance with the Health Insurance Portability and Accountability Act (HIPAA).
. . . .
“I think the biggest concern is the quantity of information Google has about individuals and its ability to reidentify information, and this gray area of if HIPAA permits it if it was fully de-identified,” said Fox Rothschild partner Elizabeth Litten.
Litten noted that transferring such data to Google, which has a host of information collected from other services, makes labeling data “de-identified” risky in that instance. “I would want to be very careful with who I share my de-identified data with, [or] share information with someone that doesn’t have access to a lot of information. Or [ensure] in the near future the data isn’t accessed by a bigger company and made identifiable in the future,” she explained.
If the data can be reidentified, it may also fall under the scope of the European Union’s General Data Protection Regulation (GDPR) or California’s upcoming data privacy law, noted Cogent Law Group associate Miles Vaughn.
Link to the rest at Legal Tech News
De-identified data is presently an important component in the development of artificial intelligence systems.
As PG understands it, a large mass of data concerning almost anything, but certainly including data about human behavior, is dumped into a powerful computer which is tasked with discerning patterns and relationships within the data.
The more data about individuals that goes into the AI hopper, the more can be learned about groups of individuals, relationships between individuals, and the behavior patterns of individuals. Much of what is learned may not be generally known or discoverable by other, more traditional methods of data analysis.
As a crude example based upon the brief description in the OP, consider an artificially intelligent system that had access to both the medical records described in the OP and the usage records for individuals using Ventra cards (contactless digital payment cards that are electronically scanned) on the Chicago Transit Authority. Such a system could conceivably identify the specific individual associated with an anonymous medical record by correlating Ventra card use at a nearby transit stop with the time stamps on the digital medical record entries.
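For readers who like to see such things spelled out, the timestamp correlation described above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: every record, card ID, field name, and the 30-minute window is invented for the example, and nothing here reflects the actual format of any hospital's records or the Chicago Transit Authority's data. The function `candidate_cards` is likewise hypothetical.

```python
from datetime import datetime, timedelta

# Entirely made-up toy data; no real records or schemas are implied.
# "De-identified" visits: no names, but the time stamps survive.
visits = [
    {"record_id": "R1", "checkin": datetime(2016, 3, 1, 9, 5)},
    {"record_id": "R1", "checkin": datetime(2016, 3, 8, 14, 35)},
    {"record_id": "R1", "checkin": datetime(2016, 3, 15, 9, 10)},
]

# Hypothetical transit taps at the stop nearest the hospital.
taps = [
    {"card_id": "C-100", "tap": datetime(2016, 3, 1, 8, 50)},
    {"card_id": "C-100", "tap": datetime(2016, 3, 8, 14, 20)},
    {"card_id": "C-100", "tap": datetime(2016, 3, 15, 8, 55)},
    {"card_id": "C-200", "tap": datetime(2016, 3, 1, 8, 52)},
    {"card_id": "C-300", "tap": datetime(2016, 3, 8, 18, 0)},
]


def candidate_cards(visits, taps, record_id, window=timedelta(minutes=30)):
    """For one anonymous record, count per card how many of its visits
    were preceded by a tap from that card within `window`."""
    counts = {}
    for visit in visits:
        if visit["record_id"] != record_id:
            continue
        # Cards tapped shortly before this visit's check-in time.
        matched = {
            t["card_id"]
            for t in taps
            if timedelta(0) <= visit["checkin"] - t["tap"] <= window
        }
        for card in matched:
            counts[card] = counts.get(card, 0) + 1
    return counts


scores = candidate_cards(visits, taps, "R1")
# The card that lines up with the most visits is the strongest candidate.
best = max(scores, key=scores.get)
print(scores, best)
```

A single visit matches many riders, but each additional time-stamped visit narrows the field. In this invented data, only one card lines up with all three visits, which is the reason retained time stamps matter so much in the complaint.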