Anonymisation: False assumptions and fallacies in data protection and privacy


In today’s world awash with affordable data storage and processing, “Big Data” has emerged as a powerful approach to optimizing decision making, with applications across many fields.

One is the health sciences, where complex, high-dimensional health data combined with behavioral and environmental data are transformed into predictions for more effective patient diagnosis. Another is financial services, where historical analysis of spending patterns is used to uncover anomalies that may indicate fraudulent transactions.

Others include education, marketing, transportation, and even sports.

Developments in Big Data inevitably trigger the debate: how do we preserve personal data privacy while still benefiting from the data’s utility?

For many, the logical solution is to embrace the seemingly reliable “anonymization” process to protect privacy. We expect that removing (or making small changes to) personal data protects our privacy.
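As a rough illustration of what this expectation looks like in practice, the sketch below shows naive anonymization: simply dropping fields treated as direct identifiers. The field names and record are hypothetical, chosen only for illustration; note that quasi-identifiers such as a ZIP code survive untouched, which is precisely why this naive approach can fail.

```python
# Hypothetical set of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "birthdate", "email"}

def naive_anonymize(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

record = {
    "name": "Jane Doe",          # direct identifier: removed
    "birthdate": "1980-01-01",   # direct identifier: removed
    "zip": "12345",              # quasi-identifier: kept, and potentially
    "diagnosis": "flu",          # re-identifying when combined with other data
}
print(naive_anonymize(record))   # {'zip': '12345', 'diagnosis': 'flu'}
```

The remaining fields can still be linked with outside datasets to re-identify individuals, which is the core weakness the rest of this article explores.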

Personally identifiable information (PII)

But advances in technology mean that the common understanding of “personally identifiable information” (PII) or “personal data”, which centers on the obvious items such as names and birthdates, easily misses other information not immediately seen as personally identifiable. One example is the IP address.

Google, in its blog post “Are IP addresses personal?”, argued that not only are these number strings shared in some situations, they are also tied to machines rather than to humans.