Reidentification Risk of Masked Datasets: Part 1

04 September, 2020

When it comes to securing data, companies often find themselves at a crossroads: ensure data security at the cost of data functionality, or compromise security to keep functionality intact. To avoid this trade-off, organizations adopt sophisticated data protection methods, such as anonymization and masking, that aim to secure data while preserving functionality and performance.

There is a catch, however, and that’s the issue of data reidentification.

Cross-referencing the data with other publicly available data can reidentify an individual from their metadata. As a result, private information such as PFI, PHI, and contact information could end up in the public domain. In the wrong hands, this could be catastrophic.
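To make the cross-referencing idea concrete, here is a minimal sketch of a linkage attack. All records, field names, and the `link_records` helper are made up for illustration; it assumes the masked dataset still carries a few quasi-identifiers (here ZIP code, birth year, and gender) that also appear in a public dataset alongside names.

```python
# Toy data: every record, name, and value below is hypothetical.
masked_records = [
    {"zip": "10001", "birth_year": 1985, "gender": "F", "diagnosis": "asthma"},
    {"zip": "10001", "birth_year": 1990, "gender": "M", "diagnosis": "diabetes"},
    {"zip": "10002", "birth_year": 1985, "gender": "F", "diagnosis": "migraine"},
]

public_records = [
    {"name": "Alice Example", "zip": "10001", "birth_year": 1985, "gender": "F"},
    {"name": "Bob Example", "zip": "10001", "birth_year": 1990, "gender": "M"},
    {"name": "Carol Example", "zip": "10003", "birth_year": 1970, "gender": "F"},
]

# Assumed quasi-identifiers: fields present in both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def link_records(masked, public, keys=QUASI_IDENTIFIERS):
    """Pair each masked row with a name when exactly one public record matches."""
    # Index the public records by their quasi-identifier values.
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person)

    matches = []
    for row in masked:
        candidates = index.get(tuple(row[k] for k in keys), [])
        # A single candidate means the masked row points to one named person.
        if len(candidates) == 1:
            matches.append((candidates[0]["name"], row))
    return matches


reidentified = link_records(masked_records, public_records)
for name, row in reidentified:
    print(name, "->", row["diagnosis"])
```

In this toy example, two of the three masked rows are tied back to a name even though the masked dataset contains no names at all, which is why the quasi-identifiers deserve as much attention as the direct identifiers.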

Research conducted by Imperial College London found that “once bought, the data can often be reverse-engineered using machine learning to reidentify individuals, despite the anonymization techniques. This could expose sensitive information about personally identified individuals and allow buyers to build increasingly comprehensive personal profiles of individuals. The research demonstrates for the first time how easily and accurately this can be done — even with incomplete datasets. In the research, 99.98% of Americans were correctly reidentified in any available ‘anonymized’ dataset by using just 15 characteristics, including age, gender and marital status.”
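One rough way to see why a handful of characteristics is enough is to count how many records become unique as attributes are combined. The sketch below is only an illustration on made-up rows, not the method used in the research quoted above; the `unique_fraction` helper and the sample values are assumptions.

```python
from collections import Counter


def unique_fraction(rows, keys):
    """Fraction of rows whose combination of `keys` values appears exactly once."""
    if not rows:
        return 0.0
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if counts[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


# Hypothetical sample; real datasets have far more rows and attributes.
rows = [
    {"age": 34, "gender": "F", "marital_status": "married"},
    {"age": 34, "gender": "F", "marital_status": "single"},
    {"age": 34, "gender": "M", "marital_status": "married"},
    {"age": 51, "gender": "F", "marital_status": "married"},
]

print(unique_fraction(rows, ["gender"]))                           # 0.25
print(unique_fraction(rows, ["age", "gender"]))                    # 0.5
print(unique_fraction(rows, ["age", "gender", "marital_status"]))  # 1.0
```

Each added attribute shrinks the group of people who share the same values, so the share of records that can be singled out rises quickly.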

There have been many incidents in which this has already happened, such as the NYC taxicab debacle or the Netflix Prize dataset contest, where seemingly anonymized data were easily reidentified.

So, it’s not just about using the right data security technology, but also about implementing it correctly.

When you are torn between data security and data functionality, you find yourself in a bind, deciding which one to trade for the other. But does it have to be this way?

And most importantly, what do you really need to focus on while anonymizing data?

To find out, go to Forbes Tech Council – Reidentification Risk of Masked Datasets: Part 1 to read the entire article.
