Researchers De-Anonymize People in Datasets
Image: Depositphotos

Various companies collect data from our devices almost all the time. While privacy is always a concern, they assure us that our data is in completely safe hands, and that if it is shared with third parties, all the information that could be used to identify people is redacted and de-identified.

It turns out the techniques used to anonymize data aren’t that foolproof, according to researchers at Imperial College London, who have published a paper on reverse-engineering incomplete datasets.

The researchers developed a machine learning model that can reverse-engineer an incomplete dataset. Using 15 demographic attributes, such as age, gender, and marital status, they were able to re-identify almost 99.98% of Americans in an anonymized dataset.
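To see why a handful of demographic attributes can single people out, here is a minimal Python sketch. It is not the researchers' model; it only counts how many records in a toy dataset become unique as more attributes are combined. The function name `unique_fraction`, the attribute names, and the records are all hypothetical and exist purely for illustration.

```python
from collections import Counter


def unique_fraction(records, attributes):
    """Return the fraction of records whose combination of values on
    `attributes` is shared by no other record in the dataset."""
    if not records:
        return 0.0
    # Build one key per record from the chosen attributes.
    keys = [tuple(record[a] for a in attributes) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# Hypothetical toy records: no names, so "anonymized" in the naive sense.
records = [
    {"age": 34, "gender": "F", "marital_status": "married", "zip": "10001"},
    {"age": 34, "gender": "F", "marital_status": "single", "zip": "10001"},
    {"age": 34, "gender": "M", "marital_status": "married", "zip": "10002"},
    {"age": 51, "gender": "M", "marital_status": "married", "zip": "10002"},
    {"age": 51, "gender": "M", "marital_status": "married", "zip": "10002"},
]

all_attributes = ["age", "gender", "marital_status", "zip"]
for n in range(1, len(all_attributes) + 1):
    chosen = all_attributes[:n]
    print(chosen, unique_fraction(records, chosen))
```

In this toy data, age alone singles out nobody, but adding gender and marital status makes three of the five records unique. Real datasets are far larger, yet the same effect applies: every extra attribute narrows the group of people who share a given combination.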

For that purpose, the researchers used 210 different datasets covering a “large range of uniqueness.” Together, they include information on around 11 million Americans.

However, the goal of the study isn’t to establish that so-called “anonymous” datasets can be de-anonymized. That was already demonstrated at DEFCON 2018, where hackers were able to legally get hold of the browsing history of 3 million Germans and de-anonymize them.

Instead, the researchers set out to show how easy it has become to defeat the techniques used to anonymize datasets. Their findings are a call to action for governments and companies to implement more robust techniques that can keep people’s identities secure.

They have also set up a website where you can check how easy it is to identify you in an anonymous dataset.

Aditya Tiwari