You and Your Cellphone: Doing Your Part for Big Data

That cellphone in your pocket or purse is generating data that, for better or worse, can be used for a variety of applications – everything from urban planning to tracking your whereabouts.

Larry Hardesty, writing in a release issued by the MIT News Office, comments that today’s sensor-studded cellphones can be used for a variety of socially useful applications such as epidemiology, operations research and emergency preparedness, just to name a few.

So far, so good. But here’s the catch – before the data is released to researchers in these fields, information identifying the individual user needs to be removed. Asks Hardesty, “…how hard could it be to protect the identity of one unnamed cellphone user in a data set of hundreds of thousands or even millions?”

Turns out assuring that level of privacy is very hard indeed.

“According to a paper appearing this week in Scientific Reports, harder than you might think,” Hardesty writes. “Researchers at MIT and the Université Catholique de Louvain, in Belgium, analyzed data on 1.5 million cellphone users in a small European country over a span of 15 months and found that just four points of reference, with fairly low spatial and temporal resolution, was enough to uniquely identify 95 percent of them. In other words, to extract the complete location information for a single person from an ‘anonymized’ data set of more than a million people, all you would need to do is place him or her within a couple of hundred yards of a cellphone transmitter, sometime over the course of an hour, four times in one year. A few Twitter posts would probably provide all the information you needed, if they contained specific information about the person’s whereabouts.”
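For readers who want to see the idea in miniature, here is a rough Python sketch of this kind of "uniqueness" test. It is not the researchers' code or method. The data is synthetic, the function names (`make_traces`, `unicity`) are invented for illustration, and each point is assumed to be nothing more than an (antenna, hour) pair. The sketch picks a few known points from each person's trace and checks whether anyone else in the data set also matches all of them.

```python
import random


def make_traces(num_users, points_per_user, num_antennas, num_hours, seed=0):
    """Build synthetic traces: one set of (antenna, hour) points per user.

    Purely illustrative data; real mobility traces are far less random.
    """
    rng = random.Random(seed)
    return [
        {
            (rng.randrange(num_antennas), rng.randrange(num_hours))
            for _ in range(points_per_user)
        }
        for _ in range(num_users)
    ]


def unicity(traces, num_points, seed=0):
    """Return the fraction of traces matched by no other trace when
    `num_points` randomly chosen points from each trace are known."""
    rng = random.Random(seed)
    unique = 0
    for trace in traces:
        # Never ask for more points than the trace actually holds.
        k = min(num_points, len(trace))
        known = set(rng.sample(sorted(trace), k))
        # Count every trace (including this one) containing all known points.
        matches = sum(1 for other in traces if known <= other)
        if matches == 1:
            unique += 1
    return unique / len(traces)


if __name__ == "__main__":
    # Assumed toy parameters, chosen only so the demo runs quickly.
    traces = make_traces(num_users=500, points_per_user=40,
                         num_antennas=30, num_hours=24)
    for p in (1, 2, 3, 4):
        print(p, "known points ->", unicity(traces, p))
```

On toy data like this, the share of uniquely identified users should generally climb as more points are known. The exact figures depend entirely on the made-up parameters and say nothing about real cellphone records.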

The Scientific Reports paper speculates that the concepts behind tracking people’s movements using cellphone data might apply to other kinds of data as well – for example, web browsing. As César Hidalgo, one of the paper’s authors, comments, “The space of potential combinations is really large. When a person is, in some sense, being expressed in a space in which the total number of combinations is huge, the probability that two people would have the same exact trajectory – whether it’s walking or browsing – is almost nil.”
