Cambridge Analytica demonstrates that Facebook needs to give researchers more access.
In a 2013 paper, psychologist Michal Kosinski and collaborators from the University of Cambridge in the United Kingdom warned that “the predictability of individual attributes from digital records of behavior may have considerable negative implications,” posing a threat to “well-being, freedom, or even life.” This warning followed their striking findings about how accurately a person’s attributes (from political leanings to intelligence to sexual orientation) could be inferred from nothing but their Facebook likes. Kosinski and his colleagues obtained this information from Facebook users who volunteered it in exchange for the results of a personality quiz, a method that can drive viral engagement. Of course, one person’s warning may be another’s inspiration.
Kosinski’s original research really was an important scientific finding. The paper has been cited more than 1,000 times and the dataset has spawned many other studies. But the potential uses for it go far beyond academic research. In the past few days, the Guardian and the New York Times have published a number of new stories about Cambridge Analytica, the data mining and analytics firm best known for aiding President Trump’s campaign and the pro-Brexit campaign. This trove of reporting shows how Cambridge Analytica allegedly relied on the psychologist Aleksandr Kogan (who also goes by Aleksandr Spectre), a colleague of the original researchers at Cambridge, to gain access to profiles of around 50 million Facebook users.
According to the Guardian’s and New York Times’ reporting, the data used to build Cambridge Analytica’s models came from a rough duplicate of the personality quiz method that had been used legitimately for scientific research. Kogan, a lecturer in another department, reportedly approached Kosinski and their Cambridge colleagues in the Psychometric Centre to discuss commercializing the research. To his credit, Kosinski declined. However, Kogan built an app named thisismydigitallife for his own startup, Global Science Research, which collected the same sorts of data. GSR paid Mechanical Turk workers (in violation of Mechanical Turk’s terms) to take a psychological quiz and provide access to their Facebook profiles. In 2014, under a contract with SCL, the parent company of Cambridge Analytica, that data was harvested and used to build a model of 50 million U.S. Facebook users that allegedly included 5,000 data points on each user.
So if the Facebook API allowed Kogan access to this data, what did he do wrong? This is where things get murky, but bear with us. It appears that Kogan deceitfully used his dual roles as a researcher and an entrepreneur to move data from an academic context to a commercial one, although exactly how he did so is unclear. The Guardian claims that Kogan “had a licence from Facebook to collect profile data, but it was for research purposes only” and “[Kogan’s] permission from Facebook to harvest profiles in large quantities was specifically restricted to academic use.” Transferring the data this way would already violate Facebook’s API policies, which barred use of the data outside of Facebook for commercial purposes, but we are not aware of Facebook offering a “license” or special “permission” for researchers to collect greater amounts of data via the API.
Regardless, it does appear that the amount of data thisismydigitallife was vacuuming up triggered a security review at Facebook and an automatic shutdown of its API access. Relying on the account of Christopher Wylie, the former Cambridge Analytica employee turned whistleblower, the Guardian claims that Kogan “spoke to an engineer” and resumed access:
“Facebook could see it was happening,” says Wylie. “Their security protocols were triggered because Kogan’s apps were pulling this enormous amount of data, but apparently Kogan told them it was for academic use. So they were like, ‘Fine’.”
Kogan claims that he had a close working relationship with Facebook and that it was familiar with his research agendas and tools.
A great deal of research confirms that most people don’t pay attention to permissions and privacy policies for the apps they download and the services they use, and the notices are often too vague or convoluted to understand clearly anyway. How many Facebook users give third parties access to their profile so that they can get a visualization of the words they use most, or find out which Star Wars character they are? It isn’t surprising that Kosinski’s original recruitment method (a personality quiz that provided you with a psychological profile of yourself based on a common five-factor model) resulted in more than 50,000 volunteers providing access to their Facebook data. Indeed, Kosinski later co-authored a paper detailing how to use viral marketing techniques to recruit study participants, and he has written about the ethical dynamics of utilizing friend data.