• ‘Fiction is outperforming reality’: how YouTube’s algorithm distorts truth | Technology | The Guardian
    https://www.theguardian.com/technology/2018/feb/02/how-youtubes-algorithm-distorts-truth

    There are 1.5 billion YouTube users in the world, which is more than the number of households that own televisions. What they watch is shaped by this algorithm, which skims and ranks billions of videos to identify 20 “up next” clips that are both relevant to a previous video and most likely, statistically speaking, to keep a person hooked on their screen.

    Company insiders tell me the algorithm is the single most important engine of YouTube’s growth. In one of the few public explanations of how the formula works – an academic paper that sketches the algorithm’s deep neural networks, crunching a vast pool of data about videos and the people who watch them – YouTube engineers describe it as one of the “largest scale and most sophisticated industrial recommendation systems in existence”.

    Lewd and violent videos have been algorithmically served up to toddlers watching YouTube Kids, a dedicated app for children. One YouTube creator who was banned from making advertising revenues from his strange videos – which featured his children receiving flu shots, removing earwax, and crying over dead pets – told a reporter he had only been responding to the demands of Google’s algorithm. “That’s what got us out there and popular,” he said. “We learned to fuel it and do whatever it took to please the algorithm.”

    During the three years Guillaume Chaslot worked at Google, he was placed for several months with a team of YouTube engineers working on the recommendation system. The experience led him to conclude that the priorities YouTube gives its algorithms are dangerously skewed.

    “YouTube is something that looks like reality, but it is distorted to make you spend more time online,” he tells me when we meet in Berkeley, California. “The recommendation algorithm is not optimising for what is truthful, or balanced, or healthy for democracy.”

    Chaslot explains that the algorithm never stays the same. It is constantly changing the weight it gives to different signals: the viewing patterns of a user, for example, or the length of time a video is watched before someone clicks away.

    The engineers he worked with were responsible for continuously experimenting with new formulas that would increase advertising revenues by extending the amount of time people watched videos. “Watch time was the priority,” he recalls. “Everything else was considered a distraction.”
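
    A minimal sketch of the kind of ranking Chaslot describes might look like the toy below. The signal names, weights and scoring function are purely illustrative assumptions, not YouTube’s actual formula, which relies on deep neural networks over far more data; the point is only that shifting the weights changes what gets recommended, and that weighting watch time above everything else is what “please the algorithm” ends up meaning.

```python
# Illustrative toy only: an "up next" score built from a weighted mix of signals.
# The signals and weights below are assumptions for the sake of example.

def rank_score(video_signals, weights):
    """Combine per-video signals into a single 'up next' ranking score."""
    return sum(weights[name] * value for name, value in video_signals.items())

candidate = {
    "predicted_watch_time": 0.8,  # how long the viewer is expected to keep watching
    "click_likelihood": 0.3,      # how likely the viewer is to click the thumbnail
    "topic_relevance": 0.6,       # similarity to the video just watched
}

# "Watch time was the priority": in this sketch the watch-time signal is
# weighted far more heavily than relevance, so a gripping but misleading
# video can outrank a merely relevant one.
weights = {"predicted_watch_time": 0.7, "click_likelihood": 0.2, "topic_relevance": 0.1}

print(f"up-next score: {rank_score(candidate, weights):.2f}")
```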

    The software Chaslot wrote was designed to provide the world’s first window into YouTube’s opaque recommendation engine. The program simulates the behaviour of a user who starts on one video and then follows the chain of recommended videos – much as I did after watching the Logan Paul video – tracking data along the way.

    It finds videos through a word search, selecting a “seed” video to begin with, and recording several layers of videos that YouTube recommends in the “up next” column. It does so with no viewing history, ensuring the videos being detected are YouTube’s generic recommendations, rather than videos personalised to a user. And it repeats the process thousands of times, accumulating layers of data about YouTube recommendations to build up a picture of the algorithm’s preferences.
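
    That description amounts to a simple, breadth-limited crawl of the “up next” graph. The sketch below is not Chaslot’s code: search_seed_videos and get_up_next are hypothetical stand-ins for whatever scraping the real program performs, called with no login or viewing history so the recommendations stay generic.

```python
from collections import Counter

def search_seed_videos(query, n=5):
    # Hypothetical stub: the real tool would run a YouTube word search here.
    return [f"seed:{query}:{i}" for i in range(n)]

def get_up_next(video_id, n=20):
    # Hypothetical stub: the real tool would read the "up next" column here,
    # with no cookies or watch history attached to the request.
    return [f"{video_id}>rec{i}" for i in range(n)]

def crawl(query, layers=3, follow=3):
    """Start from seed videos, follow recommendation chains for several layers,
    and tally how often each video is recommended along the way."""
    counts = Counter()
    frontier = search_seed_videos(query)
    for _ in range(layers):
        next_frontier = []
        for vid in frontier:
            recs = get_up_next(vid)
            counts.update(recs)
            next_frontier.extend(recs[:follow])  # follow only the top few onward
        frontier = next_frontier
    return counts

# Repeating crawls like this thousands of times and ranking videos by how often
# they surface builds up a picture of the algorithm's preferences.
top_recommended = crawl("example query").most_common(10)
```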

    Over the last 18 months, Chaslot has used the program to explore bias in the YouTube content promoted during the French, British and German elections, and around topics such as global warming and mass shootings, publishing his findings on his website, Algotransparency.com. Each study finds something different, but the research suggests YouTube systematically amplifies videos that are divisive, sensational and conspiratorial.

    It was not a comprehensive set of videos and it may not have been a perfectly representative sample. But it was, Chaslot said, a previously unseen dataset of what YouTube was recommending to people interested in content about the candidates – one snapshot, in other words, of the algorithm’s preferences.

    Jonathan Albright, research director at the Tow Center for Digital Journalism, who reviewed the code used by Chaslot, says it is a relatively straightforward piece of software and a reputable methodology. “This research captured the apparent direction of YouTube’s political ecosystem,” he says. “That has not been done before.”

    I spent weeks watching, sorting and categorising the trove of videos with Erin McCormick, an investigative reporter and expert in database analysis. From the start, we were stunned by how many extreme and conspiratorial videos had been recommended, and the fact that almost all of them appeared to be directed against Clinton.

    Some of the videos YouTube was recommending were the sort we had expected to see: broadcasts of presidential debates, TV news clips, Saturday Night Live sketches. There were also videos of speeches by the two candidates – although, we found, the database contained far more YouTube-recommended speeches by Trump than Clinton.

    But what was most compelling was how often Chaslot’s software detected anti-Clinton conspiracy videos appearing “up next” beside other videos.

    Tufekci, the sociologist who several months ago warned about the impact YouTube may have had on the election, tells me YouTube’s recommendation system has probably figured out that edgy and hateful content is engaging. “This is a bit like an autopilot cafeteria in a school that has figured out children have sweet teeth, and also like fatty and salty foods,” she says. “So you make a line offering such food, automatically loading the next plate as soon as the bag of chips or candy in front of the young person has been consumed.”

    Once that gets normalised, however, what is fractionally more edgy or bizarre becomes, Tufekci says, novel and interesting. “So the food gets higher and higher in sugar, fat and salt – natural human cravings – while the videos recommended and auto-played by YouTube get more and more bizarre or hateful.”

    But why would a bias toward ever more weird or divisive videos benefit one candidate over another? That depends on the candidates. Trump’s campaign was nothing if not weird and divisive. Tufekci points to studies showing that the “field of misinformation” largely tilted anti-Clinton before the election. “Fake news providers,” she says, “found that fake anti-Clinton material played much better with the pro-Trump base than did fake anti-Trump material with the pro-Clinton base.”

    She adds: “The question before us is the ethics of leading people down hateful rabbit holes full of misinformation and lies at scale just because it works to increase the time people spend on the site – and it does work.”

    About half the videos Chaslot’s program detected being recommended during the election have now vanished from YouTube – many of them taken down by their creators. Chaslot has always thought this suspicious. These were videos with titles such as “Must Watch!! Hillary Clinton tried to ban this video”, watched millions of times before they disappeared. “Why would someone take down a video that has been viewed millions of times?” he asks.

    I contacted Gary Franchi, the founder of the Next News Network, to see who was right. He sent me screen grabs of the private data given to people who upload YouTube videos, including a breakdown of how their audiences found their clips. The largest source of traffic to the Bill Clinton rape video, which was viewed 2.4m times in the month leading up to the election, was YouTube recommendations.

    The same was true of all but one of the videos Franchi sent me data for. A typical example was a Next News Network video entitled “WHOA! HILLARY THINKS CAMERA’S OFF… SENDS SHOCK MESSAGE TO TRUMP” in which Franchi, pointing to a tiny movement of Clinton’s lips during a TV debate, claims she says “fuck you” to her presidential rival. The data Franchi shared revealed that, in the month leading up to the election, 73% of the traffic to the video – amounting to 1.2m of its views – was due to YouTube recommendations. External traffic accounted for only 3% of the views.
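
    Taken at face value, those figures imply how the video’s monthly audience broke down. The quick check below simply assumes the 73%, 3% and 1.2m numbers all describe the same one-month window.

```python
# Back-of-the-envelope check of the figures quoted above (assuming the
# percentages and the 1.2m figure refer to the same one-month window).
recommended_views = 1_200_000   # views attributed to YouTube recommendations
recommended_share = 0.73        # recommendations' share of all traffic
external_share = 0.03           # share arriving from outside YouTube

total_views = recommended_views / recommended_share   # ~1.64m views that month
external_views = total_views * external_share         # ~49,000 views from external links

print(round(total_views), round(external_views))
```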

    Franchi is a professional who makes a living from his channel, but many of the other creators of anti-Clinton videos I spoke to were amateur sleuths or part-time conspiracy theorists. Typically, they might receive a few hundred views on their videos, so they were shocked when their anti-Clinton videos started to receive millions of views, as if they were being pushed by an invisible force.

    In every case, the largest source of traffic – the invisible force – came from the clips appearing in the “up next” column.
