Your Data is Being Manipulated – Data & Society : Points

  • Your Data is Being Manipulated – Data & Society : Points
    https://points.datasociety.net/your-data-is-being-manipulated-a7e31a83577b

    Fast forward to 2003, when the sitting Pennsylvania senator Rick Santorum publicly compared homosexuality to bestiality and pedophilia. Needless to say, the LGBT community was outraged. Journalist Dan Savage called on his readers to find a way to “memorialize the scandal.” One of his fans created a website to associate Santorum’s name with anal sex. To the senator’s horror, countless members of the public jumped in to link to that website in an effort to influence search engines. This form of crowdsourced SEO is commonly referred to as “Google bombing,” and it’s a form of media manipulation intended to mess with data and the information landscape.
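
    The mechanism in the excerpt above (many people linking to one page to shift its ranking) can be illustrated with a toy model. This is only a sketch under strong assumptions: it ranks pages purely by the number of distinct sources linking to them with a matching anchor text, which is far simpler than any real search engine. The function name `rank_by_inbound_links` and all `.example` hostnames are hypothetical.

```python
from collections import Counter


def rank_by_inbound_links(links, query):
    """Rank target pages by how many distinct sources link to them
    using anchor text that contains `query` (toy model, not a real engine)."""
    seen = set()
    counts = Counter()
    for source, anchor_text, target in links:
        key = (source, target)
        # Count each (source, target) pair at most once.
        if query.lower() in anchor_text.lower() and key not in seen:
            seen.add(key)
            counts[target] += 1
    return [target for target, _ in counts.most_common()]


# Hypothetical data: a few organic links to an official page...
organic = [(f"news{i}.example", "the senator", "official.example") for i in range(3)]
# ...and a coordinated campaign of many small sites linking elsewhere.
coordinated = [(f"blog{i}.example", "the senator", "prank.example") for i in range(50)]

before = rank_by_inbound_links(organic, "senator")
after = rank_by_inbound_links(organic + coordinated, "senator")
print("before:", before)
print("after:", after)
```

    In this toy model, the coordinated links are individually indistinguishable from organic ones; only their volume and timing would give them away.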

    At this moment, AI is at the center of every business conversation. Companies, governments, and researchers are obsessed with data. Not surprisingly, so are adversarial actors. We are currently seeing an evolution in how data is being manipulated. If we believe that data can and should be used to inform people and fuel technology, we need to start building the infrastructure necessary to limit the corruption and abuse of that data — and grapple with how biased and problematic data might work its way into technology and, through that, into the foundations of our society.

    Like search engines, social media introduced a whole new target for manipulation. This attracted all sorts of people, from social media marketers to state actors. Messing with Twitter’s trending topics or Facebook’s news feed became a hobby for many. For $5, anyone could easily buy followers, likes, and comments on almost every major site. The economic and political incentives are obvious, but alongside these powerful actors, there are also a whole host of people with less-than-obvious intentions coordinating attacks on these systems.

    The goal with a story like that isn’t to convince journalists that it’s true, but to get them to foolishly use their amplification channels to negate it. This produces a “Boomerang effect,” whereby those who don’t trust the media believe that there must be merit to the conspiracy, prompting some to “self-investigate.”

    Consider, for example, the role of reddit and Twitter data as training data. Computer scientists have long pulled from the very generous APIs of these companies to train all sorts of models, trying to understand natural language, develop metadata around links, and track social patterns. They’ve trained models to detect depression, rank news, and engage in conversation. Ignoring the fact that this data is not representative in the first place, most engineers who use these APIs believe that it’s possible to clean the data and remove all problematic content. I can promise you it’s not.

    I’m watching countless actors experimenting with ways to mess with public data with an eye on major companies’ systems. They are trying to fly below the radar. If you don’t have a structure in place for strategically grappling with how those with an agenda might try to route around your best laid plans, you’re vulnerable. This isn’t about accidental or natural content. It’s not even about culturally biased data. This is about strategically gamified content injected into systems by people who are trying to guess what you’ll do.

    If you are building data-driven systems, you need to start thinking about how that data can be corrupted, by whom, and for what purpose.

    The article is so interesting that one has to be careful not to copy the whole thing here ;-)

    #danah_boyd #Machine_learning #médias_sociaux #data #fake_news

  • Your Data is Being Manipulated - danah boyd – Data & Society: Points
    https://points.datasociety.net/your-data-is-being-manipulated-a7e31a83577b

    Practical Black-Box Attacks against Machine Learning, March 19, 2017. The images in the top row are altered to disrupt the neural network, leading to the misinterpretations shown in the bottom row. The alterations are not visible to the human eye.
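
    To give a feel for how a change too small to notice can flip a model's output, here is a minimal sketch. It is not the black-box method of the paper cited above: it assumes full access to a hypothetical linear classifier and takes a single sign-gradient step, a much simpler white-box technique. The weights, bias, input, and `epsilon` are all made-up values chosen for illustration.

```python
def score(weights, bias, x):
    """Linear decision score: positive means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0


def sign(v):
    return (v > 0) - (v < 0)


def perturb(weights, x, epsilon, push_toward):
    """Shift every feature by at most `epsilon` in the direction that moves
    the score toward class `push_toward`. For a linear model the gradient
    of the score with respect to the input is just `weights`."""
    direction = 1 if push_toward == 1 else -1
    return [xi + direction * epsilon * sign(w) for xi, w in zip(x, weights)]


# Hypothetical 100-feature "image" and classifier.
weights = [1.0, -1.0] * 50
bias = 0.5
x = [0.5] * 100

x_adv = perturb(weights, x, epsilon=0.01, push_toward=0)
max_change = max(abs(a - b) for a, b in zip(x_adv, x))

print("original prediction:", predict(weights, bias, x))
print("perturbed prediction:", predict(weights, bias, x_adv))
print("largest per-feature change:", max_change)
```

    Each feature moves by only about 0.01, but because the shifts all push the score the same way across 100 features, their combined effect is enough to cross the decision boundary.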

    #IA #machine_learning #manipulation #pirates #données via @amtpl

    Only when journalists shame us by finding ways to trick our systems into advertising to neo-Nazis do we pay attention. Yet, far more maliciously intended actors are starting to play the long game in messing with our data. Why aren’t we trying to get ahead of this?

    A bit of game theory, perhaps? #théorie_des_jeux