• Academia in a stranglehold

    Academic publishers’ most valuable asset used to be their journals. Now, it’s the data they collect from researchers and then sell. A growing group of Groningen researchers finds this extremely concerning. ‘They control every part of the process and register every action you take.’

    When UG philosopher Titus Stahl is mulling over a new research topic, he has a range of tools available to help him get started. He could use academic search engine Scopus, for example, to point him to articles he could read online or download. He might also take notes using Mendeley, the useful software tool that helps you keep track of sources and references.

    If he then writes a grant proposal to get funding, there’s a good chance that the people assessing it use SciVal – software that analyses research trends, but also contains individual researchers’ citation and publication data.

    In the meantime, he could discuss his work on SSRN, a social platform used to share and peer-review early-stage research. And ultimately, of course, he’ll publish it in an open access journal for which the university has paid article processing fees, or APCs, after which others will read it – at home, through their libraries, or again using Scopus.

    Then, finally, he will enter his article in Pure, the database the university uses to register all research done at the UG. His profile might change to reflect he has done research on a new topic. Affiliations may be added, since his network has changed too, so everyone can see who he collaborates with and what his strengths are.

    It’s all very streamlined and it all works beautifully. However, it doesn’t seem all that great anymore when you realise that every tool Stahl has been using, every platform on which he publishes, is owned by publishing mogul Elsevier. And Elsevier not only provides tools, it also collects user data. It logs everything Stahl does, every keystroke.

    ‘They know what you are working on, they know what you are submitting, they know the results of your peer reviews’, Stahl says. ‘They control every part of the process and register every action you take.’
    Everything is recorded

    And that gives them far more information than you might realise. When Eiko Fried, a psychologist from Leiden University, asked Elsevier for his personal data in December 2021, he received an email with hundreds of thousands of data points, going back many years.

    He discovered that Elsevier knew his name, his affiliations and his research. That his reviews had been registered, as well as the requests for peer review he had declined. Elsevier kept track of his IP addresses – leading back to his home – his private telephone numbers, and the moments he logged in, which showed exactly when he worked and when he was on vacation. It listed the websites he had visited and the articles he had downloaded or merely viewed online. Every click, every reference was recorded.

    Fried’s blog posts about this came as a shock and a revelation to Stahl. ‘It’s a long-term danger to academic freedom’, he says. ‘They control the academic process with an infrastructure that serves their interests, not ours. And they use the collected data to provide analytics services to whoever pays for them.’

    Stahl is one of a growing group of researchers inside and outside the University of Groningen who are concerned about the situation. Oskar Gstrein shares his concerns. ‘There is this ingrained power imbalance between the universities and the publishers’, says the data autonomy specialist with Campus Fryslân and the Jantina Tammes School of Digital Society, Technology and AI. ‘They own the journals people want to get into. And now they have taken over the whole publishing sphere.’

    In a recently published call for action, the Young Academy Groningen (YAG) and the Open Science Community Groningen, too, sounded the alarm. ‘It is time to formulate a long-term vision for a sustainable, independent higher education system’, they wrote. ‘We not only endorse this ambition but call on our university to reclaim ownership over our research output.’
    New business model

    They have reason to worry. Big publishers like Elsevier make billions of euros a year – historically by publishing academic articles, but they have recently changed their business model. Now, they sell data connected to academic publishing. And that is ‘insanely profitable’, Stahl says.

    Profits of Elsevier’s parent company RELX rose by 10 percent in 2023, to 2 billion euros on a revenue of 10 billion. ‘Article submissions returned to strong growth, with pay-to-publish open-access articles continuing to grow particularly strongly’, RELX reported in February this year.

    Erik Engstrom, CEO of Elsevier’s parent company RELX, was the third highest paid CEO in the Netherlands between 2017 and 2020, earning over 30 million euros. Only the CEOs of Shell and another scientific publisher, Wolters Kluwer, earned more.

    Only a decade ago, it looked as if the big publishers’ hold on academia was weakening. Universities and the Dutch government were done with funding research with public money, handing the resulting papers for free to publishers like Elsevier (The Lancet, Cell), Springer (Nature) or Wiley (Advanced Materials), editing and peer reviewing those papers for free, and then paying exorbitant subscription fees to make those same papers available to their own researchers again.

    They moved towards open access, and as a result 97 percent of publications in Groningen are now published open access. ‘Their traditional business model no longer worked’, says Gstrein. ‘So publishers had to reinvent themselves.’
    Gold open access

    And that is exactly what they did, helped by a Dutch government that promoted ‘gold open access’ as the norm. ‘It’s undoubtedly linked to the fact that many of these publishers, such as Elsevier and Kluwer, have Dutch roots’, says Ane van der Leij, head of research support at the university library (UB).

    ‘Gold’ means you don’t make the whole publishing process free – that would be the diamond option. Instead, a university pays APCs up front for its researchers, and in exchange publishers make their articles available to everyone.

    That’s great for the general public, which can now read those articles for free. But it’s not so great for the universities that still provide research papers and edit academic articles without any payment. ‘And the APCs are high’, Stahl says. ‘In some cases, I estimate there’s a profit margin of 75 percent.’

    Not all journals are open access, either. Most of the traditional journals are hybrid now – the content for which APCs have been paid is open access; the rest is still behind a paywall. ‘Unfortunately “hybrid” has become the new status quo for most of these publishers’ journals, and it has become a very profitable business model’, Van der Leij says.
    Package deals

    These days, publishers negotiate ‘read and publish’ package deals for their titles, which have become around 20 percent more expensive in five years. ‘Taking into account an average inflation rate of 3 percent per year over this period, that amounts to a price increase of approximately 12.6 percent’, says Van der Leij.

    Elsevier received over 16 million euros in 2024 for its deal with umbrella organisation Universities of the Netherlands. Wiley gets almost 5 million, Springer 3.6 million.

    The increase is not the same for all publishers, Van der Leij stresses, and the packages themselves also vary, making it difficult to compare. ‘On top of that, it’s become increasingly difficult to figure out which parts are “read” and which ones are “publish”.’

    Also telling: the maximum number of prepaid publications is reached sooner every year, because universities get fewer publications for their money. ‘Four years ago, we would be sending out our emails that we’d reached the cap halfway through November. Three years ago, we did so in early November. This year, it was at the end of October’, says Van der Leij.
    Commercialised research

    That’s not the biggest issue, though. What is worse is the other part of Elsevier’s business model, which they came up with when they realised they needed other ways to keep making money. And the hottest commodity they could think of was data.

    ‘The entire infrastructure that science builds on is commercialised’, says YAG chairperson Lukas Linsi. ‘They effectively turn science into shareholder returns and dividend payouts, and it’s all public money.’

    Pure, which showcases the research of almost all Dutch universities, used to be an independent Danish startup, but was bought by Elsevier in 2012. The formerly open platform Mendeley was acquired in 2013, SSRN in 2016. In 2017, Elsevier bought bepress, a repository used by six hundred academic institutions. ‘It has given them real-time data access’, Gstrein says.

    Publishing is no longer the main focus for RELX and its competitors; instead, they have become data brokers. They sell data to insurance companies for risk analysis, to banks for fraud detection, to universities to assess their performance and, most egregious of all, to the US immigration service ICE to target undocumented immigrants.
    Less dependent

    Many researchers are worried by this. ‘In the Netherlands, universities tend to be a bit optimistic regarding these companies’, Stahl feels. After all, universities have in the past made plans to develop ‘professional services’ together with Elsevier. ‘They just don’t seem to see the danger.’

    In Germany and France, there is much more awareness about these issues. There, universities are less dependent on the big publishers, or are working to move away from them. ‘If some private parties have access to all this data, then that is a long-term threat to academic freedom. We have to do something about it. We need our own infrastructure’, Stahl says.

    Per the contracts, the publishers aren’t allowed to share data. ‘There is the data processing agreement’, explains Marijke Folgering, head of the UB’s development & innovation department. That’s a legally required document stating how data will be processed. ‘They’re not allowed to just use our data. I’m sure they can find ways to do so anyway, but we also enter into these contracts on a trust basis. If they do abuse it, they hopefully hurt themselves as well.’
    Critical

    Researcher Taichi Ochi with the Open Science Community Groningen has his doubts about their trustworthiness, though. ‘We need to move away from them, or we risk detrimental effects’, he says.

    Linsi points to the deal that academic publisher Taylor and Francis made: they sold the research published in their three thousand academic journals to Microsoft for 10 million dollars, to train their AI models. ‘This is happening now!’

    Folgering and Van der Leij with the UB also worry about the seemingly unending stream of data that is flowing towards the publishers. ‘There are currently no indications that the system is being abused’, says Van der Leij, ‘but we’re getting increasingly concerned.’

    ‘We’re definitely critical of what they’re doing’, Folgering agrees. ‘We’re exploring our options. Several German universities have gone in a different direction. But there are limits to what we can do. We simply don’t have that many developers.’

    The problem, of course, is that both researchers and university management want convenience. They want their publications in these publishers’ prestigious distribution channels. They want their tools and software to work quickly, and it’s all the better if these are available at a relatively low cost. ‘But we just don’t consider what that means in the long run. People underestimate how little choice we still have’, Gstrein says.
    Long-standing reputation

    The researchers don’t have an easy solution on hand. ‘If you want to move ahead in your career, you’re dependent on these companies’, Linsi realises. ‘They don’t decide what is published in their journals, but still, it’s their brands that are really important if you want to move up.’

    Diamond open access journals – like the UG’s own University of Groningen Press, founded in 2015 – may be a solution in the long term, but at this point their reputation just isn’t good enough yet, compared to journals with a long-standing reputation and impact factor.

    The tools the publishers provide do work very well, Linsi admits. Repositories in for example Germany – where universities are a lot less dependent on the big publishers – aren’t nearly as ‘attractive’ as the UG’s Pure.

    And there’s the matter of safety too. Are universities able to build alternatives that are safe and that won’t be vulnerable to hacks? ‘In practice, this is quite difficult’, Linsi says. ‘But other countries show that there is a way back.’
    Alternatives

    The UG could start by using alternatives when possible, he explains. Zotero instead of Mendeley, Firefox instead of Google Chrome. ‘There are alternatives for almost every app we use.’

    And it could – and should – find an alternative for Pure. ‘It’s a good first step’, Linsi feels. ‘It’s relatively easy and it is tangible.’

    In fact, Van der Leij says, the UG is currently working on its own data warehouse that would hold all the UG publications’ data and metadata. ‘It might allow us to stop using Pure and keep a hold of our data.’

    But it would be even better if Dutch – or even European – universities worked together on projects like these, to make sure there’s enough funding and that it is done right. In the long run, Linsi believes, it will probably be cheaper than paying huge sums of money to commercial providers.

    ‘We must understand our own worth’, agrees Taichi Ochi. ‘With the cost of publishing ever increasing, it also impacts how much money we can spend on other activities. We need to move away from a model that is draining money.’

    https://ukrant.nl/magazine/elseviers-stranglehold-on-academia-how-publishers-get-rich-from-our-data

    #science #recherche #université #données #édition_scientifique #publications #publications_scientifiques #Elsevier #business #données_personnelles #Stahl #RELX #Springer #Wiley #Gold_open_access

  • Xavier writes to me

    This attack resulted in unauthorised access to part of the personal data associated with your subscriber account: surname, first name, email and postal addresses, date and place of birth, telephone number, subscriber ID, and contractual data (type of plan subscribed to, subscription date, whether the subscription is active or not).

    #free #sécurité #informatique #données_personnelles #vol_de_données

    It always makes me laugh when they then spin you the line that the CNIL has been notified. What I especially appreciate is that, with the data of several million Free subscribers at stake, the penalty for reselling our data is just a drop in the ocean.

    This attack has been reported to the Commission nationale de l’informatique et des libertés (CNIL) and to the Agence nationale de la sécurité des systèmes d’information (ANSSI). A criminal complaint has also been filed with the public prosecutor. The perpetrator of this offence faces a penalty of 5 years in prison and a €150,000 fine.

    Thanks, Xavier. And of course I haven’t forgotten your friend Emmanuel, who handed our medical data over to Microsoft – no doubt just as well secured as your subscribers’ data.

    • €150,000 for 19 million subscribers: to the CNIL we’re not worth much. That comes to less than €0.008 per subscriber – not even a cent.

    • There is currently a clearance sale on customers’ and users’ personal data: Boulanger, Truffaut, Cultura, SFR, l’Assurance retraite and Meilleurtaux

      How come you don’t have your loyalty card yet, complete with your purchase history, for some nice customer profiling?

    • Hello. In France there’s a lot of paranoia about sharing IBANs, which I’ve noticed in various situations. I can’t quite figure out why. Here, you give out your number for just about everything: to friends so they can transfer you money, and associations and individuals put it on their websites, etc. That’s because money can only be paid into it, and there are no chequebooks here (they’re considered archaic).

      In the case of this data leak, what use could the IBANs be? It seems to me that the identity data (date of birth, name, postal and email addresses) is more sensitive, no?

    • ‘Put up for sale by the hacker, the personal data of 19.2 million of the operator’s customers was reportedly bought for 175,000 dollars.’ ‘The telephone operator’s customer data file was reportedly purchased for the sum of 175,000 dollars (around 160,000 euros).’ Pure profit.

  • LinkedIn fined 310 million euros following our collective complaint
    https://www.laquadrature.net/2024/10/25/linkedin-condamnee-a-310-millions-deuros-suite-a-notre-plainte-collect

    After Google and Amazon, it is now Microsoft’s turn to be fined for violating the law on #Données_personnelles. Yesterday, the Irish data protection authority handed Microsoft a fine of 310 million…

  • Objecting to the German electronic patient record (elektronische Patientenakte, ePA)
    https://widerspruch-epa.de

    Here is a way to prevent the misuse of your personal medical information. From January 2025, German health insurers will allow themselves to store your medical data on their servers and resell it. Until now, this data has existed only with your individual doctors and in your own records. If you would rather keep it that way, you have to object to the central storage.

    This website, run by the association Patientenrechte und Datenschutz e.V., helps you write the necessary letter. Do take care to fill in their forms with dummy data, though; you can replace it before sending the letter of objection to your insurer.

    Welcome to our website! We are an alliance committed to protecting your personal medical data. Our goal is for you to keep control over your data. That is why we offer our objection generator for opting out of the electronic patient record (ePA).

    The ePA, stored on central servers, was introduced in 2021 as a voluntary option. Demand for it was low. From January 2025, all statutory health insurance members who do not object will automatically receive an ePA. In addition, all care providers will be obliged to fill the ePA with their patients’ treatment data.

    The ePA is promoted as an important instrument for improving medical care. However, it has several serious weaknesses which, in our view, make an objection necessary in order to protect extremely sensitive personal medical data.

    This is where the objection (opt-out) comes in. Opting out means that your data will not be stored in the ePA. Our generator helps you with this. It is easy to use and takes only a few steps.

    The objection (opt-out) does not affect your medical care. Your doctors and psychotherapists will continue to store the necessary information in their in-practice records in order to offer you the best possible diagnosis and support.

    We hope our service helps you make an informed decision about your medical data.

    Thank you for placing your trust in us.

    You can find further information about the ePA and the objection under ‘Häufig gestellte Fragen (FAQ)’ (frequently asked questions).

    #Allemagne #vie_privée #données #iatrocratie

  • The world’s rivers faced the driest year in three decades in 2023, the UN weather agency says

    The U.N. weather agency is reporting that 2023 was the driest year in more than three decades for the world’s rivers, as the record-hot year underpinned a drying up of water flows and contributed to prolonged droughts in some places.

    The World Meteorological Organization also says glaciers that feed rivers in many countries suffered the largest loss of mass in the last five decades, warning that ice melt can threaten long-term water security for millions of people globally.

    “Water is the canary in the coalmine of climate change. We receive distress signals in the form of increasingly extreme rainfall, floods and droughts which wreak a heavy toll on lives, ecosystems and economies,” said WMO Secretary-General Celeste Saulo, releasing the report on Monday.

    She said rising temperatures had in part led the hydrological cycle to become “more erratic and unpredictable” in ways that can produce “either too much or too little water” through both droughts and floods.

    The “State of Global Water Resources 2023” report covers rivers and also lakes, reservoirs, groundwater, soil moisture, terrestrial water storage, snow cover and glaciers, and the evaporation of water from land and plants.

    The weather agency, citing figures from UN Water, says some 3.6 billion people face inadequate access to water for at least one month a year — and that figure is expected to rise to 5 billion by 2050. WMO says 70% of all the water that humans draw from the hydrological systems goes into agriculture.

    The world faced the hottest year on record in 2023, and the summer of this year was also the hottest summer ever — raising warning signs for a possible new annual record in 2024.

    “In the (last) 33 years of data, we had never such a large area around the world which was under such dry conditions,” said Stefan Uhlenbrook, director of hydrology, water and cryosphere at WMO.

    The report said the southern United States, Central America and South American countries Argentina, Brazil, Peru and Uruguay faced widespread drought conditions and “the lowest water levels ever observed in Amazon and in Lake Titicaca,” on the border between Peru and Bolivia.

    The Mississippi River basin also experienced record-low water levels, the report said. WMO said half of the world faced dry river flow conditions last year.

    The data for 2024 isn’t in yet, but Uhlenbrook said the extremely hot summer is “very likely” to translate into low river flows this year, and “in many parts of the world, we expect more water scarcity.”

    Low-water conditions have disrupted river navigation in places like Brazil and contributed to a food crisis in Zimbabwe and other parts of southern Africa this year.

    WMO called for improvements in data collection and sharing to help clear up the real picture for water resources and help countries and communities take action in response.

    https://apnews.com/article/water-united-nations-world-meteorological-organization-86183afa4d917fe9777f7

    #rivières #sécheresse #rapport #statistiques #données #monde

  • #Data_center emissions probably 662% higher than big tech claims. Can it keep up the ruse?

    Emissions from in-house data centers of #Google, #Microsoft, #Meta and #Apple may be 7.62 times higher than official tally.

    Big tech has made some big claims about greenhouse gas emissions in recent years. But as the rise of artificial intelligence creates ever bigger energy demands, it’s getting hard for the industry to hide the true costs of the data centers powering the tech revolution.

    According to a Guardian analysis, from 2020 to 2022 the real emissions from the “in-house” or company-owned data centers of Google, Microsoft, Meta and Apple are probably about 662% – or 7.62 times – higher than officially reported.
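    The two figures in that sentence are the same quantity expressed two ways: something ‘662% higher’ than a baseline is 1 + 662/100 = 7.62 times the baseline. A one-line sanity check:

```python
# Convert "X percent higher" into "N times the baseline": N = 1 + X/100.
pct_higher = 662
times_baseline = 1 + pct_higher / 100
print(times_baseline)  # 7.62
```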

    Amazon is the largest emitter of the big five tech companies by a mile – the emissions of the second-largest emitter, Apple, were less than half of Amazon’s in 2022. However, Amazon has been kept out of the calculation above because its differing business model makes it difficult to isolate data center-specific emissions figures for the company.

    As energy demands for these data centers grow, many are worried that carbon emissions will, too. The International Energy Agency stated that data centers already accounted for 1% to 1.5% of global electricity consumption in 2022 – and that was before the AI boom began with ChatGPT’s launch at the end of that year.

    AI places far heavier energy demands on data centers than typical cloud-based applications do. According to Goldman Sachs, a ChatGPT query needs nearly 10 times as much electricity to process as a Google search, and data center power demand will grow 160% by 2030. Research by Goldman competitor Morgan Stanley has reached similar findings, projecting that global data center emissions will accumulate to 2.5bn metric tons of CO2 equivalent by 2030.

    In the meantime, all five tech companies have claimed carbon neutrality, though Google dropped the label last year as it stepped up its carbon accounting standards. Amazon is the most recent company to do so, claiming in July that it met its goal seven years early, and that it had implemented a gross emissions cut of 3%.

    “It’s down to creative accounting,” explained a representative from Amazon Employees for Climate Justice, an advocacy group composed of current Amazon employees who are dissatisfied with their employer’s action on climate. “Amazon – despite all the PR and propaganda that you’re seeing about their solar farms, about their electric vans – is expanding its fossil fuel use, whether it’s in data centers or whether it’s in diesel trucks.”
    A misguided metric

    The most important tools in this “creative accounting” when it comes to data centers are renewable energy certificates, or Recs. These are certificates that a company purchases to show it is buying renewable energy-generated electricity to match a portion of its electricity consumption – the catch, though, is that the renewable energy in question doesn’t need to be consumed by a company’s facilities. Rather, the site of production can be anywhere from one town over to an ocean away.

    Recs are used to calculate “market-based” emissions, or the official emissions figures used by the firms. When Recs and offsets are left out of the equation, we get “location-based emissions” – the actual emissions generated from the area where the data is being processed.

    The trend in those emissions is worrying. If these five companies were one country, the sum of their “location-based” emissions in 2022 would rank them as the 33rd highest-emitting country, behind the Philippines and above Algeria.

    Many data center industry experts also recognize that location-based metrics are more honest than the official, market-based numbers reported.

    “Location-based [accounting] gives an accurate picture of the emissions associated with the energy that’s actually being consumed to run the data center. And Uptime’s view is that it’s the right metric,” said Jay Dietrich, the research director of sustainability at Uptime Institute, a leading data center advisory and research organization.

    Nevertheless, Greenhouse Gas (GHG) Protocol, a carbon accounting oversight body, allows Recs to be used in official reporting, though the extent to which they should be allowed remains controversial between tech companies and has led to a lobbying battle over GHG Protocol’s rule-making process between two factions.

    On one side there is the Emissions First Partnership, spearheaded by Amazon and Meta. It aims to keep Recs in the accounting process regardless of their geographic origins. In practice, this is only a slightly looser interpretation of what GHG Protocol already permits.

    The opposing faction, headed by Google and Microsoft, argues that there needs to be time-based and location-based matching of renewable production and energy consumption for data centers. Google calls this its 24/7 goal, or its goal to have all of its facilities run on renewable energy 24 hours a day, seven days a week by 2030. Microsoft calls it its 100/100/0 goal, or its goal to have all its facilities running on 100% carbon-free energy 100% of the time, making zero carbon-based energy purchases by 2030.

    Google has already phased out its Rec use and Microsoft aims to do the same with low-quality “unbundled” (non location-specific) Recs by 2030.

    Academics and carbon management industry leaders alike are also against the GHG Protocol’s permissiveness on Recs. In an open letter from 2015, more than 50 such individuals argued that “it should be a bedrock principle of GHG accounting that no company be allowed to report a reduction in its GHG footprint for an action that results in no change in overall GHG emissions. Yet this is precisely what can happen under the guidance given the contractual/Rec-based reporting method.”

    To GHG Protocol’s credit, the organization does ask companies to report location-based figures alongside their Rec-based figures. Despite that, no company includes both location-based and market-based metrics for all three subcategories of emissions in the bodies of their annual environmental reports.

    In fact, location-based numbers are only directly reported (that is, not hidden in third-party assurance statements or in footnotes) by two companies – Google and Meta. And those two firms only include those figures for one subtype of emissions: scope 2, or the indirect emissions companies cause by purchasing energy from utilities and large-scale generators.
    In-house data centers

    Scope 2 is the category that includes the majority of the emissions that come from in-house data center operations, as it concerns the emissions associated with purchased energy – mainly, electricity.

    Data centers should also make up a majority of overall scope 2 emissions for each company except Amazon, given that the other sources of scope 2 emissions for these companies stem from the electricity consumed by firms’ offices and retail spaces – operations that are relatively small and not carbon-intensive. Amazon has one other carbon-intensive business vertical to account for in its scope 2 emissions: its warehouses and e-commerce logistics.

    For the firms that give data center-specific data – Meta and Microsoft – this holds true: data centers made up 100% of Meta’s market-based (official) scope 2 emissions and 97.4% of its location-based emissions. For Microsoft, those numbers were 97.4% and 95.6%, respectively.

    The huge differences in location-based and official scope 2 emissions numbers showcase just how carbon intensive data centers really are, and how deceptive firms’ official emissions numbers can be. Meta, for example, reports its official scope 2 emissions for 2022 as 273 metric tons CO2 equivalent – all of it attributable to data centers. Under the location-based accounting system, that number jumps to more than 3.8m metric tons of CO2 equivalent for data centers alone – an increase of roughly 14,000 times.

    A similar result can be seen with Microsoft. The firm reported its official data center-related emissions for 2022 as 280,782 metric tons CO2 equivalent. Under a location-based accounting method, that number jumps to 6.1m metric tons CO2 equivalent. That’s a nearly 22 times increase.
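    The multiples quoted here are plain ratios of the location-based figure to the officially reported one; a quick check of the Microsoft numbers above:

```python
# Microsoft 2022 data center-related scope 2 emissions, metric tons CO2e,
# as quoted in the article.
market_based = 280_782      # official (market-based) figure
location_based = 6_100_000  # location-based figure

multiple = location_based / market_based
print(round(multiple, 1))   # 21.7 -> "nearly 22 times"

# The same gap expressed as "percent higher", the article's other convention:
pct_higher = (location_based - market_based) / market_based * 100
print(round(pct_higher))    # roughly 2,070% higher
```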

    While Meta’s reporting gap is more egregious, both firms’ location-based emissions are higher because they undercount their data center emissions specifically, with 97.4% of the gap between Meta’s location-based and official scope 2 number in 2022 being unreported data center-related emissions, and 95.55% of Microsoft’s.

    Specific data center-related emissions numbers aren’t available for the rest of the firms. However, given that Google and Apple have similar scope 2 business models to Meta and Microsoft, it is likely that the multiple on how much higher their location-based data center emissions are would be similar to the multiple on how much higher their overall location-based scope 2 emissions are.

    In total, the sum of location-based emissions in this category between 2020 and 2022 was at least 275% higher (or 3.75 times) than the sum of their official figures. Amazon did not provide the Guardian with location-based scope 2 figures for 2020 and 2021, so its official (and probably much lower) numbers were used for this calculation for those years.
    Third-party data centers

    Big tech companies also rent a large portion of their data center capacity from third-party data center operators (or “colocation” data centers). According to the Synergy Research Group, large tech companies (or “hyperscalers”) represented 37% of worldwide data center capacity in 2022, with half of that capacity coming through third-party contracts. While this group includes companies other than Google, Amazon, Meta, Microsoft and Apple, it gives an idea of the extent of these firms’ activities with third-party data centers.

    Those emissions should theoretically fall under scope 3: the category covering all emissions a firm is responsible for that can’t be attributed to the fuel or electricity it consumes.

    When it comes to a big tech firm’s operations, this would encapsulate everything from the manufacturing processes of the hardware it sells (like the iPhone or Kindle) to the emissions from employees’ cars during their commutes to the office.

    When it comes to data centers, scope 3 emissions include the carbon emitted during the construction of in-house data centers, as well as during the manufacturing of the equipment used inside them. For third-party data centers a firm partners with, scope 3 may also cover those same construction and manufacturing emissions, plus the facilities’ electricity-related emissions.

    However, whether or not these emissions are fully included in reports is almost impossible to prove. “Scope 3 emissions are hugely uncertain,” said Dietrich. “This area is a mess just in terms of accounting.”

    According to Dietrich, some third-party data center operators put their energy-related emissions in their own scope 2 reporting, so those who rent from them can put those emissions into their scope 3. Other third-party data center operators put energy-related emissions into their scope 3 emissions, expecting their tenants to report those emissions in their own scope 2 reporting.
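    The mismatch Dietrich describes can be made concrete with a toy model; the two conventions are the ones described above, while the tonnage and dictionary keys are purely illustrative:

```python
# Electricity emissions of one rented (colocation) data center hall, in tCO2e.
COLO_ENERGY = 50_000

def booked(convention):
    """Where the hall's electricity emissions land under each convention."""
    if convention == "operator_scope2":
        # Operator books the energy in its own scope 2;
        # the tenant is expected to report it under scope 3.
        return {"operator_s2": COLO_ENERGY, "tenant_s2": 0, "tenant_s3": COLO_ENERGY}
    if convention == "tenant_scope2":
        # Operator pushes the energy into its scope 3 and
        # expects the tenant to book it as scope 2.
        return {"operator_s2": 0, "tenant_s2": COLO_ENERGY, "tenant_s3": 0}
    raise ValueError(convention)

# If the operator assumes one convention and the tenant assumes the other,
# the same 50,000 tonnes can be counted twice, or not at all.
```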

    Additionally, all firms use market-based metrics for these scope 3 numbers, which means third-party data center emissions are also undercounted in official figures.

    Of the firms that report their location-based scope 3 emissions in the footnotes, only Apple has a large gap between its official scope 3 figure and its location-based scope 3 figure.

    This is the only sizable reporting gap that is not data center-related: the majority of Apple’s scope 3 gap is due to Recs (renewable energy certificates) being applied towards emissions associated with the manufacturing of hardware (such as the iPhone).

    Apple does not include transmission and distribution losses or third-party cloud contracts in its location-based scope 3. It only includes those figures in its market-based numbers, under which its third-party cloud contracts report zero emissions (offset by Recs). Therefore, in both of Apple’s total emissions figures, location-based and market-based, the actual emissions associated with its third-party data center contracts are nowhere to be found.

    2025 and beyond

    Even though big tech hides these emissions, they are due to keep rising. Data centers’ electricity demand is projected to double by 2030 due to the additional load that artificial intelligence poses, according to the Electric Power Research Institute.

    Google and Microsoft both blamed AI for their recent upticks in market-based emissions.

    “The relative contribution of AI computing loads to Google’s data centers, as I understood it when I left [in 2022], was relatively modest,” said Chris Taylor, current CEO of utility storage firm Gridstor and former site lead for Google’s data center energy strategy unit. “Two years ago, [AI] was not the main thing that we were worried about, at least on the energy team.”

    Taylor explained that most of the growth that he saw in data centers while at Google was attributable to growth in Google Cloud, as most enterprises were moving their IT tasks to the firm’s cloud servers.

    Whether today’s power grids can withstand the growing energy demands of AI is uncertain. One industry leader – Marc Ganzi, the CEO of DigitalBridge, a private equity firm that owns two of the world’s largest third-party data center operators – has gone as far as to say that the data center sector may run out of power within the next two years.

    And as grid interconnection backlogs continue to pile up worldwide, it may be nearly impossible for even the most well-intentioned of companies to get new renewable energy production capacity online in time to meet that demand.

    https://www.theguardian.com/technology/2024/sep/15/data-center-gas-emissions-tech
    #données #émissions #mensonge #ChatGPT #AI #IA #intelligence_artificielle #CO2 #émissions_de_CO2 #centre_de_données

  • From 2025: fines for households not equipped with Linky meters
    https://ricochets.cc/Des-2025-racket-des-foyers-non-equipes-de-compteurs-Linky-7829.html

    In 2025, fines (disguised as “management fees”) are promised for households that have refused the Linky up to now. It is stated that only technical impossibilities will be exempt from the fine. Further down, a few reminders of the fact that personal consumption data can be of interest to the police. The amount at stake is around €60/year! A high price designed to apply pressure, which apparently does not correspond to their actual costs (sending emails and maintaining a page (...) #Les_Articles

    / #Technologie, #Fichage,_contrôle_et_surveillance

    https://www.clubic.com/electricite/actualite-453087-linky-ne-sera-finalement-pas-obligatoire-enedis-confirme.ht
    https://www.lesechos.fr/industrie-services/energie-environnement/compteurs-linky-une-bonne-affaire-pour-enedis-130494
    https://www.quechoisir.org/billet-du-president-linky-les-consommateurs-financent-bien-le-deploiemen
    https://www.quechoisir.org/actualite-compteur-linky-la-cour-des-comptes-tres-critique-n51752
    https://www.inc-conso.fr/content/compteur-linky-et-donnees-personnelles
    https://www.hellowatt.fr/suivi-consommation-energie/compteur-linky/donnees-personnelles-protection
    https://www.dalloz-actualite.fr/flash/compteurs-linky-cnil-met-en-demeure-engie-et-edf-pour-des-manquemen

    • Enedis assures that the relatively detailed electricity consumption data (load curve) it receives can be kept confidential if the customer ticks (or does not tick) the box, but one can well imagine that the police will get access to this made-to-measure data.
      Refuse the recording of the “load curve” so as not to hand over detailed personal consumption data
      To guard against this problem it seems possible, for now, to refuse the transmission of these consumption #données to #Enedis and, better still, to opt out of the recording of this data altogether (only the monthly total would then be recorded).

      Claimants liable to CAF checks should be advised to refuse the recording of this data. Inspection agents already comb through consumption histograms to check how long those under scrutiny spend each year in their main residence, a duration on which their entitlement to benefits depends.

      #linky

  • FakeYou Text
    https://aichief.com/ai-audio-tools/fakeyou

    FakeYou is an AI-powered platform that specializes in converting text into speech and transforming voice recordings into different voices using advanced deepfake technology. The platform offers a range of services, including text-to-speech (TTS), voice-to-voice conversion, and video lip-syncing. In addition, you can input text or audio and choose from a wide selection of voices, such […]

    #AI_Audio_Tools #AI_Web_App #Review

  • A dead end for voices? Amazon makes AI audio narration official
    https://actualitte.com/article/119155/audiolivres/voix-sans-issue-amazon-officialise-une-narration-audio-par-l-ia

    Amazon’s proposal to narrators reeks of a scam, like every Amazon proposal (what Cory Doctorow calls “enshittification”).

    The improvement of text-to-speech tools, thanks to the possibilities of artificial intelligence, is attracting the interest of several players in the audiobook business, with Amazon and its subsidiary Audible leading the way: after developing a solution aimed at self-published authors, the multinational is inaugurating an offer aimed at the narrators themselves. It proposes that these professionals “clone” their voices, to turn them into AI tools, in exchange for payment.

    Published on 11/09/2024 at 11:08, by Antoine Oury (ActuaLitté)

    Amazon and Audible are exploiting the possibilities of artificial intelligence technologies more openly, with the opening of a new program aimed at professional narrators. In the United States only, the firm is offering to “clone” their voices, so that these voices can then be used to generate audiobooks.

    In other words, interested voice professionals will take part in training Audible’s artificial intelligence, which will then take over, reproducing timbre, intonation and reading rhythm across a whole variety of texts.

    As compensation, whenever the AI-reproduced voice is used to read a text, its owner will be paid a share of the revenue generated; Amazon has not yet detailed the percentage.
    A test phase

    Unveiled on the blog of ACX (Audiobook Creation Exchange, Amazon’s marketplace for audiobook creation), the operation remains very restricted for now, reserved for a small number of professional narrators.

    Through this program, “participants can develop their capacity to produce high-quality audiobooks, generate new business by taking on more projects simultaneously, and increase their income”, the multinational promises. According to the post, participating narrators, even once their voice has been “cloned” by the AI, will retain control over the texts they “read” artificially.


    The narrators will also be asked, using the tools provided by ACX, to check the quality of the AI’s reading, and even to correct any errors it makes. Having their voice reproduced by the artificial intelligence will be entirely free for interested narrators, Amazon stresses.
    Diversifying the catalog

    ACX has long worked with professional narrators, connecting them with authors, publishers and producers looking to create and sell audiobooks. Speeding up production with the help of artificial intelligence gives the firm an opportunity to expand its catalog of available titles a little further.


    Facing competition from Spotify and other market players, Amazon intends to do for audiobooks what it achieved for self-publishing: become an unavoidable platform by offering the largest number of titles.

    Self-published authors already have the option of generating an audiobook using artificial intelligence, relying on text-to-speech; a fully generated voice, that is, which does not necessarily imitate an existing one. Last May, Amazon cited the figure of 40,000 audiobooks generated automatically through this program. Titles which, produced with Amazon’s tools, will above all remain sold by the firm...

    Photograph: illustration, murdelta, CC BY 2.0

    #Amazon #Livre_audio #Emmerdification #Voix_clonée

  • Microsoft’s Recall Feature on Windows 11 Not Removable After All
    https://digitalmarketreports.com/news/25091/microsoft-recall-feature-on-windows-11-not-removable-after-all

    Microsoft has confirmed that Windows 11 users will not be able to uninstall the controversial “Recall” feature, despite earlier reports suggesting otherwise. Recall, part of the Copilot+ suite announced in May, automatically captures screenshots of user activity on the operating system, ostensibly to help users easily retrieve past work.

    Yes, they are taking us for fools when they say they have heard us and that, fine, OK, they will remove their features that undermine security and privacy.

  • What is AI? Digital illusions, false promises and mass re-education. Brandon Smith, Alt-Market

    Over the past five years, the concept of artificial intelligence has been the subject of great fanfare, to the point that its primacy is taken for granted in the media. The idea that algorithms can “think” has become a pervasive myth, a science-fiction fantasy come to life. The reality is far less impressive...

    
    The globalists of the World Economic Forum and other elitist institutions keep telling us that AI is the catalyst of the “fourth industrial revolution”, a technological singularity supposedly destined to change every aspect of our society forever. I am still waiting for the moment when AI does something significant to advance human knowledge or improve our lives. That moment never comes. In fact, the globalists keep moving the goalposts on what AI really is.


    I note that WEF zealots like Yuval Harari talk about AI as if it were the advent of an all-powerful divinity (I discuss the globalist cult of AI in my article “Artificial Intelligence: A Secular Look at the Digital Antichrist”). Yet Harari has recently downplayed AI’s importance as a sentient intelligence. He argues that it does not need to reach self-awareness to be considered a super-being or a living entity. He even suggests that the popular image of a Terminator-like AI endowed with individual power and desire is not a legitimate expectation.

    In other words, AI as it exists today is nothing more than a mindless algorithm, and therefore it is not AI. But if every aspect of our world is designed around digital infrastructure, and the population is taught to put blind faith in the “infallibility” of algorithms, then we will end up creating the robot gods the globalists are calling for. In other words, AI dominance is only possible if everyone BELIEVES that AI is legitimate. Harari essentially admits to this agenda in the speech above.

    AI’s appeal for ordinary people lies in the promise of being freed from all worry and responsibility. Like all narcissists, the globalist elite loves to fake the future and buy popular compliance by promising rewards that will never come.

    Yes, algorithms are currently used to help laypeople do things they could not do before, such as building websites, editing essays, cheating on university exams and creating bad artwork and video content. Useful applications are rare. For instance, the claim that AI is “revolutionizing” medical diagnosis and treatment is far-fetched. The United States, arguably the country with the most access to AI tools, is also suffering from declining life expectancy. We know it is not Covid, since the virus has an average survival rate of 99.8%. You would think that if AI were so powerful in its ability to identify and treat disease, the average American would be living longer.

    There is no evidence of any unique benefit of AI at a broader social scale. At most, it seems set to eliminate jobs for web developers and McDonald’s drive-thru employees. The globalist idea that AI is going to create a robotic renaissance of art, music, literature and scientific discovery is utterly absurd. AI has proven to be nothing more than a tool of mediocre convenience, but that is precisely why it is so dangerous.

    I suspect the WEF has changed its ideas about what AI should be because AI is not living up to the delusional aspirations they originally had for it. They were waiting for a piece of software to come alive and start giving them insights into the mechanics of the universe, and they are starting to realize that this will never happen. Instead, the elitists are focusing more and more on merging the human world with the digital one. They want to manufacture the necessity of AI, because human dependence on technology serves the goals of centralization.
    
    But what would that actually look like? Well, it requires the population to keep getting dumber while AI becomes ever more integrated into society.

    For example, it is now widely accepted that a university education is no guarantee of intelligence or competence. Millions of graduates entering the job market today display a baffling level of incompetence. This is partly because teachers are less skilled, partly because of their ideological biases, and partly because the average curriculum has degraded. But we must also start accounting for the number of children getting through school using ChatGPT and other cheating tools.

    They do not need to learn anything; the algorithm and their phone’s camera do it all for them. This trend is worrying, because human beings tend to take the easiest path in every aspect of survival. Most people stopped learning how to grow their own food because industrial agriculture does it for us. They stopped learning how to hunt because there are slaughterhouses and refrigerated trucks.

    Today, many Zennials are incapable of cooking for themselves because they can have takeout delivered to their door at any time. They hardly ever talk on the phone anymore and no longer build physical communities, because texting and social media have become the intermediaries of human interaction.

    Yes, everything is “easier”, but that does not mean everything is better.

    My great fear: the future I see looming is one in which human beings no longer bother to think. AI could be seen as the ultimate accumulation of human knowledge; a massive library or digital brain that does all the research and thinking for you. Why learn anything when the AI “knows everything”? But that is a lie.

    AI does not know everything; it knows only what its programmers want it to know. It gives you only the information its programmers want you to have. The globalists have understood this well, and they can sense the power they will hold if AI becomes a leading educational platform. They see it as a way of getting people to abandon personal development and individual thought.

    Look at it this way: if everyone starts turning to AI for answers to all their questions, then everyone will receive exactly the same answers and reach exactly the same conclusions. All the AI has to do is actively censor any information that contradicts the official narrative.

    We got a glimpse of this Orwellian situation during the Covid pandemic, when high-tech companies like Google used algorithms to bury any data proving that the Covid crisis was not the threat government authorities claimed it was. For at least three years, it was impossible to go on YouTube and find alternative information about the virus or the vaccines. The algorithm forced everyone to sift through a long list of official sources, many of which peddled blatant lies about masking, social distancing, the Covid death rate and vaccine safety.

    The powers that be do not even need to directly censor or delete the information they dislike. All they have to do is let the algorithm dictate search results and bury the truth on page 10,000, where nobody will look for it.

    What would the impact be on the average citizen? Suppose AI were programmed to dictate scientific discourse. What if the AI said that man-made climate change is an undeniable reality and that “the science is settled”, without ever presenting the mountain of contrary evidence? Nobody would look for the real data, because the AI would make it impossible to find. Everyone would assume the AI is telling them everything there is to know about the subject, but it gets worse...

    Many readers may remember that a few months ago, Google’s “Gemini” AI system was programmed to impose DEI on its users https://www.theverge.com/2024/2/21/24079371/google-ai-gemini-generative-inaccurate-historical . Whenever someone asked the AI to create a historical image, the algorithm made everyone black or brown, and often female. Depictions of white men were strangely rare, despite historical accuracy. That meant endless images of black and brown Highlanders in Scotland, black Founding Fathers in America, female Catholic popes, Asian knights in medieval Europe and even, hilariously, black Nazis in the Germany of the Second World War.

    AI developers often claim that once an AI is created, they no longer really control what it does and how it develops. The “Gemini” incident proves this is a lie. AI can definitely be controlled, or at least shaped through its code, to promote whatever propaganda its programmers want it to promote. There is no such thing as an autonomous AI; there is always an agenda.

    In short, the globalists want AI to proliferate because they know that people are lazy and will use the system as a substitute for individual research. If this happens on a large scale, AI could be used to rewrite every aspect of history, corrupt the very roots of science and mathematics, and turn the population into a drooling hive mind; a buzzing scum of brainless drones consuming every proclamation of the algorithm as if it were sacrosanct.

    In that sense, Yuval Harari is right. AI does not need to become sentient or deploy an army of killer robots to do humanity great harm. It only has to be convenient enough that we no longer feel like thinking for ourselves. Like the “Great and Powerful” Oz hiding behind a digital curtain, you think you are gaining knowledge from a wizard when you are actually being manipulated by globalist snake-oil salesmen.

    Translated by Hervé for Le Saker Francophone

    #Data #Données #IA #AI #Intelligence_Artificielle #High-tech #robotique #algorithme #artificial-intelligence #escroquerie #bidonnage #Manipulation #WEF

    Source and links: https://lesakerfrancophone.fr/quest-ce-que-lia-illusions-numeriques-fausses-promesses-et-reeduc

  • Air Quality #Stripes

    This website shows the concentration of particulate matter air pollution (PM2.5) in cities around the world. Very few historical observations of PM2.5 exist before the year 2000 so instead we use data produced from a mix of computer model simulations and satellite observations.

    For the most recent years (2000-2021) we use a dataset that combines ground-level and satellite observations of PM2.5 concentrations from van Donkelaar et al. (2021, V5, 0.1 degree resolution); this dataset can be found here.

    Satellite observations of PM2.5 aren’t available for the years before 1998, so instead we take the historical trend in air pollution concentrations from computer models (Turnock 2020). Publicly available model data were taken from the Coupled Model Intercomparison Project (CMIP6), which is made freely available via the Earth System Grid Federation (ESGF); these are the climate models used for the IPCC assessment reports. We used data from the UKESM submission to CMIP6 (data is here). The historical concentrations for the UKESM model are calculated using changes in air pollutant emissions obtained from the Community Emissions Data System (CEDS) inventory, developed by Hoesly et al. (2018) and used as input to the CMIP6 historical experiments.

    Modelling global concentrations of pollutants is very challenging, and models are continuously being evaluated against observations to improve their representation of physical and chemical processes. Previous research has shown that the CMIP6 multi-model simulations tend to underestimate PM2.5 concentrations when compared to global observations (Turnock et al., 2020). To address this issue and to ensure a smooth time series between the model and satellite data, we take the following steps: for each city, we first calculate a three-year (2000-2002) mean of the satellite data for that city. Next, we calculate the three-year (2000-2002) mean of model concentrations for the same city. The ratio between these values represents the model’s bias compared to observations. We then adjust (or “weight”) the model values using this ratio. This is a similar approach to that taken by Turnock et al. (2023) and Reddington et al. (2023).
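    The weighting step described above boils down to scaling every model year by one city-specific ratio computed over the 2000-2002 overlap; a minimal sketch with illustrative PM2.5 values (µg/m³), not real UKESM or satellite numbers:

```python
# Annual city-mean PM2.5, in ug/m3 (illustrative values only).
satellite = {2000: 12.1, 2001: 11.8, 2002: 12.4}                 # observations
model = {1950: 6.0, 1975: 9.5, 2000: 8.1, 2001: 7.9, 2002: 8.3}  # CMIP6 model

def mean(years, series):
    return sum(series[y] for y in years) / len(years)

overlap = (2000, 2001, 2002)

# Ratio of observed to modelled concentrations over the overlap
# period = the model's bias for this city.
ratio = mean(overlap, satellite) / mean(overlap, model)

# Weight every model year by that ratio, so the model-derived historical
# trend joins smoothly onto the satellite record.
corrected = {year: value * ratio for year, value in model.items()}
```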

    Because so few historical observations of PM2.5 exist, it is challenging to evaluate how good this approximation is; but in our approach the historical trend is taken from the computer model while the absolute values are informed by the satellite data.

    This is the first version of the Air Quality Stripes; the images will be updated in the future as improved model simulations and observations become available. We welcome comments and suggestions for improvements!

    The data used to create these images is here: https://zenodo.org/records/13361899

    https://airqualitystripes.info

    #qualité_de_l'air #visualisation #données #statistiques #air #pollution_de_l'air #pollution #villes

    ping @reka via @freakonometrics

  • As if everything else were not already enough (for a small overview, you can stay on seenthis: https://seenthis.net/tag/elsevier), I now discover that:
    Scientists : Elsevier has a shocking amount of data about you.
    https://fediscience.org/@ct_bergstrom/113010261685808797

    –—

    Welcome to Hotel Elsevier: you can check-out any time you like … not

    In December 2021, Robin Kok wrote a series of tweets about his Elsevier data access request. I did the same a few days later. This here is the resulting collaborative blog post, summarizing our journey in trying to understand what data Elsevier collects; what data Elsevier has collected on us two specifically; and trying to get this data deleted. A PDF version of this blog post is also available.

    Elsevier, data kraken

    Everybody in academia knows Elsevier. Even if you think you don’t, you probably do. Not only do they publish over 2,500 scientific journals, but they also own the citation database Scopus, as well as the ScienceDirect collection of electronic journals from which you get your papers. That nifty PURE system your university wants you to use to keep track of your publications and projects? You guessed it: Elsevier. And what about that marvelous reference manager, Mendeley? Elsevier bought it in 2013. The list goes on and on.

    But what exactly is Elsevier? We follow the advice of an Elsevier spokesperson: “if you think that information should be free of charge, go to Wikipedia”. Let’s do that! Wikipedia, in their core summary section, introduces Elsevier as “a Netherlands-based academic publishing company specializing in scientific, technical, and medical content.”

    The intro continues:

    And it’s not just rent-seeking. Elsevier admitted to writing “sponsored article compilation publications, on behalf of pharmaceutical clients, that were made to look like journals and lacked the proper disclosures“; offered Amazon vouchers to a select group of researchers to submit five star reviews on Amazon for certain products; manipulated citation reports; and is one of the leading lobbyists against open access and open science efforts. For this, Elsevier’s parent company, RELX, even employs two full-time lobbyists in the European Parliament, feeding “advice” into the highest levels of legislation and science organization. Here is a good summary of Elsevier’s problematic practices—suffice it to say that they’re very good at making profits.

    As described by Wikipedia, one way to make profits is Elsevier’s business as an academic publisher. Academics write articles for Elsevier journals for free and hand over copyright; other academics review and edit these papers for free; and Elsevier then sells these papers back to academics. Much of the labor that goes into Elsevier products is funded by public money, only for Elsevier to sell the finished products back e.g. to university libraries, using up even more public money.

    But in the 2020s—and now we come to the main topic of this piece—there is a second way of making money: selling data. Elsevier’s parent company RELX bills itself as “a global provider of information-based analytics and decision tools for professional and business customers”. And Elsevier itself has been busy with rebranding, too:

    This may sound irrelevant to you as a researcher, but here we show how Elsevier helps them to monetize your data; the amount of data they have on you; and why it will require major steps to change this troubling situation.
    Data access request

    Luckily, folks over at Elsevier “take your privacy and trust in [them] very seriously”, so we used the Elsevier Privacy Support Hub to start an “access to personal information” request. Being in the EU, we are legally entitled under the European General Data Protection Regulation (GDPR) to ask Elsevier what data they have on us, and submitting this request was easy and quick.

    After a few weeks, we both received responses by email. We had been assigned numbers 0000034 and 0000272 respectively, perhaps implying that relatively few people have made use of this system yet. The emails contained several files with a wide range of our data, in different formats. One of the attached excel files had over 700,000 cells of data, going back many years, exceeding 5mb in file size. We want to talk you through a few examples of what Elsevier knows about us.
    They have your data

    To start with, of course they have information we have provided them with in our interactions with Elsevier journals: full names, academic affiliations, university e-mail addresses, completed reviews and corresponding journals, times when we declined review requests, and so on.

    Apart from this, there was a list of IP addresses. Looking these up located one of us in the small city where we live, rather than in the city where our university is based. We also found several personal user IDs, which are likely how Elsevier connects our data across platforms and accounts. We were also surprised to see multiple (correct) private mobile phone numbers and e-mail addresses included.

    And there is more. Elsevier tracks which emails you open, the number of links per email clicked, and so on.

    We also found our personal address and bank account details, probably because we had received a small payment for serving as a statistical reviewer. Those €55 sure came with a larger privacy cost than anticipated.

    Data labelled “Web Traffic via Adobe Analytics” appears to list which websites we visited, when, and from which IP address. “ScienceDirect Usage Data” contains information on when we looked at which papers and what we did on the corresponding website. Elsevier appears to distinguish between downloading or reading the full paper and other types of access, such as looking at a particular image (e.g. “ArticleURLrequestPage”, “MiamiImageURLrequestPage”, and “MiamiImageURLreadPDF”), although this is not entirely clear from the data export. That points to a general issue that will come up again in this piece: Elsevier knows what these data mean, but for us, navigating the export, their meaning was often unclear. In that sense, the usefulness of the current data export is, at least in part, questionable. In the extreme, it’s a bit like asking Google what they know about you and receiving a file full of special characters that mean nothing to you.

    Going back to what data they have, next up: Mendeley. Like many, both of us have used this reference manager for years. For one of us, the corresponding tab in the Excel file from Elsevier contained a whopping 213,000 lines of data, from 2016 to 2022. For the other, although he also used Mendeley extensively for years, the data export contained no Mendeley data whatsoever, a discrepancy for which we could not find an explanation. Elsevier appears to log every time you open Mendeley, along with many other things you do in the software; we found field codes such as “OpenPdfInInternalViewer”, “UserDocumentCreated”, “DocumentAnnotationCreated”, “UserDocumentUpdated”, “FileDownloaded”, and so on.

    They use your data

    Although many of these data points seem relatively innocent at first, they can easily be monetized, because you can extrapolate core working hours, vacation times, and other patterns of a person’s life. This can be understood as detailed information about the workflow of academics – exactly the thing we would want to know if, like Elsevier, our goal was to be a pervasive element in the entire academic lifecycle.
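To make this concrete: even a trivial aggregation of timestamped usage events exposes a person’s daily rhythm. A minimal sketch, with invented event names and timestamps loosely mimicking the Mendeley-style field codes mentioned above (none of this is Elsevier’s actual export format):

```python
# Hypothetical sketch: the event names and timestamps below are invented.
from collections import Counter
from datetime import datetime

events = [
    ("UserDocumentCreated",       "2021-03-01T09:12:00"),
    ("FileDownloaded",            "2021-03-01T14:55:00"),
    ("DocumentAnnotationCreated", "2021-03-02T22:40:00"),
    ("FileDownloaded",            "2021-03-03T10:05:00"),
]

# Bucket activity by hour of day: even this crude histogram sketches
# someone's working hours (and, over months, vacations and habits).
hours = Counter(datetime.fromisoformat(ts).hour for _, ts in events)
print(sorted(hours.items()))
```

With years of events per person, the same few lines would reveal far more than a work schedule.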

    This interest in academic lifecycle data is not surprising, given the role of Elsevier’s parent company RELX as a global provider of information-based analytics and decision tools, as well as Elsevier’s rebranding towards an Information Analytics Business. Collecting data comes at a cost for a company, and it is safe to assume that they wouldn’t gather data if they didn’t intend to do something with it.

    One of the ways to monetize your data is painfully obvious: old-school spam tactics, such as trying to get you to use more Elsevier services by signing you up for newsletters. Many academics receive unending floods of unsolicited emails and newsletters from Elsevier, which is what prompted one of us to file the subject access request in the first place. In the data export, we found a huge list of highly irrelevant newsletters we had unknowingly been subscribed to; for one of us, the corresponding “communications” part of the data has over 5,000 rows.

    You agreed to all of this?

    Well, actually, now that you ask, we don’t quite recall consenting to Mendeley collecting data that could be used to infer information on our working hours and vacation time. After all, with this kind of data, it is entirely possible that Elsevier knows our work schedule better than our employers do. And what about the unsolicited emails that we received even after unsubscribing? For most of these, it’s implausible that we would have consented. In one case, during a single night, at 3:20am, within a single minute, one of us “signed up” to no fewer than 50 newsletters at the same time, nearly all unrelated to our academic discipline.

    Does Elsevier really have our consent for these and other types of data they collected? The data export seems to answer this question, too, with aptly named columns such as “no consent” and “unknown consent”, the 0s and 1s presumably marking “yes” or “no”.
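As an illustration, here is how such consent flags could be filtered. The column names mirror those in the export (“no consent”, “unknown consent”), but the newsletter names and the CSV layout are invented:

```python
# Hypothetical sketch: filter subscriptions whose consent columns are flagged.
# The rows and the CSV structure are made up for illustration.
import csv
import io

export = io.StringIO(
    "newsletter,no consent,unknown consent\n"
    "Chemistry Weekly,1,0\n"
    "Oncology Alerts,0,1\n"
)

# Keep every row that the export's own columns mark as lacking clear consent.
flagged = [
    row["newsletter"]
    for row in csv.DictReader(export)
    if row["no consent"] == "1" or row["unknown consent"] == "1"
]
print(flagged)  # → ['Chemistry Weekly', 'Oncology Alerts']
```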

    You can check-out any time you like…?

    Elsevier knows a lot about us, and the data they sent us in response to our access request may only scratch the surface. Although they sent a large volume of data, the inconsistencies we found (like the missing Mendeley data for one of us) make us doubt whether it is truly all the data they have. What to do? The answer seems straightforward: we can just stop donating our unpaid time and our personal and professional data, right? Indeed, more than 20,000 researchers have already taken a stand against Elsevier’s business practices by openly refusing to publish in, review for, or do editorial work for Elsevier journals.

    But that does not really solve the problem we’re dealing with here. Much of the data Elsevier might monetize is data you cannot really avoid providing as an academic. For example, many of you will access full texts of papers through the ScienceDirect website, which often requires an institutional login. Given that the login is uniquely identifiable, they know exactly which papers you’ve looked at, and when. The same goes for all of the other Elsevier products, some of which we briefly mentioned above, as well as emails. Many emails may be crucial for you (e.g. from an important journal), and Elsevier logs which emails you open and whether you click on links. Sure, this is probably standard marketing practice and Elsevier is not the only company doing it, but it doesn’t change the fact that as an active academic, you basically cannot avoid giving them data they can sell. In fact, just nominating someone for peer review can be enough to get them on their list. Did you ever realize that for most reviews you’re invited to, you never actually consented to being approached by the journal in question?

    Elsevier has created a system in which it seems impossible to avoid giving them your data. Dominating, or at least co-dominating, the market for academic publishing, they exploited the free labor of researchers and charged universities very high fees so that researchers could access scientific papers (which, in part, they had written, reviewed and edited themselves). This pseudo-monopoly made Elsevier non-substitutable, which now enables their transition into a company that sells your data.

    Worse, they state that “personal information that is integral to editorial history will be retained for as long as the articles are being made available”, as they write in the supporting information document on data collection and processing that we received as part of the access request. Exactly which data count as integral to editorial history remains unclear.

    If avoiding interaction with Elsevier is not sustainable under the current infrastructure, perhaps more drastic measures are required. So one of us took the most drastic step available on Elsevier’s privacy hub: a request for deletion of personal information.

    This was also promptly handled, but leaves two core concerns. First, it is not entirely clear to us what information was retained by Elsevier, for example, because they consider it “integral to editorial history”. And second, how sustainable is data deletion if all it takes to be sucked back into the Elsevier data ecosystem again is one of your colleagues recommending you as a reviewer for one of the 600,000 articles Elsevier publishes per year?

    Conclusion

    Some of the issues mentioned here, such as lack of consent, seem problematic to us from the perspective of e.g. European data protection laws. Is it ok for companies to sign us up to newsletters without consent? Is it ok to collect and retain personal data indefinitely because Elsevier argues it is necessary?

    And when Elsevier writes in the supporting information that they do “not undertake any automated decision making in relation to your personal information” (processing that could violate European laws), can that be true when, in the same document, they say they use personal information to tailor experiences? “We are using your personal data for […] enhancing your experience of those products, for example by providing personalized recommendations based on your use of the products.”

    We are not legal scholars, and maybe there is no fire here. But from where we stand, there seems to be an awful lot of smoke. We hope that legal and privacy experts can bring clarity to the questions we raise above—because we simply don’t know what to do about a situation that is becoming increasingly alarming.

    https://eiko-fried.com/welcome-to-hotel-elsevier-you-can-check-out-any-time-you-like-not

    #données #édition_scientifique #Scopus #ScienceDirect #RELX #information_analytics #business

  • #Pollution of the #Seine: release the #données!

    In the name of the right to information, and to counter out-of-touch official statements with reality, we are publishing all the results on Seine pollution that we have had access to since the opening of the Olympic Games.

    On the 13th day of the Olympic Games, the data on pollution measurements in the Seine have still not been released. To learn the bacteriological pollution levels of one of the world’s best-known rivers, in one of the world’s most-visited cities, the general public is left on its own. It can only rely on media coverage of press briefings by the Games’ organisers, and only when the quality of the Seine’s water is on the agenda, that is, during the triathlon and open-water swimming events.

    Yet this information is by nature public and of general interest. The French Environmental Code obliges the authorities to communicate without delay any information relating to the environment (article L124-2). The 2006 European bathing-water directive makes it mandatory to display pollution measurements at bathing sites.

    The water of rivers is a common good. No sporting or cultural event, however spectacular, can claim the right to privatise information about it. Whether the pollution is viral, bacterial, parasitic or chemical in origin, it must be known to everyone: to respect the right to information about the quality of one’s environment, but also to nurture citizens’ attention to and interest in the Seine.

    So that it is no longer seen as a mere transport corridor for barges or tourist boats, but as a central and vital element for Paris, its inhabitants, and all the fauna and flora that depend on it to survive.

    Finally, in the face of out-of-touch political statements, scientifically collected measurements of biological pollution levels mark a just and necessary return to reality.

    This is why Mediapart has decided to publish all the results we have had access to on the pollution of the Seine since the opening of the Olympic Games on 27 July.

    Until the opening of the Games on 26 July, the Paris city hall published a weekly “bulletin” revealing, after the fact, the results from four sampling points. That publication stopped when the competitions began. Faced with this opacity, in the interest of informing the public but also so that researchers can make use of this information, Mediapart publishes below the test results we were able to obtain, up to 5 August.

    As a reminder, the thresholds required by the international triathlon federation are 1,000 CFU/100 ml for E. coli and 400 for enterococci. These limits drop to 900 (E. coli) and 330 (enterococci) for the “general public” bathing authorisations issued by the regional health agency.
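These thresholds can be applied mechanically. A minimal sketch (the E. coli value is the upper bound reported for 4 August; the enterococci value, and all function and variable names, are ours):

```python
# Thresholds from the article: CFU/100 ml for E. coli and enterococci.
TRIATHLON      = {"e_coli": 1000, "enterococci": 400}   # international federation
PUBLIC_BATHING = {"e_coli": 900,  "enterococci": 330}   # regional health agency

def compliant(sample: dict, limits: dict) -> bool:
    """True if every measured concentration is at or below its limit."""
    return all(sample[k] <= limit for k, limit in limits.items())

# Upper bound reported for 4 August; the enterococci value is invented.
sample = {"e_coli": 1553, "enterococci": 210}
print(compliant(sample, TRIATHLON))       # False: 1553 > 1000
print(compliant(sample, PUBLIC_BATHING))  # False: 1553 > 900
```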

    Sampling results from 5 August:

    For 4 August, the figures released by the Paris 2024 organising committee report results between 727 and 1,553 CFU/100 ml of E. coli.

    Sampling results from 3 August:

    Sampling results from 1 and 2 August:

    (and the other test results, in the article...)

    For 27 July, the data were not released, but the curve shows that all monitoring points were far above the authorised thresholds (notably because of the storm the previous evening, during the opening ceremony).

    https://www.mediapart.fr/journal/france/070824/pollution-de-la-seine-liberez-les-donnees
    #chiffres #JO #jeux_olympiques #Paris

  • The proletarians of artificial intelligence

    How does an online shopping site manage to return all the results matching the “green silk top” we were looking for? How does a self-driving car recognise a pedestrian and avoid running them over? How can Facebook tell that a piece of content is violent or child sexual abuse material and must be blocked? How does a chatbot work out what information we need? In all these cases, the answer is the same: a human being teaches it.

    A human being who watches, analyses and labels millions of data points every day and feeds them to what we commonly call artificial intelligence (AI). To work at all, AI needs people to train it. And its instructors are the new digital proletarians: those who handle the simplest tasks, at the base of the sector’s labour pyramid, whose upper floors are occupied by data analysts, engineers and specialised programmers. To teach AI to recognise content, and to create new content, data must be correctly labelled: images described, texts transcribed, small translations done, road signs or other elements within images identified. These so-called data labelers, through often repetitive and alienating work, make the training of the software possible. Without human intervention, AI would not be able to operate, because it would not know how to interpret the data submitted to it.

    “What is sold as artificial intelligence is a kind of machine learning, which means the machine has to be fed billions of data points, and on that basis the machine learns,” explains Antonio Casilli, professor of sociology at Télécom Paris, the polytechnic institute of Paris, in France. “To work, whether it’s a small TikTok filter or software like ChatGPT, you need enormous masses of data, which must however be processed, or rather pre-trained.” The “P” in ChatGPT, which stands for Generative Pre-trained Transformer, indeed means pre-trained.

    This pre-training work, however, is done by people who are almost never given credit. “They are not recognised as the true authors of these technological marvels, partly because they are overshadowed by far more visible professionals such as data scientists or engineers, and partly because there is no interest in having artificial intelligence recognised as a labour-intensive technology, one that needs a great deal of work. Artificial intelligence pretends to be a technology for automating work, and thus saving on it, when in fact it requires an enormous amount of it,” Casilli explains.

    Antonio Aloisi, who teaches labour law at IE University in Madrid, Spain, underlines the same point. “It is increasingly clear that the imperfection, incompleteness and inaccuracy of the results require a human pass to validate them, correct errors and do a first check. In many chatbot experiences there is nothing intelligent, and above all nothing artificial. The data are clumsy and dysfunctional, which is why a human ‘caretaker’ is needed.”

    Training AI is in its own way specialised work, but that specialisation is not well paid; in fact, it is paid terribly. The companies recruiting these workers have no interest in recognising their skills, because recognising them would mean paying for them. Casilli, with his research group DiPLab at the Paris polytechnic, one of only three in the world doing field research on this topic, has interviewed more than four thousand people in twenty countries, mostly low-income ones such as Venezuela, Madagascar and Kenya, and has collected and analysed the work experiences of the people involved.

    “In our research we even met people paid 0.001 dollars for each action they perform during their tasks. They are recruited in countries with such low incomes that, unfortunately, it becomes economically attractive for them to carry out these badly paid tasks. In Venezuela, where 80 percent of the population lives below the poverty line and the average wage is six to eight dollars a month, earning a little more by doing microtasks (translations, descriptions, tagging, surveys…) for artificial intelligence can indeed be an opportunity, and this is what many companies such as Google, OpenAI and Meta leverage.”
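A back-of-the-envelope check puts these figures in perspective: at the quoted piece rate, matching even the Venezuelan average wage takes thousands of actions a month.

```python
# Arithmetic on the figures quoted above (nothing here is new data).
pay_per_action = 0.001     # dollars per action, as reported
monthly_wage = (6, 8)      # average monthly wage range in Venezuela, dollars

actions_needed = [wage / pay_per_action for wage in monthly_wage]
print(actions_needed)  # → [6000.0, 8000.0] actions just to match the average wage
```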

    It is a very long supply chain. These companies subcontract the work to others, which usually operate abroad. “The chain stretches to Asia, Africa or Latin America, where there are small informal outfits, working off the books, often family-run, and there it becomes extremely difficult, and sometimes even dangerous, to investigate. We have to venture into homes, internet cafés and disreputable places to interview these people,” Casilli explains.

    Workers are also recruited through online job ads. “Do you want to help us shape the future of artificial intelligence? We have a 100 percent remote job for you: no experience required, only the willingness to learn and contribute to the cutting-edge field of artificial intelligence. Whether you are just starting out or a seasoned professional, our community has a role for you! You will have the opportunity to help train AI applications such as generative AI, large language models, virtual assistants, chatbots, search engines and much more.” This is just one of the ads found on job-search sites to recruit trainers of AI-based systems. The forerunner of these platforms is Amazon Mechanical Turk, born as a kind of support tool for Amazon to bring order to the huge number of listings appearing on the site and the chaos of their descriptions: a global job board where anyone can register and take part in these microtasks.

    “We have come across all kinds of situations, from the training of content-moderation filters for Facebook in Kenya, with workers developing fairly severe post-traumatic stress disorders, to Venezuelan families organising themselves to work without ever stopping,” says Casilli, recounting some of the testimony gathered in the field. Some set up small factories at home, where the father works in the morning, the daughter takes over when she comes home from school, and in the evening it is the mother or even the grandmother. In Venezuela electricity is cheap, and in the Chávez era a programme was launched to distribute computers to every family, so today more or less everyone can work from home.

    There are even cases of fake artificial intelligence: companies selling supermarkets AI-based surveillance cameras behind which, it turns out, there is no artificial intelligence at all, but people in Africa, paid almost nothing, doing the surveillance in real time. “We spent a week in a house in Madagascar turned into a data factory, with workers everywhere, in the garage, in the attic. There were at least 120 of them in a house buried in rubbish with a single bathroom, paid next to nothing and employed day and night to pretend to be an AI-based video-surveillance system,” Casilli recounts.

    The rock-bottom pay, especially compared with the billions circulating around the big technology companies, is not the only problem. An underestimated aspect is the psychological trauma the workers are exposed to. The tasks are often repetitive and alienating, and in many cases, as in the moderation of social-network content, they involve toxic, violent or sexually degrading material.

    And then there is instability. “For data workers, one of the most keenly felt problems, beyond the low pay, is the anxiety of not having steady work. They must always be available. They have no control over their wages, their workload or how the work is organised. Moderators are exposed to obscene content all day long, with a range of possible psychological consequences,” explains Simone Robutti, co-founder of the Berlin and Italian chapters of the Tech Workers Coalition, an organisation of technology-sector workers created to win more rights and better conditions. Many of these people do this work because they have health problems and cannot leave home, which makes them even more vulnerable to exploitation, Robutti says.

    Teresa Numerico, professor of logic and philosophy of science at Roma Tre University, points to another aspect of the problem: many workers sign non-disclosure agreements so restrictive that they are even afraid to seek legal or psychological support. “That is why so little is known about this labour underworld.”

    Invisible labour

    Often, when the consequences of artificial intelligence for the world of work are discussed, the greatest danger is seen as machines replacing human beings. But Numerico shifts the focus. “The worst consequence of this process is not so much that artificial intelligence has begun to do the work of human beings, but that it has absorbed human labour in a way that has made it invisible. This creates greater potential for exploitation.”

    For some years we have been witnessing the so-called platformisation of work, that is, the use of digital platforms and apps to match labour supply and demand. And the platforms employ people who are merely an appendage of the machines. “This makes them objects of exploitation. In a sense they are in competition with the machines. They are interchangeable workers. Delivery riders are the ‘aristocracy’ of this process, because at least they can be seen,” says Numerico.

    In January 2023 Time published an investigation into OpenAI’s trainers, who were earning less than two dollars an hour. The company to which OpenAI had outsourced this work was Sama of San Francisco, in the United States, which employs people in Kenya, Uganda, India and other low-income countries. Google, Meta and Microsoft do the same. The workers hired by Sama on behalf of OpenAI were paid between 1.32 and 2 dollars an hour, depending on seniority and performance. The investigation reads: “One Sama worker tasked with reading and analysing text for OpenAI said he had suffered from obsessive thoughts after reading the description of a man having sex with a dog in front of a child. ‘It was torture,’ he said.” The youngest data labelers received a salary of 21,000 Kenyan shillings (170 dollars) a month.

    As the Time account shows, when it comes to the digital proletariat there is a divide between the global North and South. “The large companies that develop and use these technologies are based in rich countries. But those who carry out these tasks manually are almost always in Africa or India, partly because the barriers to entry are low: all you need is a connection and a command of English,” Aloisi explains. Numerico agrees: there is an issue of colonisation and racialisation.

    How it works in Italy

    “In Italy, too, we find job ads in this sector, paying between 7 and 15 euros an hour,” Aloisi explains.

    According to Casilli, Italy also has some slightly more specialised companies, for instance in image processing for radiography and medical systems. “But the reality is that this does not necessarily mean the jobs are better paid. Italy remains a country where the defence of workers’ rights is largely neglected, and there are situations of extreme precarity.”

    Casilli previews the results of an investigation by his research group, to be published in a few months. The three European countries most affected by the phenomenon of AI-system trainers are Spain, Portugal and, immediately after, Italy. Many of the workers involved, as in Italy, are immigrants who have no access to the regular labour market and who find here at least one source of income, however poor, under terrible conditions and even with the enormous risk of not being paid at all. They are people who arrive in Italy from Africa, Asia and South America.

    Numerico highlights yet another side of the same coin: “For training in the Italian language, the people employed are often not those living in Italy, but Italian speakers living abroad, for instance in North Africa or Albania.”

    All these workers are, as it were, in the belly of a whale and, unlike others such as delivery riders, are harder to protect precisely because they are invisible. “The first step is to start making their presence visible, and then to launch union struggles for fair treatment,” explains Numerico, who sees remote work as an obstacle but also as a way around it: technology could connect these people, scattered across various countries and united by the fact of enduring the same working conditions.

    In this process of platformisation, the employer offloads its responsibilities, Numerico explains. Since this is not formal subordinate employment, the employer not only pays little but provides the workers with neither the means of production nor the workspace, and assumes no risk. Yet it takes the resulting profit. “The worker bears all the risks and must even pay for the tools needed to work. A space is created whose rules are dictated by whoever controls that space. The employer is evanescent,” Numerico concludes.

    Robutti of the Tech Workers Coalition explains that the organisation’s goal is to show that unionising and organising are possible in the digital technology sector too. Only ten years ago that was unrealistic: there were very few examples, and digital companies lacked the union presence found in traditional ones. “To this day there is still no strong, consolidated way of unionising data workers. They are often people working in subcontracts of subcontracts. They have very little power, and organising is very complicated for them. Now that riders have won far more protections than ten years ago, labour sociologists and academics have begun to turn their attention to data workers.”

    Casilli recounts one virtuous example of unionisation: in Germany, unions have been very active alongside digital-sector workers since 2016. In Kenya too, where for example OpenAI had ChatGPT trained, there are very large union movements involving workers in the sector. In Brazil as well, there is pressure to pass legislation containing measures to protect them. In Italy the situation is less rosy, Casilli concludes. “It is hard to make an invisible population visible.”

    Alessio De Luca, head of the national CGIL’s Progetto lavoro 4.0, explains why it is so complicated even for traditional unions to intervene to protect this type of occupation. It is a very complex, varied and extensive group, and every day new kinds of hard-to-classify roles emerge and grow, working directly with the platforms and far from easy to reach and organise. “Through Apiqa, our association dealing with self-employment, we are trying to draw up a series of legislative proposals. The greatest difficulties concern defining remuneration and minimum wages for this ‘world in between’. We would need as many instruments as possible, starting with enforcement: who should intervene? The data protection authority? The labour inspectorate?” De Luca asks. “For now we are envisaging legislative proposals such as fair compensation and welfare and pension provisions. The problem, though, is that we keep reasoning within old perimeters.”

    At the European level, Aloisi explains, attention to these phenomena has grown. Between March and April the platform work directive was approved, the European directive for improving the conditions of workers in the sector, which in part protects the data sub-proletariat. The way forward remains emerging from invisibility, in order to act and find concrete solutions to precarity and exploitation.

    https://www.internazionale.it/reportage/laura-melissari/2024/08/06/intelligenza-artificiale-lavoratori-sfruttamento
    #travail #conditions_de_travail #AI #IA #intelligence_artificielle #prolétariat #nouveau_prolétariat #data_labeling #données #soustraitance #sous-traitance #délocalisation #data_workers #travail_invisible

  • At France Travail, the rise of algorithmic control
    https://www.laquadrature.net/2024/06/25/a-france-travail-lessor-du-controle-algorithmique

    A “suspicion score” to assess the honesty of unemployed people, an “employability score” to measure their “attractiveness”, algorithms to detect jobseekers experiencing a “loss of confidence”, in “need of re-energising”, or even…

    #Données_personnelles #Surveillance

    • Au nom de la « rationalisation » de l’action publique et d’une promesse « d’accompagnement personnalisé » et de « relation augmentée », se dessine ainsi l’horizon d’un service public de l’#emploi largement automatisé. Cette automatisation est rendue possible par le recours à une myriade d’#algorithmes qui, de l’inscription au suivi régulier, se voient chargés d’analyser nos #données afin de mieux nous évaluer, nous trier et nous classer. Soit une extension des logiques de #surveillance de masse visant à un #contrôle_social toujours plus fin et contribuant à une déshumanisation de l’accompagnement social.

      From the CAF to France Travail: towards the multiplication of "suspicion scores"

      It is, here again, in the name of the "fight against fraud" that the first profiling algorithm was developed within #France_Travail. The first work on algorithmically evaluating the honesty of unemployed people was launched as early as 2013, in the wake of the CAF's official adoption of its algorithm for scoring benefit claimants. After initial in-house trials judged "frustrating"[1], France Travail – then Pôle Emploi – turned to the private sector. The development of a tool for determining jobseekers' probity was thus entrusted to Cap Gemini, a CAC40 multinational[2].

      The scoring of the unemployed was generalised in 2018.

      #chômeurs #guerre_aux_pauvres

  • #DataSuds-geo shares the #IRD's #données_géographiques (geographic data)

    The new DataSuds-geo service is dedicated to geographic data; it complements the IRD's existing offering for publishing and disseminating #données_scientifiques (scientific data).

    DataSuds-geo provides access to more than 790 cartographic #jeux_de_données (datasets), deposited by IRD scientists or transferred from the #Sphaera database (the IRD's former map #catalogue). The new tool offers specific information and services: geographic localisation, map visualisation, download, etc. The cartographic data can also be consulted offline via a geographic information system installed on one's own workstation.

    https://www.ird.fr/datasuds-geo-partage-les-donnees-geographiques-de-lird
    https://datasuds-geo.ird.fr/geonetwork/srv/fre/catalog.search#/home
    #données #cartes #cartographie

  • War and technics: the human gesture under attack
    https://radioblackout.org/2024/06/guerra-e-tecnica-lumano-gesto-sotto-attacco

    It was 1970 when the United States launched Operation Igloo White: a navy aircraft dropped tens of thousands of microphones to pick up guerrilla footsteps, seismic activity detectors to catch the slightest vibrations in the ground, and olfactory sensors to search for the ammonia present in human urine. Data-collection devices directly linked to the carpet bombing of […]

    #L'informazione_di_Blackout #cibernetica #guerra_all'umano #guerra_totale
    https://cdn.radioblackout.org/wp-content/uploads/2024/06/guerratecnica.mp3

  • Passion opendata
    https://www.radiofrance.fr/franceinter/podcasts/la-question-qui/la-question-qui-du-lundi-27-mai-2024-6844464

    Interview with Samuel Goëta on France Inter by Marie Misset.

    Did you know that every citizen has the right to access Emmanuel Macron's payslip? Our guest, a fervent advocate of open data, created a platform that aims to make it easier for individuals to consult administrative documents, which he sees as a major democratic issue.


    https://cfeditions.com/donnees-democratie

    #Samuel_Goëta #Données_démocratie #Open_data

  • Parlez-moi d'IA no. 32 with Samuel Goëta, on Open Data and his book Les données de la Démocratie | LinkedIn
    https://www.linkedin.com/feed/update/urn:li:activity:7197861061738332160/?actorCompanyId=25511245

    Parlez-moi d'IA no. 32 with Samuel Goëta, on Open Data and his book Les données de la Démocratie ⤵⤵⤵

    This week we ask what the stakes of power and counter-power are around Open Data, with Samuel Goëta, author of a brand-new reference work on the subject, "Les données de la Démocratie".

    Samuel Goëta knows the subject well, having followed it since 2008; he was still a student, then a doctoral candidate, when he began studying it. He is now a lecturer at Sciences Po Aix-en-Provence, a data activist and a consultant in the field.

    So what is Open Data? Open Data, or open data: the idea of publishing data on a subject and making it accessible to as many people as possible, without restriction. But to what end?

    Article 15 of the Declaration of the Rights of Man and of the Citizen of 26 August 1789 already stated: "Society has the right to require of every public agent an account of his administration." The first objective of Open Data takes up this article and makes the transparency of organisations, and of the administration in particular, a key principle.

    Another pillar of Open Data is participation, or collaboration around data. If data is open, more of us can use it and work together on the subject it covers.

    On paper this all sounds fine, but how was it put in place in France so that, by the end of 2023, the country ranked first in Europe in the Open Data Maturity Report 2023 and second in the world in the OECD's OURdata Index?

    None of this happened overnight. It remains fragile and still raises many questions for our administration, our economy and our democracy. These are the questions we discussed with Samuel Goëta.

    Music programming: JPC
    "Cheers" by Victoria Flavian
    Production: Jérôme Sorrel / Final editing and publication: Olivier Grieco.

    #Samuel_Goeta #Données_démocratie #Podcast #Radio_Cause_Commune

  • Review of the book « Les données de la démocratie » by Samuel Goëta | LinkedIn
    https://www.linkedin.com/feed/update/urn:li:activity:7184452712401371136

    Review of the book « Les données de la démocratie » by Samuel Goëta.

    Elsa Foucraut • Consultant • Lecturer • Author of the "Guide du Plaidoyer" (Dunod)

    💡 "The opening of public data in France, made mandatory by the Law for a Digital Republic, in many respects resembles the Tower of Pisa. This superb edifice, which draws gazes from around the world, owes its lean to unstable foundations."

    📚 Today's reading: "Les données de la démocratie" by Samuel Goëta (C & F Éditions), with a preface by former digital affairs minister Axelle Lemaire.

    Since 2016, the law has required all French administrations to apply the principle of open data by default. Yet anyone who has ever needed access to non-public data of general interest knows how far the principle is from being fully applied (to put it mildly). Real territorial divides are opening up between local authorities, accessing data often amounts to an obstacle course, and the data available on local authorities' open data portals is sometimes disappointing.

    The book first retraces the history of open data in detail: from its origins to more recent milestones, with the creation of Etalab and the Law for a Digital Republic. A fine synthesis, full of references and anecdotes I did not know. A reference work retracing this contemporary history, including its French side, was precisely what had been missing.

    But it is the chapters offering a critical assessment of open data that I found most interesting. Admittedly, the obstacles to open data sometimes stem from the conscious resistance of certain officials (a "DataBase Hugging Disorder", or DBHD 😅).

    But Samuel Goëta shows that it is most often a matter of organisational and/or technical frictions: officials do not always have a mandate to open data that could serve the opposition or call the actions of elected representatives into question; many open data projects are designed first and foremost to serve the institution's image, without the means to match their ambitions; poor data quality can be explained by the fact that an administration's data was good enough for internal use and was never intended for external use; technical problems are frequent, and opening data sometimes turns into a veritable investigation of the servers when technicians have no access to the "database schemas"; and officials are sometimes reluctant because they anticipate malicious uses of the data, or take an excessively broad reading of the GDPR (a classic!).

    The book notably recommends an overhaul of the right of access to data: transferring to the CNIL the arbitration of access to individual data processed by public bodies; integrating the CADA into the HATVP; giving the CADA the power to sanction recalcitrant administrations; creating an expedited "communication injunction" procedure to speed up the handling of requests. Why not!

    ➡ "Faut tout donner afin de changer les données" ("You have to give it all to change the data"): a quote from NTM (yes, really) that serves as the book's conclusion ;-)

    #Samuel_Goëta #Open_data #Données_démocratie

  • Policing migration: when “harm reduction” means “multipurpose aerial surveillance”

    The EU’s latest “#operational_action_plan” on migrant smuggling gives a central role to #Europol, which will receive data resulting from more than two dozen joint police operations launched by EU member states, EU agencies and a range of non-EU states. The UK is heavily involved in the plan, and is leading one activity. One objective is for harm reduction and assistance to victims, but the only activity foreseen is for Frontex to increase use of its “#EUROSUR_Fusion_Services, including the #Multipurpose_Aerial_Surveillance aircraft service.”


    Police against people smuggling

    The action plan (pdf) covers the 2024-25 period and outlines 25 activities under eight strategic goals, but it offers no insight into the causes of human smuggling, and none of the activities is aimed at addressing those causes.

    The overall aim is to control migration flows both into the EU and within the EU, and to enhance police cooperation between national law enforcement authorities, EU agencies (Europol, #Frontex and the #EU_police_database_agency, #eu-LISA) and with countries outside the EU, through joint operations and the exchange of information and intelligence.

    Many of the activities include targets for arrests: one led by Poland, for example, foresees the arrest of 200 facilitators of irregular migration per year; another, led by Cyprus, expects at least 1,000 “apprehensions/arrests”.

    In 2015, Statewatch exposed a planned EU-wide police operation against irregular migrants called ‘Mos Maiorum’, which led to significant media coverage and political controversy, as well as numerous actions to inform people of their rights and to try to map police activities. Since then, the number of such operations has skyrocketed, but attention has dwindled.

    European plan

    The 2024-25 plan is part of the #European_Multidisciplinary_Platform_Against_Criminal_Threats, a now-permanent initiative (https://www.statewatch.org/statewatch-database/eu-joint-police-operations-target-irregular-migrants-by-chris-jones) through which joint police operations are coordinated. It is managed by Europol, with political control exercised by the member states in the Council of the EU.

    A “leader” is assigned to each activity in the action plan, responsible for initiating and reporting on the relevant activity, with “key performance indicators” often indicated in respect of each one.

    The leaders include nine EU member states (Austria, Cyprus, France, Germany, Greece, Italy, Poland, Portugal and Spain) and the UK, as well as Frontex, Europol, eu-LISA and the European Police College (CEPOL).

    Europol will provide overall support across all the different activities and is specifically responsible for leading four activities.

    In many activities led by national police forces, it is specified that a goal is also to participate in other Europol initiatives, such as the “Europol Cyberpatrol to target and identify targets” and Europol’s European Migrant Smuggling Centre. The Operational Action Plan stipulates that other, unspecified, “Europol tools” may be used “where appropriate”.

    The action plan specifies that the operational data emanating from the activities is to be shared with Europol to be processed through its Analysis Projects, further swelling the databases at its headquarters in The Hague.

    The first version of the action plan was circulated amongst member states two weeks before the European Commission published a proposal to reinforce Europol’s powers in relation to migrant smuggling, arguing that they were urgently needed – though this assessment was not shared by the member states.

    Strategic goals

    The 26 activities outlined in the plan are designed to contribute to eight strategic goals:

    - Criminal intelligence picture. The activities under this heading are for Europol to provide a “situational picture of migrant smuggling” including threat assessments, updates on migratory routes, “modi operandi” and future trends, which will be made available to member states and third countries. It will involve sharing information with Frontex. Europol also aims to “strengthen the strategic and tactical intelligence picture on the use/abuse of legal business structures by criminal networks” not only in respect of migrant smuggling, but throughout “all main crime areas affecting the EU”.
    - Investigations and judicial response. There are 11 activities planned in relation to this goal. The objective is to prepare and conduct investigations and prosecutions. Police forces of different member states lead the activities and set out specific targets by reference to the numbers of arrests, initiated investigations and identified networks. Each planned activity appears to reflect specific national or local police force priorities. Germany for instance aims to “detect 5,000 irregular migrants” per year, and arrest 500 “facilitators”, whilst France focuses on seizing 100 small boats crossing the Channel to the UK. Spain focuses on air routes, including links between human smuggling and drug trafficking; and Portugal’s aim is to disrupt “marriages of convenience abuse and associated threats” (400 cases specifically). Europol also leads an activity aimed at the development of “intelligence products in support of MS investigations” (50 per year) and Frontex aims to focus on border checks and surveillance measures on the EU external borders (with 1,000 “apprehensions/arrests”).
    - Coordinated controls and operations targeting the online and offline trade in illicit goods and services. The only activity planned in relation to this goal is by the French police forces, to improve law enforcement response against “those utilising the Dark Web and other internet messenger applications to enable illegal immigration and document fraud”. The dark web is identified as an “intelligence gap” in this context.
    - Criminal finances, money laundering and asset recovery. Led by the UK, the activity planned under this goal heading is to disrupt money flows specifically within hawaladar networks.
    - Document fraud. Frontex, as well as French and German police forces each lead activities under this goal aimed at “targeting networks or individuals” involved in document fraud. In this respect, Frontex’s “Centre of Excellence for Combating Document Fraud” has a key role.
    - Capacity building through training, networking and innovation. This involves activities aimed at improving the skills, knowledge and expertise of law enforcement and judicial authorities, led by CEPOL, eu-LISA (on the use of SIS and Eurodac databases) and German police forces.
    - Prevention, awareness raising and harm reduction, as well as early identification and assistance to victims. The only goal that is expressed as being aimed at improving the safety of people is led by Frontex, and is focused on the detection of migrant smuggling through the “use of EUROSUR Fusion Services, including the Multipurpose Aerial Surveillance aircraft service, for [member states] and stakeholders to support more effective detecting, preventing and combating illegal immigration and migrant smuggling.” No mention is made of identification, assistance or victims.
    - Cooperation with non-EU partners: under this last goal, one activity is led by Austrian police forces, aimed at expanding the geographical focus of the Task Force Western Balkans to Turkey and "other relevant countries of origin and transit". The work is already based on intelligence information provided by Europol and Frontex and aims to "enhance mobile phone extractions" (the link here is not clear). The second activity listed under this last goal is led by Europol, and aims to provide a "common platform for EU agencies, military, law enforcement and other stakeholders to exchange intelligence on criminal networks operating along the migration corridors", creating a broad and focal role for itself in information exchange with a wide range of stakeholders, including private companies.

    For the purposes of the Operational Action Plan, “migrant smuggling” is broadly defined as:

    “…the process of facilitating the unlawful entry, transit or residence of an individual in a country with or without obtaining financial or other benefits. Migrant smuggling entails the facilitation of illegal entry to the EU and of secondary movements within the EU. It can also involve facilitating the fraudulent acquisition of a residence status in the EU.”

    The definition therefore does not require that any benefit be obtained, and it also covers movements within the EU.

    https://www.statewatch.org/news/2024/april/policing-migration-when-harm-reduction-means-multipurpose-aerial-surveil
    #surveillance #surveillance_aérienne #migrations #réfugiés #données #coopération_policière #European_Police_College (#CEPOL) #European_Migrant_Smuggling_Centre #Europol_Cyberpatrol

  • The Hellenic Data Protection Authority fines the Ministry of Migration and Asylum for the "Centaurus" and "Hyperion" systems, with the largest penalty ever imposed on a Greek public body

    Two years ago, in February 2022, Homo Digitalis had filed (https://homodigitalis.gr/en/posts/10874) a complaint against the Ministry of Migration and Asylum over the "#Centaurus" and "#Hyperion" systems deployed in the reception and accommodation facilities for asylum seekers, in cooperation with the civil society organisations Hellenic League for Human Rights and HIAS Greece, as well as the academic Niovi Vavoula.

    Today, the Hellenic Data Protection Authority identified significant GDPR violations in this case by the Ministry of Migration and Asylum and decided to impose a fine of €175,000 – the highest ever imposed on a public body in the country.

    The decision's detailed GDPR analysis highlights the Ministry's significant shortcomings in preparing a comprehensive and coherent Data Protection Impact Assessment, and sets out the GDPR violations identified, which affect a large number of data subjects who face real difficulty in exercising their rights.

    Despite being understaffed, working with a reduced budget and even facing the risk of eviction from its premises, the DPA manages to fulfil its mission and maintain citizens' trust in the independent authorities. It remains to be seen how long it can keep this up if the state does not stand by its side.

    Of course, nothing ends here. A high fine means little in itself. The Ministry of Migration and Asylum must comply with its obligations within 3 months. Still, the decision gives us the strength to continue our work on border protection, in order to defend the rights of vulnerable social groups targeted by highly intrusive technologies.

    You can read our press release here: https://homodigitalis.gr/wp-content/uploads/2024/04/PressRelease_%CE%97omoDigitalis_Fine-175.000-euro_Hellenic_Data_Protec

    You can read Decision 13/2024 on the Authority’s website here: https://www.dpa.gr/el/enimerwtiko/prakseisArxis/aytepaggelti-ereyna-gia-tin-anaptyxi-kai-egkatastasi-ton-programmaton

    https://homodigitalis.gr/en/posts/132195

    #Grèce #surveillance #migrations #réfugiés #justice #amende #RGPD #données #protection_des_données #camps_de_réfugiés #technologie

    • Greece ordered to pay fine for surveillance in border camps

      How far may the EU go in surveilling asylum seekers at its borders? Greece is testing this in new camps on the Aegean islands. Now the Greek data protection authority has imposed a fine over it. Civil rights activists hope the decision will send a signal.

      A double "NATO security fence" topped with barbed wire. Cameras that keep even the basketball court and the communal areas in view around the clock. Drones providing surveillance from the air. The camp on Samos, which the Greek government opened with great fanfare in 2021, resembles a prison more than a first-reception centre for asylum seekers who have just arrived in Europe.

      The surveillance system meant to provide "security" in this and four other camps on the Greek islands is called Centaurus. The footage from the security cameras and drones converges in a control centre at the ministry in Athens. In exceptional situations, police authorities or the fire brigade are also to be given direct access to the recordings. The Hyperion system controls access to the camp: biometric entrance gates that can only be opened with fingerprints.

      For deploying these technologies without a prior fundamental-rights assessment, the ministry has now been fined. The Greek data protection authority found a violation of EU data protection law (the GDPR). In a long-awaited decision last week, it imposed a fine of 175,000 euros on the Ministry of Migration and Asylum.
      Deployed first, impact assessed later

      According to the data protection authority, two specific points led to the decision. First, the ministry failed to produce a data protection impact assessment in time, meaning an evaluation of how the use of the surveillance affects the fundamental rights of the people concerned: the asylum seekers held in the camps, but also employees, NGO staff and visitors entering the camp.

      Such an assessment should have been completed before the technologies were even procured and deployed, the supervisory authority writes in its decision. Instead, it remains incomplete to this day: a violation of Articles 25 and 35 of the General Data Protection Regulation, for which the authority imposed a fine of 100,000 euros.

      In addition, the authority accuses the ministry of a lack of transparency. Documents contained confusing and contradictory information, for example. Citing confidentiality, the ministry refused to hand over the contracts with the companies operating the surveillance systems, and with them any details of the conditions under which the data is processed. Nor would the ministry say how these systems are linked to other databases, such as those used for law enforcement – that is, whether recordings and biometric data could end up with the police. For this, the supervisory authority imposed a further fine of 75,000 euros.
      Ministry: systems still in test phase

      The ministry defends itself: Centaurus and Hyperion are not yet fully operational, it says; they are still in a test phase. The supervisory authority, it argues, failed to consider that "the processing of personal data could not be assessed before the systems were put into operation". On top of that, it cites confidentiality obligations arising from the contracts with the companies behind the two systems.

      The authority did not let that stand: legally, it writes in its decision, it makes no difference whether a system is still being tested or in regular operation. The assessment should have been available much earlier, namely when the contracts were concluded. Moreover, these violations affect a large number of people in a particularly vulnerable position.

      The ministry does not, however, have to switch off the surveillance systems; they remain in operation. It merely has to meet the authority's demands within three months and supply the missing documents. The ministry has announced that it will have the decision legally reviewed and may challenge it.
      Confidentiality obligations are no excuse

      "The decision is very important because it sets a very high standard for when and how a data protection impact assessment must be successfully carried out, even before a contract is awarded," says Eleftherios Helioudakis. He is a lawyer with the Greek organisation Homo Digitalis and works on the impact of technologies on human rights. A 2022 complaint by Homo Digitalis and other associations triggered the investigation.

      Helioudakis says the decision makes clear that a lack of communication with the data protection authority can lead to heavy fines. It is also now clear that the ministry cannot withhold data protection clauses in its contracts from the authority on confidentiality grounds: for the authority's investigations, the duty of confidentiality is lifted, as the GDPR provides. The authority's ruling, the civil rights advocate adds, for now only covers the shortcomings in the systems' introduction; new cases could therefore be brought before the data protection authority.

      According to the aid organisation HIAS, the sanctions are the highest the data protection authority has ever imposed on the Greek state. In absolute terms, however, the fines are small. Are the EU's data protection rules really the right instrument for protecting the rights of asylum seekers? Eleftherios Helioudakis says yes. "The legal provisions of the General Data Protection Regulation are instruments with which we can enforce the rules protecting personal data in practice." There are no right or wrong approaches, he says. "We can use the legal instruments available to us to focus our strategy and defend ourselves against intrusive practices."

      The camps on the Aegean islands are fully funded by the EU and are considered a "model". The EU plans to build further camps on their pattern at its external borders in the coming years. The Greek data protection authority's decision is likely being followed with interest by the Commission: it makes clear the conditions under which surveillance technologies can be deployed in these camps.

      https://netzpolitik.org/2024/panopticon-fuer-gefluechtete-griechenland-soll-strafe-fuer-ueberwachung

  • El Paso Sector Migrant Death Database

    The migrant death database published here is an attempt to address the lack of comprehensive, transparent, and publicly available migrant death data for New Mexico, El Paso and the borderlands as a whole. The accessibility of this information is essential to understanding and preventing death and disappearance in the US/Mexico borderlands.

    The data for this project was collected from the New Mexico Office of the Medical Investigator (OMI), US Customs and Border Protection (CBP), the New Mexico Department of Transportation (NMDOT), the El Paso County Office of the Medical Examiner (EPCOME), Hudspeth County Justices of the Peace District 1 and 2, the International Organization for Migration’s Missing Migrant Project, independent news sources, and statements from the Sunland Park Fire Department, as well as direct observation by volunteers in the field.


    https://www.elpasomigrantdeathdatabase.org
    #USA #Mexique #base_de_données #décès #mourir_aux_frontières #morts_aux_frontières #Etats-Unis #données #cartographie #visualisation #rapport

    ping @reka @fil