• China proposes ‘Global Initiative on Data Security’ forbidding stuff it and Huawei are accused of doing already • The Register
    https://www.theregister.com/2020/09/08/china_global_initiative_on_data_security

    Simon Sharwood, APAC Editor

    China has proposed a “Global Initiative on Data Security” that it hopes the world will adopt to govern the collection and use of data by governments and the private sector alike.

    The code was revealed today in a speech by state councilor and foreign minister Wang Yi at an event called the International Seminar on Global Digital Governance. China has only ten state councilors and the body is analogous to the Cabinet in a democracy, which we mention to indicate that Yi has gravitas and authority – China did not assign the enunciation of this idea to a lowly functionary.

    Yi outlined an eight-point code that China hopes the world will adopt. The elements of the plan are:

    Approach data security with an objective and rational attitude, and maintain an open, secure and stable global supply chain.
    Oppose using ICT activities to impair other States’ critical infrastructure or steal important data.
    Take actions to prevent and put an end to activities that infringe upon personal information, oppose abusing ICT to conduct mass surveillance against other States or engage in unauthorized collection of personal information of other States.
    Ask companies to respect the laws of host countries, desist from coercing domestic companies into storing data generated and obtained overseas in one’s own territory.
    Respect the sovereignty, jurisdiction and governance of data of other States, avoid asking companies or individuals to provide data located in other States without the latter’s permission.
    Meet law enforcement needs for overseas data through judicial assistance or other appropriate channels.
    ICT products and services providers should not install backdoors in their products and services to illegally obtain user data.
    ICT companies should not seek illegitimate interests by taking advantage of users’ dependence on their products.

    Yi said the plan is needed because the world economy’s move to online activity has increased data security challenges that “ … have put national security, public interests and personal rights at stake, and posed new challenges to global digital governance.”

    The resulting inconsistent national laws “pushed up the compliance costs for global businesses,” he complained, before suggesting “To reduce the deficit in global digital governance, countries face a pressing need to step up communication and coordination, build up mutual trust and deepen cooperation with one another.”

    Some sections of Yi’s speech seem designed to address the allegation that Chinese firms are beholden to the nation’s government. “We have not and will not ask Chinese companies to transfer data overseas to the government in breach of other countries’ laws,” Yi said.

    “I hope the Chinese initiative will serve as a basis for international rules-making on data security and mark the start of a global process in this area,” Yi said. “We look forward to the participation of national governments, international organizations and all other stakeholders, and call on States to support the commitments laid out in the Initiative through bilateral or regional agreements. We are also open-minded to good ideas and suggestions from all sides.”
    Comment

    Some of Yi’s remarks will be well-received: his second point is close to the goals of the Global Commission on the Stability of Cyberspace (GCSC). But his sixth point, the call to “Meet law enforcement needs for overseas data through judicial assistance or other appropriate channels” is tricky given it could impinge on sovereignty.

    The call for an end to corporate espionage may also ring a little hollow, at least if western intelligence agencies are to be believed.

    At the time of writing The Register has not encountered any responses to China’s proposal. And of course anything said in the next 24 hours is irrelevant, as Yi’s call for regional or bilateral agreements to adopt China’s plan will require extensive negotiations.

    As did the China-USA no-hack-pact of 2015, which was quickly seen as not much more than tawdry security theatre as both nations continued to probe each other whenever deemed necessary and failed to prevent the Trump administration later creating its “Clean Network” plan on grounds that all of China’s technology companies represent a national security risk.

    #Chine

  • Microsoft: After we said we’ll try to promote more Black people, the US govt accused us of discrimination
    https://www.theregister.com/2020/10/07/microsoft_diversity_allegation

    Dept of Labor demands proof Windows giant isn’t making ‘illegal race-based decisions’ in diversity push. After Microsoft vowed to double its number of Black and African American bosses and senior staffers, the US government challenged the policy as potentially racist, it was revealed Tuesday. The Windows giant went public to say it received a letter last week from the Department of Labor (...)

    #Microsoft #racisme #discrimination #GigEconomy #travail

  • Facebook apologizes to users, businesses for Apple’s monstrous efforts to protect its customers’ privacy
    https://www.theregister.com/2020/08/27/facebook_ios_ads

    New iOS update will rob people of personalized ads, wails antisocial giant. Facebook has apologized to its users and advertisers for being forced to respect people’s privacy in an upcoming update to Apple’s mobile operating system – and promised it will do its best to invade their privacy on other platforms. The antisocial network that makes almost all of its revenue from (...)

    #Apple #Facebook #iOS #BigData #publicité #marketing

  • MIT apologizes, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs • The Register
    https://www.theregister.com/2020/07/01/mit_dataset_removed

    The dataset holds more than 79,300,000 images, scraped from Google Images, arranged in 75,000-odd categories. A smaller version, with 2.2 million images, could be searched and perused online from the website of MIT’s Computer Science and Artificial Intelligence Lab (CSAIL). This visualization, along with the full downloadable database, was removed on Monday from the CSAIL website after El Reg alerted the dataset’s creators to the work done by Prabhu and Birhane.

    The key problem is that the dataset includes, for example, pictures of Black people and monkeys labeled with the N-word; women in bikinis, or holding their children, labeled whores; parts of the anatomy labeled with crude terms; and so on – needlessly linking everyday imagery to slurs and offensive language, and baking prejudice and bias into future AI models.

    A screenshot of the 2.2m dataset visualization before it was taken offline this week. It shows some of the dataset’s examples for the label ‘whore’, which we’ve pixelated for legal and decency reasons. The images ranged from a headshot photo of a woman and a mother holding her baby with Santa to porn actresses and a woman in a bikini.

    Antonio Torralba, a professor of electrical engineering and computer science at CSAIL, said the lab wasn’t aware these offensive images and labels were present within the dataset at all. “It is clear that we should have manually screened them,” he told The Register. “For this, we sincerely apologize. Indeed, we have taken the dataset offline so that the offending images and categories can be removed.”

    In a statement on its website, however, CSAIL said the dataset will be permanently pulled offline because the images were too small for manual inspection and filtering by hand. The lab also admitted it automatically obtained the images from the internet without checking whether any offensive pics or language were ingested into the library, and it urged people to delete their copies of the data.

    “The dataset contains 53,464 different nouns, directly copied over from WordNet,” Prof Torralba said, referring to Princeton University’s database of English words grouped into related sets. “These were then used to automatically download images of the corresponding noun from internet search engines at the time, using the available filters at the time, to collect the 80 million images.”

    WordNet was built in the mid-1980s at Princeton’s Cognitive Science Laboratory under George Armitage Miller, one of the founders of cognitive psychology. “Miller was obsessed with the relationships between words,” Prabhu told us. “The database essentially maps how words are associated with one another.”

    For example, the words cat and dog are more closely related than cat and umbrella. Unfortunately, some of the nouns in WordNet are racist slang and insults. Now, decades later, with academics and developers using the database as a convenient silo of English words, those terms haunt modern machine learning.
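    To make the idea of relatedness concrete, here is a minimal sketch in Python. It assumes a tiny hand-written tree of "is-a" links; the entries in `PARENT` and the `path_length` helper are invented for illustration and are not real WordNet data or its API. Fewer links between two words means they are more closely related.

```python
# Toy "is-a" links, invented for illustration; not real WordNet data.
PARENT = {
    "cat": "carnivore",
    "dog": "carnivore",
    "carnivore": "animal",
    "animal": "entity",
    "umbrella": "artifact",
    "artifact": "entity",
}


def ancestors(word):
    """Return the chain from word up to the root, starting with word itself."""
    chain = [word]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_length(a, b):
    """Count the is-a links on the path between a and b in the toy tree."""
    chain_a = ancestors(a)
    chain_b = ancestors(b)
    for steps_a, node in enumerate(chain_a):
        if node in chain_b:
            # First shared ancestor: add the steps taken from each side.
            return steps_a + chain_b.index(node)
    raise ValueError("no common ancestor")


print(path_length("cat", "dog"))       # 2 links, via "carnivore"
print(path_length("cat", "umbrella"))  # 5 links, via "entity"
```

    In this toy tree, cat and dog meet at a nearby shared parent, while cat and umbrella only meet at the root, which mirrors the article's example.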

    “When you are building huge datasets, you need some sort of structure,” Birhane told El Reg. “That’s why WordNet is effective. It provides a way for computer-vision researchers to categorize and label their images. Why do that yourself when you could just use WordNet?”

    WordNet may not be so harmful on its own, as a list of words, though when combined with images and AI algorithms, it can have upsetting consequences. “The very aim of that [WordNet] project was to map words that are close to each other,” said Birhane. “But when you begin associating images with those words, you are putting a photograph of a real actual person and associating them with harmful words that perpetuate stereotypes.”

    The fraction of problematic images and labels in these giant datasets is small, and it’s easy to brush them off as anomalies. Yet this material can lead to real harm if it is used to train machine-learning models that are deployed in the real world, Prabhu and Birhane argued.

    “The absence of critical engagement with canonical datasets disproportionately negatively impacts women, racial and ethnic minorities, and vulnerable individuals and communities at the margins of society,” they wrote in their paper.

    #Intelligence_artificielle #Images #Reconnaissance_image #WordNet #Tiny_images #Deep_learning

  • 25 years of PHP: The personal web tools that ended up everywhere • The Register
    https://www.theregister.com/2020/06/08/25_years_of_php

    On 8th June 1995 programmer Rasmus Lerdorf announced the birth of “Personal Home Page Tools (PHP Tools)”.

    The PHP system evolved into one that now drives nearly 80 per cent of websites using server-side programming, according to figures from w3techs.

    Well-known sites running PHP include every WordPress site (WordPress claims to run “35 per cent of the web”), Wikipedia and Facebook (with caveats – Facebook uses a number of languages including its own JIT-compiled version of PHP called HHVM). PHP is also beloved by hosting companies, many of which provide their customers with phpMyAdmin for administering MySQL databases.

    Lerdorf was born in Greenland and grew up in Denmark and Canada. He worked at Yahoo! (a big PHP user) and Etsy. He developed PHP for his own use. “In 1993 programming the web sucked,” he said in a 2017 talk.

    It was CGI written in C and “you had to change the C code and recompile even for a slight change,” he said.

    Perl was “slightly better”, Lerdorf opined, but “you still had to write Perl code to spit out HTML. I didn’t like that at all. I wanted a simple templating language that was built into the web server.”

    The Danish-Canadian programmer’s original idea was that developers would still write the bulk of their web application in C but “just use PHP as the templating language.” However, nobody wanted to write C, said Lerdorf, and people “wanted to do everything in the stupid little templating language I had written, all their business logic.”

    Lerdorf described a kind of battle with early web developers as PHP evolved, with the developers asking for more and more features while he tried to point them towards other languages for what they wanted to do.

    “This is how we got PHP,” he said, “a templating language with business logic features pushed into it.”
    The web’s workhorse

    Such is the penetration of PHP, which Lerdorf said drives around 2 billion sites on 10 million physical machines, that improving efficiency in PHP 7 had a significant impact on global energy consumption. Converting the world from PHP 5.0 to PHP 7 would save 15 billion kilowatt-hours annually and cut carbon dioxide emissions by 7.5 billion kg, he said – forgetting perhaps that any unused cycles would soon be taken up by machine learning and AI algorithms.

    PHP is the workhorse of the web but not fashionable. The language is easy to use but its dynamic and forgiving nature makes it accessible to developers of every level of skill, so that there is plenty of spaghetti code out there, quick hacks that evolved into bigger projects. In particular, early PHP code was prone to SQL injection bugs as developers stuffed input from web forms directly into SQL statements, or other bugs and vulnerabilities thanks to a feature called register_globals that was on by default and which will “inject your scripts with all sorts of variables,” according to its own documentation.
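    The SQL injection pattern described above is not specific to PHP. As a rough sketch, here it is in Python with the standard-library `sqlite3` module rather than PHP; the `users` table and the `find_unsafe` and `find_safe` helpers are hypothetical names chosen for this illustration. The first helper pastes form input straight into the SQL text, the second passes it as a bound parameter.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a1"), ("bob", "b2")],
)


def find_unsafe(name):
    # Vulnerable: the form input becomes part of the SQL statement itself.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_safe(name):
    # Safer: the value is bound as a parameter and never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


# Input a hostile user might type into a web form.
malicious = "nobody' OR '1'='1"

print(find_unsafe(malicious))  # the injected OR clause matches every row
print(find_safe(malicious))    # no user has this literal name, so no rows
```

    The unsafe version returns both users because the injected `OR '1'='1'` clause is always true; the parameterized version treats the whole string as a name and finds nothing.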

    There was originally no formal specification for PHP and it is still described as a work in progress. It is not a compiled language and object orientation was bolted on rather than being designed in from the beginning as in Java or C# or Ruby. Obsolete versions of PHP are commonplace all over the internet on the basis that as long as it works, nobody touches it. It has become the language that everyone uses but nobody talks about.

    “PHP is not very exciting and there is not much to it,” said Lerdorf in 2002.

    Regularly coming fourth as the most popular language in the Redmonk rankings, PHP has rarely got a mention in the analysis.

    That said, PHP has strong qualities that account for its popularity and longevity. Some of this is to do with Lerdorf himself, who has continued to steer PHP with wisdom and pragmatism, though it is a community project with thousands of contributors. It lacks corporate baggage and has always been free and open source. “The tools are in the public domain distributed under the GNU Public License. Yes, that means they are free!” said Lerdorf in the first announcement.

    The documentation site is a successful blend of reference and user contributions which means you almost always find something useful there, which is unusual. Most important, PHP is reliable and lightweight, which means it performs well in real-world use even if it does not win in benchmarks.

    25 years is a good run and it is not done yet. ®

    #Histoire_numérique #Languages_informatiques #PHP