industryterm:software bugs

  • #Netflix finishes its massive migration to the Amazon cloud | Ars Technica (February 2016 article)
    https://arstechnica.com/information-technology/2016/02/netflix-finishes-its-massive-migration-to-the-amazon-cloud

    Netflix declined to say how much it pays Amazon, but says it expects to “spend over $800 million on technology and development in 2016,” up from $651 million in 2015. Netflix spends less on technology than it does on marketing, according to its latest earnings report.

    Netflix’s Simian Army

    The big question on your mind might be this: What happens if the #Amazon cloud fails?

    That’s one reason it took Netflix seven years to make the shift to Amazon. Instead of moving existing systems intact to the cloud, Netflix rebuilt nearly all of its software to take advantage of a cloud network that “allows one to build highly reliable services out of fundamentally unreliable but redundant components,” the company says. To minimize the risk of disruption, Netflix has built a series of tools with names like “Chaos Monkey,” which randomly takes virtual machines offline to make sure Netflix can survive failures without harming customers. Netflix’s “Simian Army” ramped up with Chaos Gorilla (which disables an entire Amazon availability zone) and Chaos Kong (which simulates an outage affecting an entire Amazon region and shifts workloads to other regions).
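
    As a rough illustration of the idea behind Chaos Monkey described above, the following minimal sketch takes one randomly chosen instance offline and then checks that enough healthy instances remain. It is a toy simulation, not Netflix's tool: the names `chaos_monkey`, `service_available`, and `fleet`, and the `min_healthy` threshold, are all hypothetical.

```python
import random


def chaos_monkey(instances, rng):
    """Mark one randomly chosen running instance as offline.

    `instances` maps a (hypothetical) instance name to True if it is running.
    Returns the name of the instance taken offline, or None if none was running.
    """
    running = [name for name, up in instances.items() if up]
    if not running:
        return None
    victim = rng.choice(running)
    instances[victim] = False
    return victim


def service_available(instances, min_healthy=1):
    """The service counts as available while enough instances are still running."""
    return sum(1 for up in instances.values() if up) >= min_healthy


# Hypothetical fleet of five redundant virtual machines.
rng = random.Random(0)  # seeded so the drill is repeatable
fleet = {f"vm-{i}": True for i in range(5)}

killed = chaos_monkey(fleet, rng)
still_up = service_available(fleet, min_healthy=3)
```

    Because the fleet is redundant, losing any single instance leaves the service above the threshold; the drill is only useful if that property is checked after every induced failure.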

    Amazon’s cloud network is spread across 12 regions worldwide, each of which has availability zones consisting of one or more data centers. Netflix operates primarily in the Northern Virginia, Oregon, and Dublin regions, but if an entire region goes down, “we can instantaneously redirect the traffic to the other available ones,” Izrailevsky said. “It’s not that uncommon for us to fail over across regions for various reasons.”
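
    The cross-region failover mentioned above can be sketched as a simple routing rule: serve a request from its home region when that region is healthy, otherwise redirect it to another available region. This is a sketch under stated assumptions, not a description of Netflix's traffic system; the region labels simply reuse the place names from the article, and `route` and its preference order are hypothetical.

```python
# Region labels taken from the article's place names; the ordering is an assumption.
REGIONS = ["northern-virginia", "oregon", "dublin"]


def route(home_region, healthy_regions):
    """Return the region that should serve a request.

    Prefer the request's home region; if it is down, fall back to the
    first healthy region in REGIONS order.
    """
    if home_region in healthy_regions:
        return home_region
    for region in REGIONS:
        if region in healthy_regions:
            return region
    raise RuntimeError("no healthy region available")


# Normal operation: traffic stays in its home region.
normal = route("oregon", {"northern-virginia", "oregon", "dublin"})

# Oregon is down: traffic is redirected to another region.
failed_over = route("oregon", {"northern-virginia", "dublin"})
```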

    Years ago, Netflix wasn’t able to do that, and the company suffered a streaming failure on Christmas Eve in 2012, when it was operating in just one Amazon region. “We’ve invested a lot of effort in disaster recovery and making sure no matter how big a failure that we’re able to bring things back from backups,” he said.

    Netflix has multiple backups of all data within Amazon.

    “Customer data or production data of any sort, we put it in distributed databases such as Cassandra, where each data element is replicated multiple times in production, and then we generate primary backups of all the data into S3 [Amazon’s Simple Storage Service],” he said. “All the logical errors, operator errors, or software bugs, many kinds of corruptions—we would be able to deal with them just from those S3 backups.”
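
    The distinction drawn in the quote above, between replication and backups, can be illustrated with a toy model: a buggy write is faithfully copied to every replica, so only a separate snapshot can undo it. This is a minimal in-memory sketch, not Cassandra or S3; `ReplicatedStore` and its methods are hypothetical names.

```python
import copy


class ReplicatedStore:
    """Toy store that copies every write to all replicas (hypothetical)."""

    def __init__(self, replicas=3):
        self.replicas = [dict() for _ in range(replicas)]

    def write(self, key, value):
        # Replication protects against losing a node, but it also
        # propagates bad writes to every copy.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        return self.replicas[0][key]

    def snapshot(self):
        # Stand-in for a backup kept outside the live database.
        return copy.deepcopy(self.replicas[0])

    def restore(self, snap):
        self.replicas = [copy.deepcopy(snap) for _ in self.replicas]


store = ReplicatedStore()
store.write("user:1", "plan=premium")
backup = store.snapshot()

store.write("user:1", "")          # simulated software bug corrupts the record
corrupted = store.read("user:1")   # every replica now holds the bad value

store.restore(backup)
recovered = store.read("user:1")
```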

    What if all of Netflix’s systems in Amazon went down? Netflix keeps backups of everything in Google Cloud Storage in case of a natural disaster, a self-inflicted failure that somehow takes all of Netflix’s systems down, or a “catastrophic security breach that might affect our entire AWS deployment,” Izrailevsky said. “We’ve never seen a situation like this and we hope we never will.”

    But Netflix would be ready in part thanks to a system it calls “Armageddon Monkey,” which simulates failure of all of Netflix’s systems on Amazon. It could take hours or even a few days to recover from an Amazon-wide failure, but Netflix says it can do it. Netflix pointed out that Amazon isolates its regions from each other, making it difficult for all of them to go out simultaneously.

    “So that’s not the scenario we’re planning for. Rather it’s a catastrophic bug or data corruption that would cause us to wipe the slate clean and start fresh from the latest good back-up,” a Netflix spokesperson said. “We hope we will never need to rely on Armageddon Monkey in real life, but going through the drill helps us ensure we back up all of our production data, manage dependencies properly, and have a clean, modular architecture; all this puts us in a better position to deal with smaller outages as well.”

    Netflix declined to say where it would operate its systems during an emergency that forced it to move off Amazon. “From a security perspective, it’d be better not to say,” a spokesperson said.

    Netflix has released a lot of its software as open source, saying it prefers to collaborate with other companies rather than keep secret the methods for making cloud networks more reliable. “While of course cloud is important for us, we’re not very protective of the technology and the best practices, we really hope to build the community,” Izrailevsky said.

  • British and Canadian Governments Accidentally Exposed Passwords and Security Plans to the Entire Internet
    https://theintercept.com/2018/08/16/trello-board-uk-canada

    By misconfiguring pages on Trello, a popular project management website, the governments of the United Kingdom and Canada exposed to the entire internet details of software bugs and security plans, as well as passwords for servers, official internet domains, conference calls, and an event-planning system. The U.K. government also exposed a small quantity of code for running a government website, as well as a limited number of emails. All told, between the two governments, a total of 50 (...)

    #hacking

  • THE IMPACT OF IDEOLOGY ON EFFECTIVENESS IN OPEN SOURCE
    http://flosshub.org/sites/flosshub.org/files/stewartgosain2.pdf

    Tenets of Open Source Ideology
    OSS Norms
    OSS Beliefs
    OSS Values

    We consider two aspects of OSS project effectiveness: the extent to which a project attracts input from the development community and the extent to which it produces observable outputs such as the addition of new features to the software or the fixing of software bugs. While commercial projects have employees paid and directed through formalized mechanisms, a critical step in becoming effective in an OSS project is to attract developers and motivate their input to the project (Mockus et al. 2002; Sturmer 2005; von Krogh et al. 2003). Without people donating their efforts voluntarily, an OSS project has little chance of success, thus the amount of input to a project (i.e., how many people devote how much effort) is an important aspect of effectiveness.

    The number of developers that have been attracted and retained to work on the team (team size) and an estimate of the amount of effort those developers have devoted to the team are the two constructs related to an OSS team’s input effectiveness used in this study.