Seenthis

#hadoop

  • #hadoopdb
  • Fıl ☼ @fil 7/06/2011 17:26

    The #Hadoop Distributed File System
    http://www.aosabook.org/en/hdfs.html

    A chapter of the book (#livre) "The Architecture of Open Source Applications" devoted to the Hadoop distributed #filesystem; I find the section on data durability interesting:

    Replication of data three times is a robust guard against loss of data due to uncorrelated node failures. It is unlikely Yahoo! has ever lost a block in this way; for a large cluster, the probability of losing a block during one year is less than 0.005. The key understanding is that about 0.8 percent of nodes fail each month. (...) The probability of several nodes failing within two minutes such that all replicas of some block are lost is indeed small.

    Correlated failure of nodes is a different threat. The most commonly observed fault in this regard is the failure of a rack or core switch. (...) If the loss of power spans racks, it is likely that some blocks will become unavailable. But restoring power may not be a remedy because one-half to one percent of the nodes will not survive a full power-on restart. Statistically, and in practice, a large cluster will lose a handful of blocks during a power-on restart.

    In addition to total failures of nodes, stored data can be corrupted or lost. The block scanner scans all blocks in a large cluster each fortnight and finds about 20 bad replicas in the process. Bad replicas are replaced as they are discovered.
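
To put the quoted rates in perspective, here is a minimal arithmetic sketch. The 0.8 percent monthly failure rate and the 0.5 to 1 percent restart loss come from the excerpt above; the cluster size of 4000 nodes is a hypothetical value chosen only for illustration, and treating months as independent is a simplifying assumption.

```python
# Rates quoted in the excerpt.
MONTHLY_FAILURE_RATE = 0.008          # about 0.8 percent of nodes fail each month
RESTART_LOSS_RANGE = (0.005, 0.01)    # 0.5 to 1 percent do not survive a power-on restart

# Hypothetical cluster size, chosen only for illustration.
CLUSTER_NODES = 4000


def expected_monthly_failures(nodes: int, rate: float = MONTHLY_FAILURE_RATE) -> float:
    """Expected number of node failures in one month."""
    return nodes * rate


def yearly_survival_probability(rate: float = MONTHLY_FAILURE_RATE) -> float:
    """Probability that a single node survives 12 months (months assumed independent)."""
    return (1.0 - rate) ** 12


def restart_losses(nodes: int) -> tuple[float, float]:
    """Expected range of nodes lost during a full power-on restart."""
    low, high = RESTART_LOSS_RANGE
    return nodes * low, nodes * high


if __name__ == "__main__":
    print(f"expected failures per month: {expected_monthly_failures(CLUSTER_NODES):.0f}")
    print(f"one-year survival of a node: {yearly_survival_probability():.3f}")
    low, high = restart_losses(CLUSTER_NODES)
    print(f"nodes lost on full restart: {low:.0f} to {high:.0f}")
```

Under these assumptions, a cluster of that size would replace a few dozen nodes every month, which is why re-replication has to be routine rather than exceptional.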

    http://www.aosabook.org/images/cover.jpg

    • #HDFS
    • #guard
    • #Yahoo !
    • #UNIX
    • #RAM
    • #User applications

  • Stéphane Bortzmeyer @stephane CC BY-SA 18/05/2011 09:58

    Are you more of a #SQL person or more of a #MapReduce person when it comes to analyzing your large volumes of data? Don't weep over the difficulty of the choice: you can combine the two, says the #HadoopDB project:

    http://db.cs.yale.edu/hadoopdb/hadoopdb.html

    The original paper:

    http://www.vldb.org/pvldb/2/vldb09-861.pdf

    • #Yale University

  • ןıɟ @fil 5/01/2011 18:41

    Data-Intensive Text Processing with MapReduce
    http://www.umiacs.umd.edu/%7Ejimmylin/MapReduce-book-final.pdf

    #mapreduce #hadoop #book #nosql

    • ןıɟ @fil 5/01/2011 18:44

      #seenthis_bug: I had to replace the tilde in the URL with %7E, otherwise the URL was not picked up

    • Seenthis @seenthis CC BY-NC 5/01/2011 20:57

      OK, fixed. #seenthis_done
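
For context on the workaround in this thread: `%7E` is the percent-encoded form of `~`, so both spellings of the URL designate the same resource. The sketch below uses Python's standard `urllib.parse.unquote` to show this; the helper name `same_after_decoding` is hypothetical, and decoding a whole URL this way is a simplification that works here only because the URL contains no other encoded characters.

```python
from urllib.parse import unquote

# The URL as posted (tilde percent-encoded) and its literal-tilde spelling.
ENCODED = "http://www.umiacs.umd.edu/%7Ejimmylin/MapReduce-book-final.pdf"
LITERAL = "http://www.umiacs.umd.edu/~jimmylin/MapReduce-book-final.pdf"


def same_after_decoding(a: str, b: str) -> bool:
    """Compare two URLs after percent-decoding both (hypothetical helper)."""
    return unquote(a) == unquote(b)


if __name__ == "__main__":
    print(same_after_decoding(ENCODED, LITERAL))
```

A URL detector that accepts `~` in paths makes the manual substitution unnecessary.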


  • ןıɟ @fil 19/06/2009 09:38

    AppScale - open-source implementation of Google App Engine
    http://code.google.com/p/appscale
    runs on the Amazon clouds but also locally
    #amazon #ec2 #aws #xen #cloud #hadoop #python #google


  • ןıɟ @fil 14/11/2007 21:04

    Running Hadoop MapReduce on Amazon EC2 and Amazon S3
    http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873&categoryID=55
    #s3 #ec2 #hadoop #MapReduce


Related topics

  • #ec2
  • #mapreduce