technology:k-means clustering algorithm

  • KMeans - Cluster 3.0 for Windows, Mac OS X, Linux, Unix
    http://bonsai.hgc.jp/~mdehoon/software/cluster/manual/KMeans.html

    “Since the initial cluster assignment is random, different runs of the k-means clustering algorithm may not give the same final clustering solution. To deal with this, the k-means clustering algorithm is repeated many times, each time starting from a different initial clustering. The sum of distances within the clusters is used to compare different clustering solutions. The clustering solution with the smallest sum of within-cluster distances is saved.

    It should be noted that generally, the k-means clustering algorithm finds a clustering solution with a smaller within-cluster sum of distances than the hierarchical clustering techniques.

    The parameters that control k-means clustering are the number of clusters (k) and the number of trials.”

    A bit of theory on the k-means clustering algorithm.
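
    The restart procedure described in the quote can be sketched roughly as follows. This is a minimal illustration, not Cluster 3.0's actual code: the function names (`kmeans_once`, `kmeans`), the `n_trials` and `seed` parameters, the use of Euclidean distance to the cluster mean, and the handling of empty clusters are all assumptions made for the example.

```python
import math
import random


def _centroids(points, labels, k, rng):
    """Mean of each cluster; an empty cluster is reseeded at a random point (assumption)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else list(rng.choice(points))
        for c in range(k)
    ]


def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run starting from a random initial cluster assignment."""
    # Random initial assignment, balanced so that no cluster starts empty.
    labels = [i % k for i in range(len(points))]
    rng.shuffle(labels)
    for _ in range(max_iter):
        cents = _centroids(points, labels, k, rng)
        # Reassign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, cents[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable: converged
        labels = new_labels
    cents = _centroids(points, labels, k, rng)
    # Within-cluster sum of distances, used to compare solutions.
    cost = sum(math.dist(p, cents[c]) for p, c in zip(points, labels))
    return labels, cost


def kmeans(points, k, n_trials=20, seed=None):
    """Repeat k-means n_trials times and keep the solution with the smallest cost."""
    rng = random.Random(seed)
    best_labels, best_cost = None, math.inf
    for _ in range(n_trials):
        labels, cost = kmeans_once(points, k, rng)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost
```

    Calling `kmeans(points, k=2, n_trials=20)` on a list of coordinate tuples returns the best label list together with its within-cluster sum of distances. The two arguments `k` and `n_trials` correspond to the two parameters named in the quote; raising `n_trials` can only keep the result the same or improve it for a given seed, since the best solution seen so far is always retained.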

    #cluster #k-means