**pyCluster is a Python implementation of clustering algorithms, including PAM and Clara. Enjoy!**

**1. PAM**

kMedoids – PAM implementation

See more: http://en.wikipedia.org/wiki/K-medoids

The most common realisation of k-medoid clustering is the Partitioning Around Medoids (PAM) algorithm, which proceeds as follows:

1. Initialize: randomly select k of the n data points as the medoids

2. Associate each data point with the closest medoid (“closest” here is defined using any valid distance metric, most commonly Euclidean distance, Manhattan distance or Minkowski distance).

3. For each medoid m and each non-medoid data point o: swap m and o and compute the total cost of the configuration.

4. Select the configuration with the lowest cost.

5. Repeat steps 2 to 4 until there is no change in the medoids.
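The steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not pyCluster's actual code: the names `euclidean`, `total_cost` and `pam`, the `seed` parameter, and the choice to represent medoids as indices into `points` are all hypothetical. The cost of a configuration is taken to be the sum of distances from every point to its closest medoid.

```python
import random


def euclidean(a, b):
    """Euclidean distance; any other valid metric could be passed instead."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids, dist=euclidean):
    """Sum of distances from every point to its closest medoid (step 2)."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist=euclidean, seed=None):
    """Hypothetical PAM sketch; returns (medoid indices, labels, cost)."""
    rng = random.Random(seed)
    # Step 1: randomly select k of the n data points as the medoids.
    medoids = rng.sample(range(len(points)), k)
    best = total_cost(points, medoids, dist)
    while True:
        best_swap = None
        # Step 3: try swapping each medoid with each non-medoid point.
        for i in range(k):
            for o in range(len(points)):
                if o in medoids:
                    continue
                candidate = medoids[:i] + [o] + medoids[i + 1:]
                cost = total_cost(points, candidate, dist)
                if cost < best:
                    best, best_swap = cost, candidate
        # Step 5: stop once no swap lowers the cost.
        if best_swap is None:
            break
        # Step 4: keep the configuration with the lowest cost.
        medoids = best_swap
    # Label each point with the position of its closest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels, best
```

For example, `pam([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], k=2)` should settle on the points at indices 0 and 3 as medoids. Because the cost strictly decreases with every accepted swap, the loop is guaranteed to stop, although on harder data it may stop at a local rather than a global optimum.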

**2. Clara**

CLARA implementation

1. For i = 1 to 5, repeat the following steps:

2. Draw a sample of 40 + 2k objects randomly from the entire data set, and call Algorithm PAM to find k medoids of the sample.

3. For each object Oj in the entire data set, determine which of the k medoids is the most similar to Oj.

4. Calculate the average dissimilarity of the clustering obtained in the previous step. If this value is less than the current minimum, use this value as the current minimum, and retain the k medoids found in Step 2 as the best set of medoids obtained so far.

5. Return to Step 1 to start the next iteration.
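The CLARA loop above might look roughly like the following in Python. Again, this is a sketch under assumptions rather than pyCluster's real interface: `dist`, `cost`, `pam`, `clara`, `n_draws` and `seed` are hypothetical names, Euclidean distance stands in for the dissimilarity, and the small `pam` helper is a compact stand-in so that the block runs on its own. Here medoids are returned as the points themselves rather than as indices.

```python
import random


def dist(a, b):
    """Euclidean distance, used here as the dissimilarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cost(points, medoids):
    """Total distance from each point to its most similar medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points)


def pam(points, k, rng):
    """Compact PAM stand-in: apply the best swap until none lowers the cost."""
    medoids = rng.sample(points, k)
    best = cost(points, medoids)
    while True:
        best_swap = None
        for i in range(k):
            for o in points:
                if o in medoids:
                    continue
                cand = medoids[:i] + [o] + medoids[i + 1:]
                c = cost(points, cand)
                if c < best:
                    best, best_swap = c, cand
        if best_swap is None:
            return medoids
        medoids = best_swap


def clara(points, k, n_draws=5, seed=None):
    """Hypothetical CLARA sketch; returns (best medoids, average dissimilarity)."""
    rng = random.Random(seed)
    # Cap the sample size so small data sets still work.
    sample_size = min(len(points), 40 + 2 * k)
    best_medoids, best_avg = None, float("inf")
    for _ in range(n_draws):                       # Step 1: five draws by default
        sample = rng.sample(points, sample_size)   # Step 2: sample, then run PAM
        medoids = pam(sample, k, rng)
        # Steps 3-4: score the medoids against the entire data set.
        avg = cost(points, medoids) / len(points)
        if avg < best_avg:
            best_avg, best_medoids = avg, medoids
    return best_medoids, best_avg
```

The point of the design is that PAM only ever runs on 40 + 2k objects, so the expensive swap search stays small, while the quality of each candidate set of medoids is still judged on the whole data set.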

Project Name: pyCluster

Destination: Python Clustering

Language: Python

IDE: Vim

Library:

Project Web: https://github.com/daveti/pycluster

Git Read Only: https://github.com/daveti/pycluster.git

## About daveti

Interested in kernel hacking, compilers, machine learning and guitars.