K-means clustering
- K-means process
- K-means is a non-hierarchical clustering method that optimizes the allocation of data to a specified number of classes (k)
- It first “seeds” the data space with k proposed centroids and assigns each record to the nearest seed
- Once all records are assigned, the centroid (mean) of each group is calculated and used as the seed for the next round; this repeats until the assignments stop changing
- Because the initial seeds are usually placed at random, two K-means runs on the same data will not necessarily produce the same assignments, but divergences are usually small
- I've run K-means analyses in SPSS, R, and PAST
- SPSS produces absurd results
- PAleontological STatistics (PAST) is easy, free, and reliable:
- http://folk.uio.no/ohammer/past
- Hammer, Ø.; Harper, D.A.T.; and Ryan, P. 2001. Paleontological statistics software package for education and data analysis. Palaeontologia Electronica 4, 1: 9 pp.
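The seed, assign, and recalculate loop described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the routine used by SPSS, R, or PAST: the function name `kmeans`, its parameters (`seeds`, `rng_seed`, `max_rounds`), the use of Euclidean distance, and the toy data are all assumptions made for the example.

```python
import math
import random


def kmeans(records, k, seeds=None, rng_seed=None, max_rounds=100):
    """Illustrative K-means on a list of equal-length numeric tuples.

    Hypothetical helper: names and defaults are assumptions for this sketch.
    """
    if seeds is None:
        # Assumption: seed with k distinct records drawn at random.
        # Different draws can lead to different final assignments.
        seeds = random.Random(rng_seed).sample(records, k)
    centroids = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_rounds):
        # Assign each record to the nearest current centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda j: math.dist(rec, centroids[j]))
            for rec in records
        ]
        if new_labels == labels:
            break  # stability: no record changed group
        labels = new_labels
        # Recalculate each group's centroid as the mean of its members.
        for j in range(k):
            members = [rec for rec, lab in zip(records, labels) if lab == j]
            if members:  # an empty group keeps its previous centroid
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Toy data (made up for illustration): two well-separated groups.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
        (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]

# Explicit seeds make the run repeatable; omit them to seed at random.
labels, centroids = kmeans(data, k=2, seeds=[(1.0, 1.0), (8.0, 8.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Passing `seeds` explicitly fixes the starting point, so repeated runs give the same assignments. Leaving it out and varying `rng_seed` is one way to see how much the result depends on the initial seeding.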