What is the difference between K-means clustering and vector quantization?

Question

They seem to be very similar.

I'm dealing with Hidden Markov Models and I need to extract symbols from feature vectors.

In order to extract symbols, do I do vector quantization or k-means clustering?

score 18 · Accepted Answer · answered Jan 26 '12 at 02:53

18

The way I understand it, K-means is one type of vector quantization.

answered Jan 26 '12 at 02:53

Rob Neuhaus

9,190
3
28
37

2

Exactly. K-means clustering is one method for performing vector quantization. The centroids found through K-means are (using information theory terminology) the *symbols* or *codewords* for your *codebook*. To decode a vector, assign the vector to the centroid (or codeword) to which it is closest. – Steve Tjoa Jan 26 '12 at 19:37
Maybe you should explain why k-means is a type of vector quantization by explaining what vector quantization is. – nbro Jun 13 '20 at 17:06

score 5 · Answer 2 · edited Feb 02 '12 at 10:09

The K-means algorithms is the specialization of the celebrated "Lloyd I" quantization algorithm to the case of empirical distributions. (cf. Lloyd)

The Lloyd I algorithm is proved to yield a sequence of quantizers with a decreasing quadratic distortion. However, except in the special case of one-dimensional log-concave distributions, it dos not always converge to a quadratic optimal quantizer. (There are local minimums for the quantization error, especially when dealing with empirical distribution i.e. for the clustering problem.)

A method that converges (always) toward an optimal quantizer is the so-called CLVQ algorithms, which also generalizes to the problem of more general L^p quantization. It is a kind of Stochastic Gradient method. (cf. Pagès)

There are also some approaches based on genetic algorithms. (cf. Hamida et al.), and/or classical optimization procedures for the one dimensional case that converge faster (Pagès, Printems).

What is the difference between K-means clustering and vector quantization?

2 Answers2