Fooling around with OCR.
I have a set of binary images of the digits 0-9 that can be used as training data, and another set of unknown digits in the same range. I want to classify the digits in the unknown set using the k-nearest-neighbour algorithm.
I've done some reading on the algorithm, and I've read that the best approach is to pick some quantitative characteristics (features) and plot each training image as a point in a feature space with those characteristics as the axes. Each image in the unknown set is mapped into the same space, and the k-nearest-neighbour algorithm then finds the closest training points and takes their majority label, something like what is done here.
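To make the setup concrete, here is a minimal sketch of the pipeline as I understand it. It assumes each image is a list of rows of 0/1 pixels. The function names (`extract_features`, `knn_classify`) and the two features (ink density and the share of ink in the top half) are placeholders I made up for illustration, not a claim that these are good features.

```python
import math
from collections import Counter


def extract_features(image):
    """Map a binary image (list of rows of 0/1 pixels) to a feature vector.

    The two features here are illustrative placeholders only.
    """
    height = len(image)
    width = len(image[0])
    total_ink = sum(sum(row) for row in image)
    # Feature 1: fraction of all pixels that are set.
    density = total_ink / (height * width)
    # Feature 2: fraction of the set pixels that lie in the top half.
    top_ink = sum(sum(row) for row in image[: height // 2])
    top_share = top_ink / total_ink if total_ink else 0.0
    return (density, top_share)


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Usage would be something like building `train = [(extract_features(img), label) for img, label in training_set]` and then calling `knn_classify(train, extract_features(unknown_img))` for each unknown image, where `training_set` and `unknown_img` stand in for my own data.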
What characteristics would be best suited as features for something like this?