0

I want to predict a text classification which is based on the correlation of the text in the training data set.

For eg. This is my training data: "Mouse M325", "Mouse for xyz M325", "M325 Mouse logitech", "Logitech mouse number M325"

As it is visible that Mouse and M325 definitely have a high correlation when compared with M325 and Logitech or others.

I want to use the correlations to predict a classification for the next dataset. For eg. If the next data is "Mouse used by Alex number is M325"...it should give me "Mouse M325" as text classifier and notify in a separate tab that the model has predicted this description but it was not something that Machine had seen earlier in trained data. Like,Result Model has predicted . How to solve this?

biker007
  • 21
  • 1
  • 3
  • My suggestion is to use sklearn naive bayes. – user1211 Nov 30 '16 at 05:05
  • Thanks for replying, Can you please guide me how to do it? I think it can be done by by k means clustering. Please guide or any link where something similar is done would be really helpful – biker007 Nov 30 '16 at 06:51
  • [This answer](http://stackoverflow.com/a/21298368/2661491) would be a good place to start, then maybe move to something fancier like TF-IDF – evan.oman Nov 30 '16 at 21:35
  • There is a huge lib on scikit (http://scikit-learn.org/stable/modules/naive_bayes.html) – user1211 Dec 01 '16 at 09:27

0 Answers0