
I have a database of 10 million products (title, description, brand, category) that I use as a training dataset. I want to build an algorithm to classify around 10,000 products that do not have a category.

I wrote a small Java program that trains a Naive Bayes classifier on this data, but when I run it on my 10,000 products, only 30% of the predicted categories are correct.

Is there a way to improve this?

Thank you.

Simo L.
  • NB is only as good as your feature vector and the size of the training set. – matcheek Jun 08 '15 at 12:57
  • A Naive Bayes classifier is mathematically speaking indeed pretty naive and rather inflexible. If you want higher accuracy and flexibility you might want to use a decision tree. See this post for instance: https://stackoverflow.com/questions/10317885/decision-tree-vs-naive-bayes-classifier – JSN Aug 01 '17 at 19:21
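
To make the point about feature vectors concrete, here is a minimal sketch in Java of a multinomial Naive Bayes classifier over bag-of-words tokens, with Laplace (add-one) smoothing and log-probabilities to avoid underflow. The asker's program is not shown, so this is not their implementation: the class name `ProductNaiveBayes` and its methods `tokenize`, `train` and `classify` are hypothetical, and the choice to skip tokens never seen in training is an assumption. What goes into `text` (title only, title plus brand, and so on) is the feature vector the first comment refers to.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative multinomial Naive Bayes over bag-of-words features (hypothetical class). */
final class ProductNaiveBayes {
    private final Map<String, Integer> docsPerCategory = new HashMap<>();
    private final Map<String, Map<String, Integer>> tokenCounts = new HashMap<>();
    private final Map<String, Integer> tokensPerCategory = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Lowercases the text and splits on anything that is not a letter or digit. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Adds one labelled product to the counts. */
    public void train(String text, String category) {
        totalDocs++;
        docsPerCategory.merge(category, 1, Integer::sum);
        Map<String, Integer> counts =
                tokenCounts.computeIfAbsent(category, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            counts.merge(token, 1, Integer::sum);
            tokensPerCategory.merge(category, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    /** Returns the category with the highest log-posterior, or null if untrained. */
    public String classify(String text) {
        List<String> tokens = tokenize(text);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : docsPerCategory.keySet()) {
            // Log prior: share of training products in this category.
            double score = Math.log(docsPerCategory.get(category) / (double) totalDocs);
            Map<String, Integer> counts = tokenCounts.get(category);
            int total = tokensPerCategory.getOrDefault(category, 0);
            for (String token : tokens) {
                if (!vocabulary.contains(token)) {
                    continue; // assumption: ignore tokens never seen in training
                }
                int count = counts.getOrDefault(token, 0);
                // Laplace smoothing keeps unseen (token, category) pairs above zero.
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }
}
```

One possible way to use it: call `train(title + " " + brand, category)` for each labelled product, then `classify(title + " " + brand)` for each uncategorized one. Evaluating on a held-out slice of the labelled data gives an accuracy figure without needing the 10,000 unlabelled products.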

0 Answers