I have a database of 10 million products (title, description, brand, category) that I use as a training set. I want to build a classifier that assigns a category to around 10 000 products that do not have one.
I wrote a small Java program that trains a Naive Bayes classifier on this data, but when I run it on my 10 000 products, only 30% of the predicted categories are correct.
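For context, here is roughly the kind of classifier I mean. This is only a minimal sketch, not my actual program: it assumes a multinomial Naive Bayes model over lowercased word tokens with add-one (Laplace) smoothing, and the class name `NaiveBayesSketch`, its methods, and the sample products are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative multinomial Naive Bayes text classifier (hypothetical names).
public class NaiveBayesSketch {
    private final Map<String, Integer> docsPerCategory = new HashMap<>();
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> totalWords = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Lowercase the text and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Update the per-category document and word counts with one labeled product.
    public void train(String text, String category) {
        totalDocs++;
        docsPerCategory.merge(category, 1, Integer::sum);
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(category, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            counts.merge(token, 1, Integer::sum);
            totalWords.merge(category, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    // Return the category with the highest log-posterior, or null if untrained.
    public String classify(String text) {
        List<String> tokens = tokenize(text);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : docsPerCategory.keySet()) {
            // Log prior: fraction of training products in this category.
            double score = Math.log(docsPerCategory.get(category) / (double) totalDocs);
            Map<String, Integer> counts = wordCounts.get(category);
            double denom = totalWords.getOrDefault(category, 0) + vocabulary.size();
            for (String token : tokens) {
                // Skip tokens never seen in training.
                if (!vocabulary.contains(token)) {
                    continue;
                }
                // Add-one smoothing keeps unseen (token, category) pairs above zero.
                score += Math.log((counts.getOrDefault(token, 0) + 1) / denom);
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        // Made-up training rows: title, description and brand concatenated.
        nb.train("Acme cordless drill 18V", "Tools");
        nb.train("Acme hammer steel claw", "Tools");
        nb.train("Zeta running shoes mesh", "Footwear");
        nb.train("Zeta leather boots", "Footwear");
        System.out.println(nb.classify("cordless hammer drill"));
    }
}
```

In this sketch the title, description and brand are simply concatenated into one string before tokenizing, and the scores are summed in log space to avoid underflow on long descriptions.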
Is there a way to improve this?
Thank you.