I have what I think is a peculiar problem: I am trying to predict product attributes that may overlap.
In my case, given the title, manufacturer, and description, I need to know whether the product is a pair of jeans or something else and, furthermore, whether it is Skinny Jeans or another type of jeans. Going through the scikit-learn exercises, it seems I can only predict one category at a time, which doesn't fit my case. Any suggestions on how to tackle the problem?
What I have in mind right now is to have a set of training data for each category, e.g.:
Jeans = ['desc of jeans 1', 'desc of jeans 2']
Skinny Jeans = ['desc of skinny jeans 1', 'desc of skinny jeans 2']
With this training data, I would then ask for the probability that a given unknown product belongs to each category, and expect an answer like this in return, as a percentage match per category:
Unknown_Product_1 = {
'jeans': 93,
'skinny_jeans': 80,
't-shirt': 5
}
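To make the idea concrete, here is a rough sketch of what I imagine, not something I know to be the right approach. It assumes scikit-learn's `OneVsRestClassifier` wrapped around `LogisticRegression`, with `MultiLabelBinarizer` so that one description can carry several labels at once. The training descriptions, the labels, and the `category_scores` helper are all made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Made-up toy data: a description may belong to more than one category.
descriptions = [
    "classic straight leg blue denim jeans",
    "relaxed fit denim jeans with five pockets",
    "skinny fit stretch denim jeans",
    "slim skinny jeans in black denim",
    "cotton crew neck t-shirt",
    "short sleeve graphic t-shirt",
]
labels = [
    ["jeans"],
    ["jeans"],
    ["jeans", "skinny_jeans"],
    ["jeans", "skinny_jeans"],
    ["t-shirt"],
    ["t-shirt"],
]

# Turn the label lists into a 0/1 matrix with one column per category.
binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(labels)

# One independent binary classifier per category, on top of TF-IDF features.
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression()),
)
model.fit(descriptions, y)


def category_scores(text):
    """Hypothetical helper: return a percentage score for every category."""
    probabilities = model.predict_proba([text])[0]
    return {
        label: int(round(100 * float(p)))
        for label, p in zip(binarizer.classes_, probabilities)
    }


unknown_product_1 = category_scores("skinny stretch jeans in dark denim")
print(unknown_product_1)
```

Because each category gets its own classifier here, the scores are independent and do not have to add up to 100, which is the shape of answer I am after.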
Am I way off base? If this is the right path to take, how do I achieve it?
Thank you!