Questions tagged [multilabel-classification]

Multi-label classification is the machine learning problem of assigning one or more target labels to each sample, where each label represents a property of the sample and the labels need not be mutually exclusive.

844 questions
87 votes, 2 answers

How does Keras handle multilabel classification?

I am unsure how to interpret the default behavior of Keras in the following situation: My Y (ground truth) was set up using scikit-learn's MultiLabelBinarizer(). Therefore, to give a random example, one row of my y column is one-hot encoded as…
user798719
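For the Keras question above, it may help to see what kind of target matrix is being described. The following is a minimal, dependency-free sketch of a label indicator matrix of the sort scikit-learn's MultiLabelBinarizer produces; the function names and the sample labels are hypothetical and not taken from the question. Note that a row can contain several 1s, so strictly speaking it is a multi-hot rather than a one-hot encoding.

```python
def fit_label_index(label_sets):
    """Map each distinct label to a column index, in sorted order."""
    classes = sorted({label for labels in label_sets for label in labels})
    return {label: i for i, label in enumerate(classes)}


def to_indicator_rows(label_sets, index):
    """Turn each set of labels into a 0/1 row with one column per label."""
    rows = []
    for labels in label_sets:
        row = [0] * len(index)
        for label in labels:
            row[index[label]] = 1
        rows.append(row)
    return rows


# Hypothetical samples: each one carries zero or more labels.
samples = [{"sci-fi", "thriller"}, {"comedy"}, set()]
index = fit_label_index(samples)
Y = to_indicator_rows(samples, index)
print(index)
print(Y)
```

With targets in this shape, each column can be treated as its own yes/no decision, which is why multi-label setups are often described as one binary problem per label.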
40 votes, 2 answers

Facing ValueError: Target is multiclass but average='binary'

I'm trying to use the Naive Bayes algorithm on my dataset. I can compute the accuracy, but when I try to compute precision and recall, it throws the following error: ValueError: Target is multiclass but average='binary'.…
39 votes, 3 answers

What is the difference between OneVsRestClassifier and MultiOutputClassifier in scikit-learn?

Can someone please explain (with an example, maybe) the difference between OneVsRestClassifier and MultiOutputClassifier in scikit-learn? I've read the documentation and I've understood that we use: OneVsRestClassifier - when we want to do…
35 votes, 2 answers

Multilabel Text Classification using TensorFlow

The text data is organized as a vector with 20,000 elements, like [2, 1, 0, 0, 5, ...., 0]. The i-th element indicates the frequency of the i-th word in a text. The ground truth label data is also represented as a vector with 4,000 elements, like [0, 0,…
Benben
32 votes, 5 answers

Precision/recall for multiclass-multilabel classification

I'm wondering how to calculate precision and recall measures for multiclass multilabel classification, i.e. classification where there are more than two labels, and where each instance can have multiple labels?
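As a rough illustration of the precision/recall question above, two common conventions are micro-averaging (pool the counts over all labels, then divide) and macro-averaging (compute the metric per label, then take the unweighted mean). The sketch below is plain Python with hypothetical function names and made-up data. It assumes 0/1 indicator rows and treats an undefined ratio as 0.0, which is only one possible convention.

```python
def per_label_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per label."""
    n_labels = len(y_true[0])
    counts = [{"tp": 0, "fp": 0, "fn": 0} for _ in range(n_labels)]
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                counts[j]["tp"] += 1
            elif p:
                counts[j]["fp"] += 1
            elif t:
                counts[j]["fn"] += 1
    return counts


def safe_div(numerator, denominator):
    """Assumed convention: an undefined ratio counts as 0.0."""
    return numerator / denominator if denominator else 0.0


def micro_precision_recall(counts):
    """Pool the counts over all labels before dividing."""
    tp = sum(c["tp"] for c in counts)
    fp = sum(c["fp"] for c in counts)
    fn = sum(c["fn"] for c in counts)
    return safe_div(tp, tp + fp), safe_div(tp, tp + fn)


def macro_precision_recall(counts):
    """Compute precision and recall per label, then average them equally."""
    precisions = [safe_div(c["tp"], c["tp"] + c["fp"]) for c in counts]
    recalls = [safe_div(c["tp"], c["tp"] + c["fn"]) for c in counts]
    return sum(precisions) / len(counts), sum(recalls) / len(counts)


# Made-up data: 3 samples, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
counts = per_label_counts(y_true, y_pred)
micro = micro_precision_recall(counts)
macro = macro_precision_recall(counts)
print("micro:", micro)
print("macro:", macro)
```

Micro-averaging lets frequent labels dominate the result, while macro-averaging gives rare labels the same weight as common ones, so the two can differ noticeably on imbalanced data.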
31 votes, 3 answers

XGBoost for multilabel classification?

Is it possible to use XGBoost for multi-label classification? Currently I use OneVsRestClassifier over GradientBoostingClassifier from sklearn. It works, but it uses only one core of my CPU. My data has ~45 features and the task is to predict about 20…
26 votes, 5 answers

Which loss function and metrics to use for multi-label classification with very high ratio of negatives to positives?

I am training a multi-label classification model for detecting attributes of clothes. I am using transfer learning in Keras, retraining the last few layers of the VGG-19 model. The total number of attributes is 1000 and about 99% of them are 0s.…
23 votes, 3 answers

How to manually specify class labels in keras flow_from_directory?

Problem: I am training a model for multilabel image recognition. My images are therefore associated with multiple y labels. This conflicts with the convenient Keras method "flow_from_directory" of the ImageDataGenerator, where each image is…
23 votes, 1 answer

Scikit Learn Multilabel Classification: ValueError: You appear to be using a legacy multi-label data representation

I am trying to use scikit-learn 0.17 with Anaconda 2.7 for a multilabel classification problem. Here is my code: import pandas as pd import pickle import re from sklearn.cross_validation import train_test_split from sklearn.metrics.metrics import…
AbtPst
20 votes, 1 answer

Multi-label classification with class weights in Keras

I have 1000 classes in the network and they have multi-label outputs. For each training example, the number of positive outputs is the same (i.e. 10), but they can be assigned to any of the 1000 classes. So 10 classes have output 1 and the remaining 990 have…
Mahmud Sabbir
20 votes, 3 answers

caffe with multi-label images

I have a dataset of images that have multiple labels. There are 100 classes in the dataset, and each image has 1 to 5 labels associated with it. I'm following the instructions at the following URL: https://github.com/BVLC/caffe/issues/550 It says…
ytrewq
19 votes, 6 answers

How to get Top 3 or Top N predictions using sklearn's SGDClassifier

from sklearn.feature_extraction.text import TfidfVectorizer import numpy as np from sklearn import linear_model arr=['dogs cats lions','apple pineapple orange','water fire earth air', 'sodium potassium calcium'] vectorizer = TfidfVectorizer() X =…
Pranay Mathur
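The general idea behind a "top N" prediction can be sketched without any library: rank the classes by their per-class scores and keep the first N. The snippet below is a hypothetical, dependency-free illustration. The class names and scores are invented, and with scikit-learn one would presumably obtain such scores from a method like decision_function and the class names from the fitted estimator's classes_ attribute.

```python
def top_n_labels(scores, classes, n=3):
    """Return the n class names with the highest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [classes[i] for i in ranked[:n]]


# Invented class names and per-class scores for a single sample.
classes = ["animals", "elements", "fruits", "metals"]
scores = [0.1, 2.3, -0.5, 1.7]
top3 = top_n_labels(scores, classes, n=3)
print(top3)
```

Because Python's sort is stable, classes with equal scores keep their original relative order in this sketch.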
16 votes, 4 answers

How to calculate unbalanced weights for BCEWithLogitsLoss in pytorch

I am trying to solve a multilabel problem with 270 labels and I have converted the target labels into one-hot encoded form. I am using BCEWithLogitsLoss(). Since the training data is unbalanced, I am using the pos_weight argument but I am a bit…
Naresh
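One common heuristic for pos_weight, and the one illustrated in the PyTorch documentation for BCEWithLogitsLoss, is the ratio of negative to positive examples for each label. The sketch below computes that ratio in plain Python on made-up data. The function name is hypothetical, and the fallback of 1.0 for a label with no positives is an assumption of this sketch, not something PyTorch prescribes.

```python
def pos_weight_per_label(y):
    """Compute negatives / positives for each label column of a 0/1 matrix."""
    n_samples = len(y)
    n_labels = len(y[0])
    weights = []
    for j in range(n_labels):
        positives = sum(row[j] for row in y)
        negatives = n_samples - positives
        # Assumed fallback: use a neutral weight when a label never occurs.
        weights.append(negatives / positives if positives else 1.0)
    return weights


# Made-up targets: 4 samples, 3 labels.
y = [[1, 0, 1], [0, 0, 1], [0, 1, 0], [0, 0, 1]]
weights = pos_weight_per_label(y)
print(weights)
```

The resulting list has one entry per label, so it could be converted to a tensor of that length and passed as the pos_weight argument; whether the raw ratio or a damped version works better likely depends on the dataset.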
16 votes, 2 answers

UserWarning: Label not :NUMBER: is present in all training examples

I am doing multilabel classification, where I try to predict the correct labels for each document. Here is my code: mlb = MultiLabelBinarizer() X = dataframe['body'].values y = mlb.fit_transform(dataframe['tag'].values) classifier = Pipeline([ …
15 votes, 2 answers

Plot Confusion Matrix for multilabel Classification Python

I'm looking for someone who can help me plot my confusion matrix. I need this for a term paper at university. However, I have very little programming experience. In the pictures you can see the classification report and the structure of my…
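In the multilabel setting there is usually no single confusion matrix. A common approach is to build one 2x2 matrix per label, treating that label as its own binary problem. The sketch below does this in plain Python with the layout [[TN, FP], [FN, TP]], which mirrors the layout used by scikit-learn's multilabel_confusion_matrix. The function name and data are made up, and plotting is left out.

```python
def per_label_confusion(y_true, y_pred):
    """Build one [[TN, FP], [FN, TP]] matrix per label from 0/1 rows."""
    n_labels = len(y_true[0])
    matrices = [[[0, 0], [0, 0]] for _ in range(n_labels)]
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            # Row index is the true value, column index is the predicted value.
            matrices[j][t][p] += 1
    return matrices


# Made-up data: 3 samples, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
matrices = per_label_confusion(y_true, y_pred)
for j, matrix in enumerate(matrices):
    print("label", j, matrix)
```

Each of these small matrices could then be drawn as its own heatmap, one panel per label.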