I have a multi-class problem (not a multi-label one), and I have a dataframe like the following:

[Image: dataframe]

I want to use it with flow_from_dataframe:

train_generator = train_data_gen.flow_from_dataframe(
    train_df,
    directory='directory',
    target_size=(img_shape, img_shape),
    x_col="image_id",
    y_col=['healthy', 'multiple_diseases', 'rust', 'scab'],
    class_mode='categorical',
    shuffle=False,
    subset='training',
    batch_size=batch_size)

and I am getting the following error:

TypeError: If class_mode="categorical", y_col="['healthy', 'multiple_diseases', 'rust', 'scab']" column values must be type string, list or tuple.
Talha Anwar
  • Possible duplicate of https://stackoverflow.com/questions/38334296/reversing-one-hot-encoding-in-pandas – Rob Mar 13 '20 at 07:26

2 Answers


Use class_mode="raw" so that the four columns listed in y_col are passed through unchanged as 0/1 labels, one value per class.

For information on how to modify labels and the various ways to use class_mode for multi-class classification, I recommend this article.
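As a minimal sketch of this suggestion, the snippet below builds a toy stand-in for train_df, checks that every row is a valid one-hot vector, and wraps the flow_from_dataframe call in a helper. The file names, the sample values, the helper name make_raw_generator, and its default arguments are all assumptions made for illustration; only the column names come from the question.

```python
import pandas as pd

LABEL_COLS = ["healthy", "multiple_diseases", "rust", "scab"]

# Toy stand-in for train_df; file names and values are hypothetical.
train_df = pd.DataFrame({
    "image_id": ["Train_0.jpg", "Train_1.jpg", "Train_2.jpg"],
    "healthy": [0, 0, 1],
    "multiple_diseases": [0, 1, 0],
    "rust": [1, 0, 0],
    "scab": [0, 0, 0],
})

# With class_mode="raw" the y_col values are used as the targets as-is,
# so each row should already be a valid one-hot vector.
targets = train_df[LABEL_COLS].to_numpy(dtype="float32")
assert (targets.sum(axis=1) == 1).all()


def make_raw_generator(df, directory, img_shape=224, batch_size=32):
    """Hypothetical helper: yields (images, one-hot labels) batches."""
    # Imported lazily so the label check above runs without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    data_gen = ImageDataGenerator(rescale=1.0 / 255)
    return data_gen.flow_from_dataframe(
        df,
        directory=directory,
        x_col="image_id",
        y_col=LABEL_COLS,
        class_mode="raw",
        target_size=(img_shape, img_shape),
        batch_size=batch_size,
    )
```

Because the labels arrive already one-hot encoded, a four-unit softmax output trained with categorical cross-entropy should be a reasonable match for them.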

venkata krishnan

With class_mode='categorical', as far as I know, y_col must be a single column that holds the class of each sample (say 0, 1, 2, ..., N, stored as strings), not a list of columns.

Now, if I have understood the question correctly, you would like to predict labels that are spread across several columns: y_col=['healthy', 'multiple_diseases', 'rust', 'scab'].

There are two approaches to solve your problem:

  1. Build a predictive model for each y column. This means you first train a model that predicts "healthy", then a second model that predicts "multiple_diseases", and so on.
  2. Build a single predictive model that classifies all of them at once. For this you need to create a new label column (a global label), either with an if/elif chain or with a lambda function. The new column would then be passed as y_col='Global_Label'. For this I would recommend looking into this article: https://machinelearningmastery.com/how-to-prepare-categorical-data-for-deep-learning-in-python/

In particular, this function:

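from sklearn.preprocessing import LabelEncoder

def prepare_targets(y_train, y_test):
    # Fit the encoder on the training labels only, then apply it to both sets
    le = LabelEncoder()
    le.fit(y_train)
    y_train_enc = le.transform(y_train)
    y_test_enc = le.transform(y_test)
    return y_train_enc, y_test_enc
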
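
For the second approach, one possible way to build the global label column without an if/elif chain is to reverse the one-hot encoding with pandas. This is only a sketch: the sample rows and file names are invented, and the column name "label" is an arbitrary choice. It assumes each row has exactly one column set to 1.

```python
import pandas as pd

LABEL_COLS = ["healthy", "multiple_diseases", "rust", "scab"]

# Toy stand-in for train_df; file names and values are hypothetical.
train_df = pd.DataFrame({
    "image_id": ["Train_0.jpg", "Train_1.jpg", "Train_2.jpg"],
    "healthy": [0, 0, 1],
    "multiple_diseases": [0, 1, 0],
    "rust": [1, 0, 0],
    "scab": [0, 0, 0],
})

# idxmax(axis=1) returns, for each row, the name of the column holding
# the largest value, which is the column set to 1 in a one-hot row.
train_df["label"] = train_df[LABEL_COLS].idxmax(axis=1)

print(train_df[["image_id", "label"]])
```

The resulting column holds class names as strings, so it could be passed as y_col="label" together with class_mode='categorical', which is the value type the error message asks for.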
Dr. H. Lecter