
I have a CSV file (delimiter=,) containing the following fields:

filename labels
xyz.png  cat
pqz.png  dog
abc.png  mouse           

There is a list containing all the classes:

data-classes = ["cat", "dog", "mouse"]

Question: How do I replace the string labels in the CSV with the index of each label in data-classes (i.e. if label == cat, the label should change to 0) and save the result to a CSV file?

T T
  • Is this what you're looking for? I would anyway advise using pandas to read and write the CSV: http://fastml.com/converting-categorical-data-into-numbers-with-pandas-and-scikit-learn/ – Roelant Jun 12 '17 at 09:36
  • Related and probable dupe: https://stackoverflow.com/questions/31133192/usng-same-label-encoder-to-test-dataset-or-new-label-encoder – EdChum Jun 12 '17 at 09:36
  • LabelEncoder doesn't work – T T Jun 12 '17 at 09:51

1 Answer


Assuming that all classes are present in your list, you can do this using apply, calling index on the list to return the ordinal position of each class in the list:

In[5]:
df['labels'].apply(data_classes.index)

Out[5]: 
0    0
1    1
2    2
Name: labels, dtype: int64
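For reference, here is a minimal self-contained sketch of the apply approach above. The DataFrame is rebuilt by hand from the sample rows in the question, and the "bird" label is a hypothetical value added only to show what happens when the assumption that all classes are present does not hold: list.index raises ValueError for a value that is not in the list.

```python
import pandas as pd

# Class list from the question, spelled as a valid Python identifier.
data_classes = ["cat", "dog", "mouse"]

# Sample rows from the question, built by hand instead of read from disk.
df = pd.DataFrame({
    "filename": ["xyz.png", "pqz.png", "abc.png"],
    "labels": ["cat", "dog", "mouse"],
})

# list.index is called once per label and returns its position in the list.
encoded = df["labels"].apply(data_classes.index)

# Hypothetical unseen label: list.index raises ValueError for it.
try:
    pd.Series(["cat", "bird"]).apply(data_classes.index)
    raised = False
except ValueError:
    raised = True
```

Note that this relies on data_classes being a plain Python list. If it were a pandas Series or DataFrame instead, `.index` would be an Index object rather than a method, so it could not be called.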

However, it will be faster to define a dict of your mapping and pass it to map, as IMO this is cython-ised and so should be faster:

In[7]:
d = dict(zip(data_classes, range(len(data_classes))))
d

Out[7]: {'cat': 0, 'dog': 1, 'mouse': 2}

In[8]:
df['labels'].map(d, na_action='ignore')

Out[8]: 
0    0
1    1
2    2
Name: labels, dtype: int64

If a label is not present in the mapping, NaN is returned for that row.
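To cover the "save it in csv file" part of the question, the following is a rough sketch of the whole round trip using map. It assumes a comma-delimited file with a header row of filename,labels. In-memory buffers stand in for the real files because the question gives no paths, and the new.png,bird row is a hypothetical addition to show the NaN case.

```python
import io
import pandas as pd

data_classes = ["cat", "dog", "mouse"]

# Stand-in for the CSV on disk; with a real file, pass its path to read_csv.
csv_text = (
    "filename,labels\n"
    "xyz.png,cat\n"
    "pqz.png,dog\n"
    "abc.png,mouse\n"
    "new.png,bird\n"  # hypothetical label that is not in data_classes
)
df = pd.read_csv(io.StringIO(csv_text))

# Build the label -> index mapping from the list order.
mapping = {name: i for i, name in enumerate(data_classes)}
df["labels"] = df["labels"].map(mapping)

# Unmapped labels become NaN, which also turns the column into floats.
unknown = df[df["labels"].isna()]

# One possible policy: drop the unmapped rows, then cast back to int.
clean = df.dropna(subset=["labels"]).astype({"labels": int})

# Stand-in for the output file; with a real file, pass its path to to_csv.
out = io.StringIO()
clean.to_csv(out, index=False)
result = out.getvalue()
```

Dropping the unmapped rows is only one option. Inspecting `unknown` first and extending data_classes may be preferable if those labels are legitimate.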

EdChum
  • for `.apply(data_classes.index)` I get `TypeError: 'RangeIndex' object is not callable` – Rishabh Agrahari Jul 14 '18 at 13:14
  • @RishabhAgrahari this still works for me, so I can't comment unless you post a new question with a full reproducible example – EdChum Jul 17 '18 at 13:41
  • @EdChum Is data_classes the same as data-classes from the question? –  Sep 10 '21 at 16:17
  • I am trying to adapt your code to mine, but it says "'Int64Index' object is not callable". Is there any way to overcome this? –  Sep 10 '21 at 16:18