I guess this is a duplicate of "Find element's index in pandas Series".

This is my dataframe:

      WORD1    CAT1   
    elephant   animal  
        lion   animal
       tiger   animal
      hoopoe    bird 
    hornbill    bird
   sunflower   flower
        rose   flower
     giraffe   animal
       zebra   animal
     sparrow    bird  
        duck   animal  

I would like to get the index of each element from 'CAT1'.

Let me put it this way:

for d in data['CAT1']:
    print(data[data['CAT1'] == d].index[0])
...
0
0
0
3
3
5
5
0
0
3
0

The above returns an index, but when a value is repeated it only ever returns the index of the first occurrence. How can I fix this?

richie
  • For future readers of this question, could you update to be clearer about what you actually *want* as an output? "get the index of each element from 'CAT1'" is ambiguous. Do you want the *first* index of each distinct entry in `CAT1` or do you want to assign each distinct entry a number and replace the text with this number? – LondonRob Feb 12 '14 at 14:40

1 Answer

You can use enumerate in Python to get the indices along with the items:

for i, d in enumerate(data['CAT1']):
    print(i)
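
If what you actually want is every position at which each category occurs (rather than only the first), one possible sketch is to collect the enumerate results in a dict. Here `cat1` is a plain list standing in for data['CAT1'], and `positions` is just an illustrative name, not anything from pandas:

```python
from collections import defaultdict

# Stand-in for data['CAT1'] from the question (assumption: a plain list
# with the same values in the same order as the sample dataframe).
cat1 = ["animal", "animal", "animal", "bird", "bird", "flower",
        "flower", "animal", "animal", "bird", "animal"]

# Map each distinct category to the list of positions where it appears.
positions = defaultdict(list)
for i, d in enumerate(cat1):
    positions[d].append(i)

for category, idx in positions.items():
    print(category, idx)
# animal [0, 1, 2, 7, 8, 10]
# bird [3, 4, 9]
# flower [5, 6]
```

Note that this gives positions counted from 0, which matches the dataframe's index only while the index is the default 0..n-1 range.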

If you want to select from WORD1 by CAT1, you could zip them, for example:

birds = [w for w, c in zip(data['WORD1'], data['CAT1']) if c == "bird"]
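
To check the zip approach against the sample data, here is a small self-contained version. The two lists are assumed stand-ins for data['WORD1'] and data['CAT1'], so it runs without pandas:

```python
# Stand-ins for the two dataframe columns from the question (assumption:
# plain lists with the same values and order as the sample data).
word1 = ["elephant", "lion", "tiger", "hoopoe", "hornbill", "sunflower",
         "rose", "giraffe", "zebra", "sparrow", "duck"]
cat1 = ["animal", "animal", "animal", "bird", "bird", "flower",
        "flower", "animal", "animal", "bird", "animal"]

# Pair each word with its category and keep only the birds.
birds = [w for w, c in zip(word1, cat1) if c == "bird"]
print(birds)  # ['hoopoe', 'hornbill', 'sparrow']
```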

Note: str.index is a method for finding the index of a sub-string within a string.
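
A quick illustration of the difference, using made-up values: str.index searches inside a string, whereas list.index returns the position of the first matching item only, which is the same "first occurrence" behaviour seen in the question.

```python
# str.index: position of a sub-string within a string.
offset = "sunflower".index("flower")
print(offset)  # 3

# list.index: position of the FIRST matching item, even if it repeats.
cats = ["animal", "bird", "bird", "flower"]
first_bird = cats.index("bird")
print(first_bird)  # 1
```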

jonrsharpe
  • As you've seen, `list.index` gives you the *first index only*. It's not entirely clear what you're trying to achieve; have you tried the suggestions in my answer? – jonrsharpe Feb 12 '14 at 11:34
  • @jonrharpe yes. Tried it. Makes sense. But I'm looking for something like this http://stackoverflow.com/q/18327624/1948860 – richie Feb 12 '14 at 11:44
  • The answers there cover this too, you can use `data['CAT1'].get_loc(d)` or `data[data['CAT1'] == d]` – jonrsharpe Feb 12 '14 at 11:51