newbie here

I've got a Pandas DataFrame that contains the column crewjob, shown below. It looks like each row is a list.

    crewjob
    [Director, Screenplay, Screenplay, Screenplay,...
    [Executive Producer, Screenplay, Original Musi...
    [Director, Characters, Writer, Sound Recordist]
    [Director, Screenplay, Producer, Producer, Pro...
    [Original Music Composer, Director of Photogra...
    [Director, Screenplay, Producer, Producer, Ori...
    [Director, Screenplay, Producer, Original Musi...
    [Screenplay, Screenplay, Director, Novel]
    [Director, Screenplay, Screenplay, Producer, P...

For each row, I'd like to extract the position of "Director" within the list. I've tried the enumerate method, but it returns an empty list.

    indices = [i for i, x in enumerate(creditsdf['crewjob']) if x == "Director"]

I've also tried the .index method:

    creditsdf['test'] = creditsdf['crewjob'].index("Director")

Which gives the following error:

    TypeError: 'RangeIndex' object is not callable
Abhi
BilzR

1 Answer

Use a nested list comprehension:

    indices = [[i for i, x in enumerate(y) if x == "Director"] for y in creditsdf['crewjob']]
    print(indices)
    [[0], [], [0], [0], [], [0], [0], [2], [0]]
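To see why the single-level comprehension from the question comes back empty, here is a small self-contained sketch. The sample DataFrame is an assumption for illustration: the first two rows are copied from the question's untruncated rows, and the third is made up to show a row with no director.

```python
import pandas as pd

# Hypothetical sample data: rows 1-2 come from the question, row 3 is invented
creditsdf = pd.DataFrame({
    "crewjob": [
        ["Director", "Characters", "Writer", "Sound Recordist"],
        ["Screenplay", "Screenplay", "Director", "Novel"],
        ["Executive Producer", "Screenplay"],
    ]
})

# Iterating over the Series yields whole lists, and a list never equals a string,
# so this filter keeps nothing
flat = [i for i, x in enumerate(creditsdf["crewjob"]) if x == "Director"]

# Enumerating inside each row compares the individual job titles instead
indices = [[i for i, x in enumerate(y) if x == "Director"] for y in creditsdf["crewjob"]]

print(flat)     # []
print(indices)  # [[0], [2], []]
```

Each inner list holds every position where "Director" occurs in that row, so a row without a director gives an empty list rather than an error.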
jezrael
  • Thanks jezrael, the above solution works. Out of curiosity, is it possible to do this with the .index method? – BilzR Nov 25 '18 at 14:41
  • @BilzR - Yes, but you need to use `indices = [find_element_in_list("Director", y) for y in creditsdf['crewjob']]` from [here](https://stackoverflow.com/a/16034499/2901002) to avoid an error when the value does not exist in a list. – jezrael Nov 25 '18 at 14:44
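The `.index` route from the comments might look roughly like the sketch below. The original attempt failed because `creditsdf['crewjob'].index` is the Series' index attribute (a `RangeIndex`), not the list method, so `.index` has to be called on each row's list instead. The helper name `first_position` and the sample data are assumptions for illustration; this is not the code from the linked answer.

```python
import pandas as pd

def first_position(element, items):
    """Return the first index of element in items, or None if it is absent."""
    try:
        return items.index(element)  # list.index raises ValueError when missing
    except ValueError:
        return None

# Hypothetical sample data: rows 1-2 come from the question, row 3 is invented
creditsdf = pd.DataFrame({
    "crewjob": [
        ["Director", "Characters", "Writer", "Sound Recordist"],
        ["Screenplay", "Screenplay", "Director", "Novel"],
        ["Executive Producer", "Screenplay"],
    ]
})

# Call list.index on each row's list, not on the Series itself
positions = [first_position("Director", y) for y in creditsdf["crewjob"]]
print(positions)  # [0, 2, None]
```

Unlike the nested comprehension, this returns only the first match per row, which is usually enough when each film has one director. The resulting list can be assigned back with `creditsdf['test'] = positions`.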