element wise search on pandas column that has list string data

Question

I have a DataFrame in which one of the columns holds lists of strings, as shown below:

colA colB LISTCOL
USA  100   ['ABCD (Actor)', 'XYZ (Actor, Director)', 'PQR (Producer, Writer)']
UK   1200  ['45q34y(Actor,Director, Producer)', '123 (Actor, Director)']

I want to filter the list in each row of the LISTCOL column so that only the elements containing Actor are kept, with just the name part retained.

I tried

df['ACTOR'] = df.apply(
        lambda elem: [elem for elem in df['LISTCOL'].str if "Actor" in elem],
    axis=1)

However, it is not working. Unfortunately, my pandas version is 0.23.4, so df.explode() is not available to me. Can you please help me get the output I want:

OUTPUT:

colA colB  ACTOR
USA  100   ['ABCD', 'XYZ']
UK   1200  ['45q34y', '123']

You are many versions behind the current version of pandas. Please upgrade, or install Anaconda distribution. — Trenton McKinney, Sep 28 '20 at 17:00
Does this answer your question? [Pandas expand rows from list data available in column](https://stackoverflow.com/questions/39011511/pandas-expand-rows-from-list-data-available-in-column) — Trenton McKinney, Sep 28 '20 at 17:02
The duplicate has solutions for expanding lists without explode. — Trenton McKinney, Sep 28 '20 at 17:02
What is the output of `print(type(df.iloc[0, 2]))`? If the result is `str`, then you must use `df.LISTCOL = df.LISTCOL.apply(ast.literal_eval)` — Trenton McKinney, Sep 28 '20 at 17:05
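
A minimal sketch of the check suggested in the last comment, assuming the LISTCOL cells may have been read in as strings (for example from a CSV file). The helper name `to_list` is hypothetical; `ast.literal_eval` is the standard-library function named in the comment.

```python
import ast

def to_list(cell):
    """Return cell as a list, parsing it first if it is the string form of a list."""
    if isinstance(cell, str):
        return ast.literal_eval(cell)
    return cell

# A cell as it might look when the list was stored as text.
raw = "['ABCD (Actor)', 'XYZ (Actor, Director)', 'PQR (Producer, Writer)']"
parsed = to_list(raw)

print(type(raw).__name__, '->', type(parsed).__name__)
print(parsed[0])
```

On a DataFrame this would be applied as df['LISTCOL'] = df['LISTCOL'].apply(to_list); cells that are already lists pass through unchanged.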

Scott Boston · Answer 1 · 2020-09-29T01:41:30.057

0

Try this:

import re

# Capture the leading name when "Actor" appears after the opening parenthesis.
pattern = re.compile(r'(\w+)\s?\(.*?Actor')

df['Actors'] = [[m.group(1) for m in map(pattern.match, i) if m] for i in df['LISTCOL']]

Output:

  colA  colB                                            LISTCOL         Actors
0  USA   100  [ABCD (Actor), XYZ (Actor, Director), PQR (Pro...    [ABCD, XYZ]
1   UK  1200  [45q34y(Actor,Director, Producer), 123 (Actor,...  [45q34y, 123]
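
To see what the pattern does without a DataFrame, here is a small sketch that runs the same regular expression over plain Python lists holding the question's sample data. The variable names `rows` and `actors` are illustrative only.

```python
import re

# A word, an optional whitespace character, then "(" followed lazily by
# anything up to the first "Actor". Group 1 is the leading word.
pattern = re.compile(r'(\w+)\s?\(.*?Actor')

rows = [
    ['ABCD (Actor)', 'XYZ (Actor, Director)', 'PQR (Producer, Writer)'],
    ['45q34y(Actor,Director, Producer)', '123 (Actor, Director)'],
]

# Keep group 1 for every element where the pattern matches at the start.
actors = [[m.group(1) for m in map(pattern.match, row) if m] for row in rows]
print(actors)
```

One limitation worth knowing: because `\w+` does not match spaces and `re.match` anchors at the start of the string, an element such as 'John Smith (Actor)' would not match and would be dropped.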

edited Sep 29 '20 at 01:41

answered Sep 28 '20 at 16:32

Scott Boston


As mentioned in the question explode can’t be used – asimo Sep 28 '20 at 16:42

score 0 · Answer 2 · answered Sep 28 '20 at 17:01

I would consider using pd.Series.map() with a helper function:

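def make_actors_column(ser):
    # Join the row's strings and split on '(' so each role list follows its name.
    temp_list = ''.join(ser).split('(')
    actor_list = []
    for i, string in enumerate(temp_list):
        if 'Actor' in string:
            # The name is the text after the last ')' in the preceding chunk.
            name_of_actor = temp_list[i-1].split(')')[-1]
            actor_list.append(name_of_actor.strip())
    return actor_list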


make_actors_column(df.loc[0,'LISTCOL'])
-->['ABCD', 'XYZ']

df['ACTOR'] = df['LISTCOL'].map(make_actors_column)
df

  colA  colB                                            LISTCOL          ACTOR
0  USA   100  [ABCD (Actor), XYZ (Actor, Director), PQR (Pro...    [ABCD, XYZ]
1   UK  1200  [45q34y(Actor,Director, Producer), 123 (Actor,...  [45q34y, 123]

I think this function is enough for your example.
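
The function above joins the whole list into one string before splitting. As an alternative sketch, each element can be handled on its own, which also copes with names that contain spaces. The helper name `extract_actors` is hypothetical, and the sketch assumes every element has the form "name (roles)".

```python
def extract_actors(items):
    """Return the names of the elements whose parenthesised roles include 'Actor'."""
    actors = []
    for item in items:
        # Split at the first "(": the name is on the left, the roles on the right.
        name, sep, roles = item.partition('(')
        if sep and 'Actor' in roles:
            actors.append(name.strip())
    return actors

row_usa = ['ABCD (Actor)', 'XYZ (Actor, Director)', 'PQR (Producer, Writer)']
row_uk = ['45q34y(Actor,Director, Producer)', '123 (Actor, Director)']

print(extract_actors(row_usa))
print(extract_actors(row_uk))
```

It would be applied the same way, df['ACTOR'] = df['LISTCOL'].map(extract_actors), and does not need df.explode().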
