removing a list element from a dataframe

Question

I want to remove a list string element if it matches the given criteria. The genres column in my dataframe contains a list of all the possible genres, and I want to remove one genre entry from the whole dataframe.

removing = df['genres']
for row in removing:
    for j in range(len(row)):
        print(row[j])
        if row[j] == 'روايات وقصص':
            print('bingo')
            print(row)
            print(row[j])
            print(j)
            print(df['genres'].pop(j))

This code gives me the following error:

   3626     #  InvalidIndexError. Otherwise we fall through and re-raise
   3627     #  the TypeError.
   3628     self._check_indexing_error(key)

This is what I get right now:

df['genres'][3] = [روايات وقصص, روايات رومانسية, روايات خيالية]

and this is what I want to achieve

df['genres'][3] =  [ روايات رومانسية, روايات خيالية]

Why don't you update the `df` directly? For example, `df = df[df['Col1'] != 0]` removes all rows where `Col1` is zero. — Mohamad Ghaith Alzin, Aug 17 '22 at 07:44

score 0 · Answer 1 · answered Aug 17 '22 at 08:15


This code snippet should solve your use case:

df = df[df['genres'] != 'روايات وقصص']
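One caveat worth checking before relying on this: when every cell of `genres` holds a list, the comparison is between each whole list and the string, so the mask is likely all `True` and no rows are dropped. A minimal sketch, using a hypothetical frame with English placeholder genres (not the asker's data), of that comparison next to a per-row membership test that does select rows containing a genre:

```python
import pandas as pd

# Hypothetical sample data; the genre names are placeholders.
df = pd.DataFrame({
    "name": ["book_A", "book_B", "book_C"],
    "genres": [["novels", "romance"], ["romance"], ["novels"]],
})

# Each cell is a list, and a list never equals a string, so every entry is True.
mask = df["genres"] != "novels"

# Membership test per row: True where the list contains the genre.
has_genre = df["genres"].apply(lambda genres: "novels" in genres)

# Rows whose list does not contain the genre (this drops rows, not elements).
rows_without = df[~has_genre]
```

Filtering this way removes whole rows; it does not edit the lists themselves, which is what the question asks for.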

Piyush Sambhi

Galaxy · Answer 2 · 2022-08-18T10:28:22.913

0

Try:

# Build a new list per row without the unwanted genre and assign it back
df["genres"] = df["genres"].apply(lambda genres: [g for g in genres if g != "روايات وقصص"])


edited Aug 18 '22 at 10:28

answered Aug 17 '22 at 08:17


This removes every row that has the genre "روایات وقصص" in its list. What I wanted is to remove just the element "روایات وقصص" from each row's list. For example, df['genres'][3] = [روايات وقصص, روايات رومانسية, روايات خيالية] would become [ روايات رومانسية, روايات خيالية] after removing the element. – lana mora Aug 17 '22 at 15:01
Hi @lana-mora, I've edited my code and I think this is what you want; please test it. – Galaxy Aug 18 '22 at 10:28

score 0 · Answer 3 · answered Aug 17 '22 at 08:34

I'd suggest a small workaround:

Example dataframe:

import pandas as pd

df = pd.DataFrame([['movie_A', 'movie_B', 'movie_C'],
    [['action', 'comedy'], ['thriller', 'action'], ['drama']]]).T
df.columns = ['name', 'genres']

Expand your genres column to multiple columns:

genre_cols = pd.DataFrame(df['genres'].tolist(), index=df.index).add_prefix('genre_tmp')
df = pd.concat([df.drop(columns='genres'), genre_cols], axis=1)

Replace the genre you wish to exclude ('action' in this example, assuming the genre name does not occur in other columns):

df.replace({'action': None}, inplace=True)

Rebuild a single column containing each row's remaining genres as a list:

tmp_cols = df.columns[df.columns.str.contains('genre_tmp')]
# Drop every missing value: both the replaced genre and the padding added for shorter lists
df['genres'] = [[g for g in row if pd.notna(g)] for row in df[tmp_cols].values.tolist()]

Finally, remove the 'genre_tmp' columns:

df = df[df.columns[~df.columns.str.contains('genre_tmp')]]
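The same result can probably be reached without temporary columns. The following is a sketch, not part of the workaround above, and it assumes a pandas version that provides `Series.explode` (0.25 or later). It rebuilds the example frame, flattens the lists to one row per genre, filters out 'action', and regroups by the original index:

```python
import pandas as pd

# Same example data as in this answer.
df = pd.DataFrame({
    "name": ["movie_A", "movie_B", "movie_C"],
    "genres": [["action", "comedy"], ["thriller", "action"], ["drama"]],
})

# One row per (movie, genre) pair; the original index label is repeated.
# dropna() discards the NaN that explode() produces for empty lists.
exploded = df["genres"].explode().dropna()

# Drop the unwanted genre, then collect the remaining genres per original row.
kept = exploded[exploded != "action"].groupby(level=0).agg(list)

# Rows that lost every genre are absent from `kept`; give them an empty list.
df["genres"] = pd.Series([kept.get(i, []) for i in df.index], index=df.index)
```

Unlike the column-based version, this does not touch other columns, so a title that happens to equal the genre name is left alone.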
