Deleting Rows in Dataframe After Exploding in Pandas

Question

I have a dataframe that originally looked like this:

|student_name|subject                    |
|------------|---------------------------|
|smith       |['maths', 'english']       |
|jones       |['maths', 'english']       |
|alan        |['art', 'maths', 'english']|

I used explode to get the following table:

|student_name|subject|
|------------|-------|
|smith       |maths  |
|smith       |english|
|jones       |maths  |
|jones       |english|
|alan        |art    |
|alan        |maths  |
|alan        |english|

I then reset the index because I want to delete all rows containing the string 'maths'. However, instead of deleting only the rows containing maths, it deletes every row, as if the dataframe hadn't been exploded/reindexed.

Here's my code:

import pandas as pd

# `data` holds the original table shown above
student_df = pd.DataFrame(data)
student_df = student_df.explode('subject')
student_df = student_df.reset_index(drop=True)
student_df = student_df[student_df["subject"].str.contains("maths") == False]

What am I doing wrong?

Just use `student_df.query('subject != "maths"')`. This is quite intuitive and easy to use. – Mayank Porwal, May 10 '22 at 08:16
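As a minimal sketch of the `query` suggestion above: the `data` used in the question is not shown, so the frame below is an assumed reconstruction of the first table with real Python lists in the `subject` column. The names `exploded`, `via_query` and `via_mask` are introduced here for illustration only.

```python
import pandas as pd

# Assumed reconstruction of the question's table (lists, not strings).
student_df = pd.DataFrame({
    "student_name": ["smith", "jones", "alan"],
    "subject": [
        ["maths", "english"],
        ["maths", "english"],
        ["art", "maths", "english"],
    ],
})

# One row per (student, subject), with a fresh 0..n-1 index.
exploded = student_df.explode("subject").reset_index(drop=True)

# Exact inequality expressed through query() ...
via_query = exploded.query('subject != "maths"')

# ... and the same filter written as a plain boolean mask.
via_mask = exploded[exploded["subject"] != "maths"]

print(via_query)
```

Under these assumptions both filters keep the same four rows, since `query` here is an exact comparison rather than a substring search.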

score 0 · Answer 1 · answered May 10 '22 at 08:13

A cleaner way to do this is to chain the operations in a single pipeline instead of reassigning the dataframe at every step.

A few remarks:

- You can pass a function/lambda to `loc` to refer to the dataframe itself.
- Use `~` to invert the boolean result of `str.contains`.
- If you want to check for an exact match, do not use `str.contains` but `eq`/`ne` (equal/not equal).

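# student_df is the original, un-exploded dataframe
student_df2 = (student_df
 .explode('subject')  # one row per (student, subject) pair
 .loc[lambda d: ~d['subject'].str.contains("maths")]  # keep rows without "maths"
)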

output:

  student_name  subject
0        smith  english
1        jones  english
2         alan      art
2         alan  english
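The remark about exact versus substring matching can be illustrated with a small sketch. The subject "further maths" does not appear in the question; it is a hypothetical value added only so that the two filters give different results.

```python
import pandas as pd

# Hypothetical data: "further maths" is NOT from the question, it is
# added only to show where substring and exact matching differ.
df = pd.DataFrame({
    "student_name": ["smith", "smith", "alan", "alan"],
    "subject": ["maths", "english", "further maths", "art"],
})

# Substring match: drops every row whose subject contains "maths".
no_substring = df.loc[lambda d: ~d["subject"].str.contains("maths")]

# Exact match: drops only the rows whose subject equals "maths".
no_exact = df.loc[lambda d: d["subject"].ne("maths")]

print(no_substring["subject"].tolist())  # ['english', 'art']
print(no_exact["subject"].tolist())      # ['english', 'further maths', 'art']
```

Note that `str.contains` interprets its pattern as a regular expression by default; passing `regex=False` makes it a literal substring test.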

answered May 10 '22 at 08:13

mozway


OP says *However, instead of just deleting the rows containing maths it deletes all rows as if they hadn't been exploded/reindexed.*, but your output is exactly the same as what his code produces. – Ynjxsjmh May 10 '22 at 08:19
@Ynjxsjmh I read it the other way around, that OP does **not** want to remove all. OP should provide the full explicit output for clarity – mozway May 10 '22 at 08:25
