Drop rows that contain the same value in a pandas DataFrame

Question

I'm currently working on a data frame like the one below:

artist	week1	week2	week3	week4
Drake	2	2	3	1
Muse	NA	NA	NA	NA
Bruno Mars	3	3	4	2
Imagine Dragons	NA	NA	NA	NA
Justin Timberlake	2	2	NA	1

What I want to do is to drop the rows that only contain "NA" values. The result should be something like this:

artist	week1	week2	week3	week4
Drake	2	2	3	1
Bruno Mars	3	3	4	2
Justin Timberlake	2	2	NA	1

I've tried using the pandas drop() function, but it drops every row with at least one "NA" value. In that case, the row for Justin Timberlake would be dropped, which is not what I need.

Jamiu S. · Accepted Answer · 2023-02-03T22:46:40.473


Use df.dropna() with how='all', which drops a row only if all of its values are NA, and pass the week columns as subset so that the artist column is not taken into account.

df = df.dropna(how='all', subset=['week1', 'week2', 'week3', 'week4'])
print(df)

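Alternatively, keep only the rows with at least 2 non-NA values. Because the artist name always counts as one non-NA value, this keeps any row that has at least one week value.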

df = df.dropna(thresh=2)
print(df)


              artist  week1  week2  week3  week4
0              Drake    2.0    2.0    3.0    1.0
2         Bruno Mars    3.0    3.0    4.0    2.0
4  Justin Timberlake    2.0    2.0    NaN    1.0
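
For anyone who wants to reproduce this, here is a minimal self-contained sketch of both calls. It assumes the "NA" entries in the question are real missing values (NaN) rather than the literal string "NA"; the names week_cols, by_all and by_thresh are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
# Assumption: "NA" in the question means a real missing value (NaN).
df = pd.DataFrame({
    "artist": ["Drake", "Muse", "Bruno Mars", "Imagine Dragons", "Justin Timberlake"],
    "week1": [2, np.nan, 3, np.nan, 2],
    "week2": [2, np.nan, 3, np.nan, 2],
    "week3": [3, np.nan, 4, np.nan, np.nan],
    "week4": [1, np.nan, 2, np.nan, 1],
})

week_cols = ["week1", "week2", "week3", "week4"]

# Drop a row only when every week column is missing.
by_all = df.dropna(how="all", subset=week_cols)

# Keep rows with at least 2 non-missing values; the artist name counts as one.
by_thresh = df.dropna(thresh=2)

print(by_all)
print(by_all.equals(by_thresh))
```

If the column actually holds the string "NA", it would first have to be converted to NaN (for example with df.replace("NA", np.nan)) before dropna() can see it as missing.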

edited Feb 03 '23 at 22:46

answered Feb 03 '23 at 22:16



Thanks! I have another question though. Is there another way to set the subset when the number of columns is too large to write out the name of each one? For example, if I have 300 week columns. – KurtosisCobain Feb 03 '23 at 22:40
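
One possible way to handle the many-columns case raised in this comment is to build the subset from df.columns instead of typing it out. The sketch below is not from the answer; it assumes the only non-week column is called artist and that every week column name starts with "week". The frame wide and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame: one "artist" column plus 300 week columns, all NaN at first.
n_weeks = 300
week_cols = [f"week{i}" for i in range(1, n_weeks + 1)]
wide = pd.DataFrame(np.nan, index=range(3), columns=week_cols)
wide.insert(0, "artist", ["Drake", "Muse", "Justin Timberlake"])
wide.loc[0, "week1"] = 2      # Drake has one value
wide.loc[2, "week300"] = 1    # Justin Timberlake has one value

# Option 1: every column except the identifier column.
subset_a = wide.columns.drop("artist").tolist()

# Option 2: every column whose name starts with "week".
subset_b = [c for c in wide.columns if c.startswith("week")]

result_a = wide.dropna(how="all", subset=subset_a)
result_b = wide.dropna(how="all", subset=subset_b)
print(result_a["artist"].tolist())
```

Option 1 is simpler when everything except a few known columns should be checked; option 2 is safer when the frame also contains other non-week columns.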
