How to drop all values that are 0 in a single column pandas dataframe?

Question

I have a pandas DataFrame that was created from some raw data. There are hundreds of lines, so I will show only the first 10 rows.

I want to get rid of all the zeros in my data frame and rename the column from 'text' to 'Results', so the final data should look like this:

      Results

 0    26.529
 1    25.558

My method was to use the df.drop() method to drop all rows containing zeros. My code looks like this:

 df = df.drop(df[df['text'] == 0].index,inplace=True)

 # I haven't written the code to rename the column yet

Somehow, when I run this, the resulting df is empty (it is NoneType). I have no idea why the drop method dropped everything in my dataframe. Please help! Much appreciated in advance!

When I debug the code in VS Code, the values in my df look like the screenshot linked below [1].

I noticed that every element in my df is an object type. I want to get rid of all the arrays that contain only an empty string, e.g. "000:array([''],dtype=object)".


 [1]: https://i.stack.imgur.com/yk63P.png

Btw, the index values shown on the left are not actual values in the data frame. — CYU1, Feb 03 '22 at 20:41
You don't need to `df.drop()`, just use boolean masking itself: `df = df[df["text"] != 0]` — ddejohn, Feb 03 '22 at 20:47
Hi, I added a screenshot of the variable in my df above. You can copy and paste the image link into your browser to see what I got. Thanks. — CYU1, Feb 17 '22 at 16:28
The reason you're getting an empty dataframe is because you're assigning an in-place operation. You can only *either* use `inplace=True` OR reassign the result via `df = ...`. You cannot do both, because when `inplace=True`, the operation modifies the original data and returns `None` (think of trying to do `my_list = my_list.append(3)`), which you are then assigning to `df`. — ddejohn, Feb 17 '22 at 17:01
https://stackoverflow.com/questions/43893457/understanding-inplace-true-in-pandas — ddejohn, Feb 17 '22 at 17:04
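
The `inplace=True` point made in the comments above can be reproduced with a small sketch. The sample frame and the names `scratch`, `result`, `dropped`, and `masked` are made up for illustration, and the sketch assumes the `text` column holds real numbers, which may not match the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data; the real frame is not shown in the question.
df = pd.DataFrame({"text": [26.529, 0.0, 25.558, 0.0]})

# With inplace=True, drop() modifies the frame it is called on and returns None,
# so writing `df = df.drop(..., inplace=True)` rebinds df to None.
scratch = df.copy()
result = scratch.drop(scratch[scratch["text"] == 0].index, inplace=True)

# Option 1: reassign the result and leave inplace at its default (False).
dropped = df.drop(df[df["text"] == 0].index)

# Option 2: skip drop() and keep only the rows that pass a boolean mask.
masked = df[df["text"] != 0]
```

Here `result` ends up as `None` while `scratch` itself has lost the zero rows; `dropped` and `masked` contain the same two remaining rows.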

BoomBoxBoy · Accepted Answer · 2022-02-17T16:29:05.010

You can do that with the following:

    df[df["text"].str.strip() != "0"].rename(columns={'text': 'Results'}).reset_index(drop=True)
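
As a minimal sketch on made-up data, assuming the column holds strings and the unwanted cells are literally "0" (possibly padded with spaces), the expression behaves as follows. The sample values and the name `results` are hypothetical.

```python
import pandas as pd

# Hypothetical string data: two padded zeros and one empty string.
df = pd.DataFrame({"text": ["26.529", " 0", "25.558", "0 ", ""]})

results = (
    df[df["text"].str.strip() != "0"]    # drop cells that are "0" after trimming spaces
    .rename(columns={"text": "Results"}) # rename the column
    .reset_index(drop=True)              # renumber the remaining rows from 0
)
```

Note that the empty string in the last row is not equal to "0", so it passes through this filter untouched.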

edited Feb 17 '22 at 16:29

answered Feb 03 '22 at 20:49

BoomBoxBoy

Hi, this doesn't work for me. I tried and the results still contain zeros. – CYU1 Feb 17 '22 at 16:09
What is the output when you run `df["text"].dtype`? I suspect there could be spaces around the 0's. – BoomBoxBoy Feb 17 '22 at 16:13
They are objects. For example, when I ran the code in the debugger with a breakpoint set after the df was created, I could see that the df variables contain something like this: "000:array([''],dtype=object)". – CYU1 Feb 17 '22 at 16:25
I edited my answer to strip spaces, let me know how it goes – BoomBoxBoy Feb 17 '22 at 16:30
Hi, I just tried. Unfortunately, the results are still the same. Zeros are still in the df. – CYU1 Feb 17 '22 at 16:34
I'm not sure what else it could be. Is this a public dataset I can access? If not, I don't have any more ideas :( – BoomBoxBoy Feb 17 '22 at 16:40
This is not a public dataset. But I added a link to the snapshot of the variables in my df above. Did you get a chance to see it? It might help. – CYU1 Feb 17 '22 at 16:49
Yes I did. It is tough to get a sense of what is wrong from that image. Are the rows with empty strings those that have 0 in them? – BoomBoxBoy Feb 17 '22 at 17:26
Yes, those rows turn out to be zeros. Could it be that they are object datatypes so they can't be dropped or stripped? Or maybe my df wasn't properly created. – CYU1 Feb 17 '22 at 17:40
Could be. Maybe try filtering on empty strings as well, something like `df[df["text"]!=""]` – BoomBoxBoy Feb 17 '22 at 17:45
Hi, here's a link of the context of this question: https://stackoverflow.com/questions/71162678/how-to-properly-extract-information-from-bounded-regions-from-images-using-openc – CYU1 Feb 17 '22 at 17:45
Hi, `df[df["text"]!=""]` didn't work either. What a stubborn df I have created lol! – CYU1 Feb 17 '22 at 17:49
It's never easy lol. Good luck my friend! – BoomBoxBoy Feb 17 '22 at 17:50
Thanks, I will update my post if I find a solution. – CYU1 Feb 17 '22 at 17:51

score 0 · Answer 2 · answered Feb 17 '22 at 18:57

I found a solution for this problem:

First I converted the data type from object to float64 in my df:

    df['text'] = pd.to_numeric(df['text'])

Then I proceeded to drop the 'nan' values from the df using:

    df = df.dropna()

This works for me!
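
For readers who want to try this end to end, here is a hedged sketch on made-up data. It assumes the cells are plain strings rather than the arrays seen in the debugger. The `errors="coerce"` argument, the explicit `!= 0` filter, and the rename are additions that are not part of the two lines above: `errors="coerce"` makes `pd.to_numeric` turn anything it cannot parse into NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical sample: numbers stored as strings, plus an empty string,
# a non-numeric string, and a literal zero.
df = pd.DataFrame({"text": ["26.529", "", "25.558", "0", "n/a"]})

# Convert object/string values to floats; unparseable cells become NaN.
df["text"] = pd.to_numeric(df["text"], errors="coerce")

# Drop the NaN rows, then any rows that really are zero.
df = df.dropna(subset=["text"])
df = df[df["text"] != 0]

# Rename the column as asked in the question and renumber the rows.
df = df.rename(columns={"text": "Results"}).reset_index(drop=True)
```

On this sample only the rows holding 26.529 and 25.558 remain, in a float column named `Results`.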

CYU1
