1

I want to drop the rows of a dataframe that have the value '0' in the column 'candidate'. In some of my dataframes, this column contains only the value '0'. I expected to get an empty dataframe in that case, but instead I get the following warning and the dataframe comes back unchanged. How can I get an empty dataframe in this case, or at least avoid getting the unchanged dataframe back?

Warning message:

C:\Users\User\Anaconda3\lib\site-packages\pandas\core\ops\array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  res_values = method(rvalues)

My code:

with open(filename, encoding='utf-8') as file:
    df = pd.read_csv(file, sep=',')
    df.drop(df.index[(df['candidate'] == '0')], inplace=True)
    print(df)

     post id  ...  candidate
0          1  ...          0
1          1  ...          0
2          1  ...          0
3          1  ...          0
4          1  ...          0
..       ...  ...        ...
182       10  ...          0
183       10  ...          0
184       10  ...          0
185       10  ...          0
186       10  ...          0

[187 rows x 4 columns]
user9397006
  • 1
    It's not an error, just a warning. See this post: https://stackoverflow.com/questions/40659212/futurewarning-elementwise-comparison-failed-returning-scalar-but-in-the-futur – vishnufka Jun 12 '20 at 09:19
  • 1
    A more natural way to drop would be `df.loc[df["candidate"]!="0"]`. It would probably avoid the warning as well – AMH Jun 12 '20 at 09:20
  • 1
    @AMH I tried this out, but unfortunately it doesn't change anything; I still get the warning and the same output – user9397006 Jun 12 '20 at 09:31
  • 2
    Maybe check your data types with `df.info()`, when reading the csv `candidate` should either be float or int – AMH Jun 12 '20 at 10:40

2 Answers

1

Thanks everyone for your suggestions!

Indeed, the column's dtype is int, but only when 0 is the only value in the column. When other values are present, the dtype is object.

So I solved the problem by using:

df = df.loc[(df["candidate"] != "0") & (df["candidate"] != 0)]
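An alternative to testing both types, offered only as a sketch: convert `candidate` to text before filtering, so one comparison covers both the int and the object case. The helper name `drop_zero_candidates` and the sample CSV text are made up for illustration and are not part of the original code.

```python
import io
import pandas as pd


def drop_zero_candidates(df: pd.DataFrame) -> pd.DataFrame:
    """Return the rows whose 'candidate' value is not zero (hypothetical helper)."""
    # Convert to str so that the integer 0 and the string '0' compare the same way.
    as_text = df["candidate"].astype(str).str.strip()
    return df.loc[as_text != "0"]


# Made-up sample: the column holds only zeros, so pandas infers an integer dtype.
only_zeros = pd.read_csv(io.StringIO("post id,candidate\n1,0\n1,0\n"))
# Made-up sample: a non-numeric value makes pandas read the column as text.
mixed = pd.read_csv(io.StringIO("post id,candidate\n1,0\n2,abc\n"))

empty_result = drop_zero_candidates(only_zeros)
mixed_result = drop_zero_candidates(mixed)
print(empty_result)
print(mixed_result)
```

One caveat of this sketch: if the column were parsed as float, `0.0` would become the string '0.0' and would not be dropped.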

user9397006
0

Your current implementation indexes `df.index` directly with the condition. Instead, select the rows that match the condition first and then take their index:

df.drop(df[df['candidate'] == 0].index, inplace=True)

After replacing the line your snippet should return:

Empty DataFrame
Columns: [post id, ..., candidate]
Index: []

You should also check that the dtype of the column matches the type of the value you are comparing against: an integer column never equals the string '0'.
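One way to act on this, as a rough sketch: inspect the dtype that `read_csv` inferred and compare against a value of that type, or pin the dtype with the `dtype` argument so the comparison value is known in advance. The CSV text below is invented for illustration.

```python
import io
import pandas as pd

csv_text = "post id,candidate\n1,0\n1,0\n10,0\n"  # invented sample data

# Let pandas infer the type: a column of only zeros becomes an integer column.
inferred = pd.read_csv(io.StringIO(csv_text))
print(inferred["candidate"].dtype)

# Compare against the integer 0 so the value matches the inferred dtype.
inferred = inferred.drop(inferred[inferred["candidate"] == 0].index)

# Alternatively, force the column to be read as text so that '0' always matches.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"candidate": str})
as_text = as_text[as_text["candidate"] != "0"]

print(inferred.empty, as_text.empty)
```

Pinning the dtype at read time keeps the filter the same for every file, whatever values the column happens to contain.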

kampmani