
I have a dataset that contains some stray "-" values in the tail rows. Because of them, I'm not able to convert the datatypes of the columns. How can I delete these "-" values using pandas?

df2 = df2.drop(df2[df2["Inns"] == "-" or df2["NO"] == "-" or df2["Runs"] == "-" or df2["HS"] == "-" or df2["Ave"] == "-" or df2["BF"] == "-" or df2["SR"] == "-" or df2["100"] == "-" or df2["50"] == "-" or df2[0]=="-"].index)

df2

This is what I've used, and below is the error I'm getting:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Please help me remove it.

  • Your question needs a minimal reproducible example consisting of sample input, expected output, actual output, and only the relevant code necessary to reproduce the problem. See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for best practices related to Pandas questions. – itprorh66 May 15 '23 at 18:29

1 Answer


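You can suppress errors when converting datatypes by passing the errors argument to the pandas.DataFrame.astype method. Note that errors='ignore' does not convert the offending values: if any value fails to convert, the original column is returned unchanged.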

import pandas as pd

df = pd.DataFrame([{'col': 22},
                   {'col': '-'}])

df['col'] = df['col'].astype(int, errors='ignore')
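As an alternative that is not part of the original answer, pd.to_numeric with errors='coerce' turns values it cannot parse, such as "-", into NaN instead of leaving the column untouched. A minimal sketch on the same toy frame:

```python
import pandas as pd

df = pd.DataFrame([{'col': 22},
                   {'col': '-'}])

# Values that cannot be parsed as numbers ("-" here) become NaN,
# so the resulting column has a float dtype.
df['col'] = pd.to_numeric(df['col'], errors='coerce')
```

The NaN rows can then be dropped with df.dropna() if they are not needed.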

If you want to convert to float, you can use pandas.DataFrame.replace to replace the "-" values with np.nan or math.nan and then convert without issue.

import math
import pandas as pd
df = pd.DataFrame([{'col': 22},
                   {'col': '-'}])
df = df.replace('-', math.nan)
df['col'] = df['col'].astype(float)

If you want to convert to int, the second solution won't work, because NaN cannot be stored in a plain integer column; you have to replace "-" with 0 instead. For more info see this post.
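A minimal sketch of the replace-with-0 approach, assuming that treating "-" as 0 is acceptable for your data (this changes the meaning of the missing entries):

```python
import pandas as pd

df = pd.DataFrame([{'col': 22},
                   {'col': '-'}])

# Swap the placeholder for 0 first, then cast to a plain integer dtype.
df['col'] = df['col'].replace('-', 0).astype(int)
```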

Edit: As explained here, you can convert to integers and still keep the missing values by using the built-in pandas nullable integer type, Int64.

import math
import pandas as pd

df = pd.DataFrame([{'col': 22},
                   {'col': '-'}])
df = df.replace('-', math.nan)
df['col'] = df['col'].astype('Int64')
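If the goal is to drop the rows that contain "-" rather than replace the values, a boolean mask can do it. The error in the question comes from Python's `or`, which asks for a single truth value of a whole Series; element-wise conditions need `|` with each comparison in parentheses. The sketch below uses made-up sample data with a few column names borrowed from the question:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame({
    "Inns": ["10", "12", "-"],
    "Runs": ["250", "-", "-"],
    "HS": ["80", "45", "-"],
})

# True for every row that has "-" in at least one column.
has_dash = df2.isin(["-"]).any(axis=1)

# Keep only the rows without "-", then convert each column to a numeric dtype.
cleaned = df2[~has_dash].apply(pd.to_numeric)

# The same kind of mask for two columns, written with the element-wise `|`.
mask_two = (df2["Inns"] == "-") | (df2["Runs"] == "-")
```

Using isin across the whole frame avoids listing every column by hand; restrict it to a subset such as df2[["Inns", "Runs"]] if only some columns should be checked.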
PARAK