I have a pandas DataFrame built from a CSV file of Olympic medal winners and their information. Two of the columns in the file, Height and Weight, contain NA where the value is unknown. I want to delete the rows that contain "NA" as a string in these columns, but I keep getting the following error:

TypeError: invalid type comparison

The code I'm using to only show rows without NA in Height and Weight is:

features = features[features["Height"] != "NA"]

#and

features = features[features["Weight"] != "NA"]

These columns contain both integers and strings, so maybe I need to convert the integers to strings? I also need a solution that removes the NA entries in those two columns only, because I am purposely keeping the NA entries in the Medal column, so

features = features.dropna(subset=['Weight', 'Height'])

won't work after declaring any entry with "NA" a null value.

Thanks for the help!

  • I think you need `features = features.dropna(subset=['Weight', 'Height'])` if the `NA` entries are missing values – jezrael Aug 18 '18 at 05:27
  • When you import via `read_csv` you can specify that `'NA'` are null values by including the parameter `na_values=['NA']` in your `read_csv` call. – piRSquared Aug 18 '18 at 05:30
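
The two comments above can be sketched as follows. This is a minimal example, not the asker's actual data: the inline CSV, the `Name` column, and the variable names `csv_text`, `cleaned`, `raw`, and `cleaned_raw` are made up for illustration. It relies on `read_csv` treating `NA` as missing by default, and on `dropna(subset=...)` checking only the listed columns, so missing values in Medal should be left alone.

```python
import io

import pandas as pd

# Hypothetical stand-in for the Olympic CSV (assumed column names).
csv_text = """Name,Height,Weight,Medal
A,180,80,Gold
B,NA,75,NA
C,170,NA,Silver
D,165,60,NA
"""

# By default read_csv parses the text "NA" as a missing value (NaN),
# so Height and Weight become numeric columns rather than strings.
features = pd.read_csv(io.StringIO(csv_text))

# subset= restricts the check to these two columns; rows whose only
# missing value is in Medal are kept.
cleaned = features.dropna(subset=["Height", "Weight"])

# Alternative, assuming the "NA" entries really are literal strings:
# keep_default_na=False stops read_csv from converting them to NaN,
# and string comparison works because the columns are then text.
raw = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)
mask = (raw["Height"] != "NA") & (raw["Weight"] != "NA")
cleaned_raw = raw[mask]
```

In both variants only rows A and D remain, and row D keeps its missing Medal entry. If the comparison in the question raises a `TypeError`, that would be consistent with the columns already being numeric with `NaN`, in which case the first variant is likely the one that applies.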
