I have a DataFrame that contains two columns (as shown in the figure).

I am trying to drop the rows where one of the columns (FACILITYID) is NaN. I tried each of the following commands:

temp = temp[temp['FACILITYID'].notnull()]
temp = temp[temp['FACILITYID'].notna()]
temp = temp[~temp['FACILITYID'].isnull()]
temp = temp[~temp['FACILITYID'].isna()]
temp = temp[temp['FACILITYID']!='']

However, none of them removes the NaN rows. I followed the instructions in an existing thread ("Nan does not drop out in Python"), but with no luck. Could anyone point out where I am making the mistake?

user2293224

1 Answer

Most likely the elements printed as NaN are not missing values at all, but strings made up of those three letters. Probably all the other values in this column are strings as well (not numbers).

If they were "real" NaN values in a numeric column, the column would have been coerced to float (because NaN is a special float value) and all the numeric values would be printed with a trailing ".0".

To verify the type of each column run:

temp.info()

The printout will contain one row for each column (its name, the number of non-null values and the type). Caution: for a string column the type is printed as object.
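To make the diagnosis above concrete, here is a minimal sketch. The sample data and the NAME column are invented for illustration, and it assumes the text in the cells is exactly "NaN" (no surrounding whitespace, same capitalisation), which may not hold for the real data.

```python
import pandas as pd

# Hypothetical data: FACILITYID holds strings, including the literal text "NaN"
temp = pd.DataFrame({
    "FACILITYID": ["101", "NaN", "103", "NaN"],
    "NAME": ["a", "b", "c", "d"],
})

# isna() finds nothing, because "NaN" here is an ordinary string
n_missing = temp["FACILITYID"].isna().sum()

# Count the cells whose text is exactly "NaN"
n_text_nan = (temp["FACILITYID"] == "NaN").sum()

# Drop the rows that hold the literal string
cleaned = temp[temp["FACILITYID"] != "NaN"]
```

If n_missing is zero while n_text_nan is not, the column contains text rather than missing values, which would explain why notnull() and notna() left every row in place.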

Valdi_Bo
  • Thanks. I ran temp.info() and both columns have the object data type. Should I just use the astype() function to convert them to float? – user2293224 Nov 04 '19 at 20:37
  • Run *temp.FACILITYID = pd.to_numeric(temp.FACILITYID, errors='coerce')*. *temp.FACILITYID.astype('float')* also does the job. – Valdi_Bo Nov 04 '19 at 20:56
  • I ran the command but it throws the error message "AttributeError: 'Series' object has no attribute 'temp'". – user2293224 Nov 04 '19 at 21:15
  • From your picture I see that *temp* is a **DataFrame**, not a **Series**. Run *type(temp)* to check it. – Valdi_Bo Nov 04 '19 at 21:25
  • It is a DataFrame. When I run the type command it gives "" – user2293224 Nov 04 '19 at 21:27
  • Maybe you have an older version of *Pandas* that does not allow the *temp.FACILITYID* notation. Change it to *temp['FACILITYID']* (in each case). Another hint: upgrade your *Pandas* installation; the original notation may then work. – Valdi_Bo Nov 04 '19 at 21:28
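
The conversion suggested in the comments can be sketched as follows, using bracket notation throughout. The sample values are invented for illustration. With errors='coerce', pd.to_numeric turns any text it cannot parse into a real NaN, after which dropna() behaves as expected.

```python
import pandas as pd

# Hypothetical data: numbers stored as text, plus the literal string "NaN"
temp = pd.DataFrame({"FACILITYID": ["101", "NaN", "abc", "103"]})

# Convert to numbers; unparseable text becomes a real (float) NaN
temp["FACILITYID"] = pd.to_numeric(temp["FACILITYID"], errors="coerce")

# Now the missing values are genuine, so they can be dropped
temp = temp.dropna(subset=["FACILITYID"])
```

Note that the result must be assigned back to the column. Calling astype('float') or pd.to_numeric without assignment leaves the original DataFrame unchanged.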