pandas read_excel function replacing "NA" with blanks?

Question

This is probably a trivial problem but I cannot see how to proceed:

I have a dataframe called data that is taken from an excel file:

data = pd.read_excel("C:\\Users\\firstname.lastname\\Desktop\\variable+".xlsx", sheet_name=variable2, na_values='', usecols="A,B", skiprows=range(2))

This works fine to load the file however in column B there is a string that is precisely equal to "NA", in the resulting dataframe this doesn't show as NA but just appears blank.

I am assuming that pandas is interpreting this NA as a blank and as such leaving it empty.

But this is a legitimate NA and I want it to say NA not blank.

I have tried replacing:

na_values='NA'

and also using regex to replace the column values that are blank with "NA" (not the best solution I admit but would work in this case as column B has only one NA that becomes blank.

df.replace(r'^\s+$', np.nan, regex=True)

What am I doing wrong?

na_filter=False worked for me. – user9940344 Aug 20 '19 at 13:29 — user9940344, Aug 20 '19 at 13:29

score 2 · Answer 1 · answered Aug 20 '19 at 12:59

2

According to the pandas documentation, try to add keep_default_na=False to the parameters of pd.read_excel

answered Aug 20 '19 at 12:59

Aryerez

3,417
2
9
17

IT works for me. As keep_default_na=False will not touch the data although the value is "N/A". – Oscar Oct 31 '22 at 11:12

pandas read_excel function replacing "NA" with blanks?

1 Answers1