This is probably a trivial problem but I cannot see how to proceed:
I have a dataframe called data that is taken from an excel file:
data = pd.read_excel("C:\\Users\\firstname.lastname\\Desktop\\variable+".xlsx", sheet_name=variable2, na_values='', usecols="A,B", skiprows=range(2))
This works fine to load the file however in column B there is a string that is precisely equal to "NA", in the resulting dataframe this doesn't show as NA but just appears blank.
I am assuming that pandas is interpreting this NA as a blank and as such leaving it empty.
But this is a legitimate NA and I want it to say NA not blank.
I have tried replacing:
na_values='NA'
and also using regex to replace the column values that are blank with "NA" (not the best solution I admit but would work in this case as column B has only one NA that becomes blank.
df.replace(r'^\s+$', np.nan, regex=True)
What am I doing wrong?