I am trying to clean a dataset in pandas, information is stored ona csv file and is imported using:
tester = pd.read_csv('date.csv')
Every column contains a '?' where the value is missing. For example there is an age column that contains 9 question marks (?)
I am trying to set the all the question marks to NaN, i have tried:
tester = pd.read_csv('date.csv', na_values=["?"])
tester['age'].replace("?", np.NaN)
tester.replace('?', np.NaN)
for col in tester :
print tester[col].value_counts(dropna=False)
Still returns 0 for the age when I know there is 9 (?s). In this case I assume the check is failing as the value is never seen as being ?.
I have looked at the csv file in notepage and there is no space etc around the character.
Is there anyway of forcing this so that it is recognised?