I have an existing dataframe with this structure:
import pandas as pd
d={'colA':['1','2','3','3','3'],'colB':['NaN','4','2','this','that']}
mydata=pd.DataFrame(data=d)
colA contains integers saved as strings. colB is also all strings, but holds a mix of integers, NaN and real strings.
I want to create a new column (colC) that checks if the integers in colB are greater than the integers in colA. But I can't figure out how to deal with the strings and NaNs.
The final dataframe should look like this:
d={'colA':[1,2,3,3,3],'colB':['NaN',4,2,'this','that'],'colC':['NaN','Yes','No','NaN','NaN']}
mydata_new=pd.DataFrame(data=d)
Thanks
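One possible approach, sketched here under a couple of assumptions: pd.to_numeric with errors='coerce' turns anything that is not a number (the 'NaN' string, 'this', 'that') into a real NaN, after which the comparison can be done on numeric values only. The helper names a and b are just illustrative, and the sketch assumes colC should hold the strings 'Yes', 'No' and 'NaN' as in the desired output above.

```python
import numpy as np
import pandas as pd

d = {'colA': ['1', '2', '3', '3', '3'], 'colB': ['NaN', '4', '2', 'this', 'that']}
mydata = pd.DataFrame(data=d)

# Convert both columns to numbers; non-numeric entries become real NaN values.
a = pd.to_numeric(mydata['colA'], errors='coerce')
b = pd.to_numeric(mydata['colB'], errors='coerce')

# Rows where either side is not a number cannot be compared.
not_comparable = a.isna() | b.isna()

# 'NaN' for non-comparable rows, otherwise 'Yes' if colB > colA, else 'No'.
mydata['colC'] = np.where(not_comparable, 'NaN', np.where(b > a, 'Yes', 'No'))

print(mydata)
```

colB itself is left untouched, so the original strings are preserved. If colA should also end up as integers, assigning a back to mydata['colA'] may be enough, assuming every value in colA really is numeric.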