I am trying to create dummy variables from integer comparisons on Series where NaN is common. A > comparison raises errors if there are any NaN values, but I want the comparison to return NaN instead. I understand that I could use fillna() to replace NaN with a value that I know will compare as false, but I am hoping there is a more elegant way. The fill value would have to change if I switched to a less-than comparison, or if the variable could be positive or negative, and that is one more opportunity to introduce errors. Is there any way to make 30 < NaN evaluate to NaN?
To be clear, I want this:
df['var_dummy'] = df[df['var'] >= 30].astype('int')
to return a null if var is null, 1 if it is 30+, and 0 otherwise. Currently I get ValueError: cannot reindex from a duplicate axis.
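
For reference, here is a minimal sketch of the output I am after, using a made-up four-row frame (the data and the var_dummy_nullable column name are illustrative assumptions, not my real data). As far as I can tell, the comparison on its own does not raise on NaN but returns False, so the sketch masks the missing rows afterwards with where() and notna(). The second variant assumes a pandas version with the nullable Int64 dtype and integer-valued data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; NaN marks a missing value.
df = pd.DataFrame({"var": [10, 30, np.nan, 45]})

# Compare first, cast to float, then blank out rows where the input is missing.
# where() keeps values where the condition is True and puts NaN elsewhere.
df["var_dummy"] = (df["var"] >= 30).astype(float).where(df["var"].notna())

# Alternative: nullable integer dtype, where comparisons propagate <NA>.
nullable = df["var"].astype("Int64")
df["var_dummy_nullable"] = (nullable >= 30).astype("Int64")

print(df)
```

The first variant has to store the dummy as float because a plain int column cannot hold NaN; the second keeps integer 0/1 values with <NA> for missing rows.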