I have a DataFrame df:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['-a', 1, 'a'],
                   'B': ['a', np.nan, 'c'],
                   'ID': [1, 2, 2],
                   't': [pd.Timestamp.now(), pd.Timestamp.now(),
                         pd.NaT]})
I then added a new column:
df['YearMonth'] = df['t'].map(lambda x: 100*x.year + x.month)
Now I want to write a function or macro that does a date comparison, creates a new DataFrame, and adds a new column to that DataFrame.
I tried the following, but it does not work:
def test(df,ym):
    df_new=df
    if(ym <= df['YearMonth']):
        df_new+"_"+ym=df_new
        return df_new+"_"+ym
    df_new+"_"+ym['new_col']=ym
When I call test, I want it to create a new DataFrame named df_new_201612. This new DataFrame should have one additional column, new_col, that holds the value of ym in every row.
test(df,201612)
The expected output is a new DataFrame:
df_new_201612
A    B    ID  t                           YearMonth  new_col
-a   a    1   2016-12-05 12:37:56.374620  201612     201612
1    NaN  2   2016-12-05 12:37:56.374644  201208     201612
a    c    2   NaT                         NaN        201612
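For reference, here is a minimal sketch of the behavior described above, under two assumptions that are not stated in the question. First, since Python cannot build a variable name such as df_new_201612 at runtime in a clean way, the new frame is stored in a dict under that name. Second, the date comparison is treated as a gate: the frame is created if at least one row has YearMonth >= ym, and no rows are filtered out. The names `make_tagged_frame` and `frames` are hypothetical.

```python
import numpy as np
import pandas as pd

def make_tagged_frame(df, ym, frames):
    """Store a copy of df with a constant new_col under the key 'df_new_<ym>'."""
    # Assumption: the check passes if any row has YearMonth >= ym.
    # Rows with a NaN YearMonth compare as False and never trigger it.
    if not (df['YearMonth'] >= ym).any():
        return None
    name = f"df_new_{ym}"
    # assign() returns a new DataFrame, so the input df is left unchanged.
    frames[name] = df.assign(new_col=ym)
    return name

# Sample frame with the YearMonth values from the expected output.
df = pd.DataFrame({'A': ['-a', 1, 'a'],
                   'B': ['a', np.nan, 'c'],
                   'ID': [1, 2, 2],
                   'YearMonth': [201612, 201208, np.nan]})

frames = {}  # hypothetical container standing in for dynamic variable names
name = make_tagged_frame(df, 201612, frames)
print(name)
print(frames[name])
```

Keeping the frames in a dict avoids writing to globals(), and the result is then available as frames['df_new_201612'].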