I need to merge two dfs which have a lot of missing values (np.nan, None and (null) ).
t1= pd.DataFrame(np.array([[1,2,3],[4,5,99]]),columns=['a','b','c'])
t2= pd.DataFrame(np.array([[1,None,3,'hello'],[4,5,6,'moon']]),columns=['a','b','c','d'])
t = pd.merge(t1,t2,how='outer', on=["a","c"])
That is, the data frames are:
t1 =
a b c
0 1 2 3
1 4 5 99
t2 =
a b c d
0 1 None 3 hello
1 4 5 6 moon
I need a result df that gives me one row per observation, without loosing any data.
Instead, I get a new row keeping the 'None' as a value.
In the example above, I would like
t= pd.DataFrame(np.array([[1,2,3,'hello'],[4,5,99,'moon'],[4,5,6,'moon']]),columns=['a','b','c','d'])
That is, I would like:
t =
a b c d
0 1 2 3 hello
1 4 5 99 moon
2 4 5 6 moon