I have two dataframe:
df1:
A B
0 12 some text
1 28 some text_1
2 28 some text_2
3 40 some text
4 50 some text
df2:
C D
0 11 some text_1
1 11 some text_2
2 12 some text
3 28 some text
4 40 some text
5 60 some text
This two dataframes are ordered by column A
and C
(are float) respectively.
And I want:
df3
A B C D
0 NaN 11 some text_1
1 NaN 11 some text_1
2 12 some text 12 some text
3 28 some text_1 28 some text
4 28 some text_2 Nan
5 40 some text 40 some text
6 50 some text Nan
7 NaN 60 some text
So I want to concatenate dataframes by row with the same column value.
EDIT 1
By using df3 = df1.merge(df2, left_on=['A'], right_on=['C'], how='outer')
the result is the cartesian product of all the same rows with A
and C
values. This create a bigger dataframe in case, for example if I have in df1 n
times value x
and in df2 m
time the same x
, the result would be m*n (m > n)
rows with x value
.
But I want only to concat the first n
row of m from
df1
with the n
rows of df2
and leave the remaining m
-n
rows of df1
with a NaN value