Hello I would need to merge two dataframe
The first has Two column
df1
Groups Names
G1 SeqA
G1 Seq1(+)
G1 SeqC
G1 Seq2(-)
G1 Seq3(+)
G1 SeqB
G1 SeqG
G2 Seq8(-)
G2 SeqY
G2 SeqH
G3 Seq10(+)
G3 SeqK
G3 Seq12(-)
and another one such as df2:
COL1 COL2 COL3
Seq1(+) SeqA 89
Seq2(-) SeqA 67
Seq3(+) SeqB 89
Seq2(-) SeqC 98
Seq8(-) SeqY 45
Seq8(-) SeqH 90
Seq10(+) SeqK 76
Seq11(-) SeqL 78
Seq12(-) SeqK 89
and the idea is to kind of merge the two dataframe in order to add Groups
information into the df2
.
And the idea would be to first check in df1
for each element in Names
, if this element has a (+)
or a (-)
,then check its corresponding Group
and then go on the df2
and add at the column Group
the corresponding Group
.
Here for instance in df1
line 2
there is the Seq1(+)
, its Group
is G1
, then I look for the line in df2
where Seq1(+)
is present in COL1
and I add G1
to the column Group.
So at the end I should get something like :
Group COL1 COL2 COL3
G1 Seq1(+) SeqA 89
G1 Seq2(-) SeqA 67
G1 Seq3(+) SeqB 89
G1 Seq2(-) SeqC 98
G2 Seq8(-) SeqY 45
G2 Seq8(-) SeqH 90
G3 Seq10(+) SeqK 76
G3 Seq12(-) SeqK 89
As you can see Seq11(-)
is not present in the last df merged because it was not present in the df1
for G3
Does anyone have an idea using python and pandas ?