Is it possible to `pd.merge()` a `df` and a `csv` when the `df` column holds a list (which could contain more than one value) and the `csv` column holds only a single value?
df
   GV2015_VAL  polName
0    605000.0  [LENTEGEUR]
1         NaN  [DURBANVILLE]
2         NaN  [DURBANVILLE]
3    730000.0  [BISHOP LAVIS, GUGULETHU, MANENBERG]
4    625000.0  [LENTEGEUR]
csv
   name             m       p       j
0  LENTEGEUR     17.0   501.0   518.0
1  DURBANVILLE   10.0   495.0   505.0
2  BELHAR         9.0   352.0   361.0
3  MANENBERG     29.0  1013.0  1042.0
4  GUGULETHU      1.0   192.0   193.0
5  BISHOP LAVIS  10.0   495.0   505.0
`name` will match with `polName`.
Furthermore, the `j` column of `csv` should be aggregated (mean) when a `polName` list contains more than one value, so that the output for `df` should be:
df
   GV2015_VAL  polName                               merge_j
0    605000.0  [LENTEGEUR]                             518.0
1         NaN  [DURBANVILLE]                           505.0
2         NaN  [DURBANVILLE]                           505.0
3    730000.0  [BISHOP LAVIS, GUGULETHU, MANENBERG]    580.0
4    625000.0  [LENTEGEUR]                             518.0
Can the built-in `.merge()` handle this, or will looping / a list comprehension be necessary?
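For reference, one approach that seems to give the desired output without an explicit loop is to flatten the lists first, merge on the flattened names, and then average `j` per original row. This is only a sketch under some assumptions: `polName` holds real Python lists (not their string representation), `DataFrame.explode` is available (pandas 0.25 or later), and the two sample frames are rebuilt inline here instead of being read from file.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt inline (assumption: in practice csv would come from pd.read_csv)
df = pd.DataFrame({
    "GV2015_VAL": [605000.0, np.nan, np.nan, 730000.0, 625000.0],
    "polName": [
        ["LENTEGEUR"],
        ["DURBANVILLE"],
        ["DURBANVILLE"],
        ["BISHOP LAVIS", "GUGULETHU", "MANENBERG"],
        ["LENTEGEUR"],
    ],
})
csv = pd.DataFrame({
    "name": ["LENTEGEUR", "DURBANVILLE", "BELHAR",
             "MANENBERG", "GUGULETHU", "BISHOP LAVIS"],
    "m": [17.0, 10.0, 9.0, 29.0, 1.0, 10.0],
    "p": [501.0, 495.0, 352.0, 1013.0, 192.0, 495.0],
    "j": [518.0, 505.0, 361.0, 1042.0, 193.0, 505.0],
})

# One row per (original row, name) pair; reset_index keeps the
# original row label in a column called "index"
exploded = df[["polName"]].explode("polName").reset_index()

# Ordinary left merge, now that both key columns hold single names
merged = exploded.merge(
    csv[["name", "j"]], left_on="polName", right_on="name", how="left"
)

# Average j over all names that came from the same original row,
# then assign back (aligned on the original index labels)
df["merge_j"] = merged.groupby("index")["j"].mean()

print(df)
```

Names that have no match in `csv` come out of the left merge as `NaN`, and `mean()` skips them by default, so a row is only `NaN` in `merge_j` if none of its names matched.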