I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'code1': ["A", "B", "A"], 'code2': ["k", "l", "k"], 'Names': [['EUGENIO NETO', 'JUAN MATIAS SERAGOPIAN'], ['EUGENIO LUPORINI NETO'], ['SIMONE FANKHAUSER', 'ALEX SOUZA']]})
code1 code2 Names
0 A k [EUGENIO NETO, JUAN MATIAS SERAGOPIAN]
1 B l [EUGENIO LUPORINI NETO]
2 A k [SIMONE FANKHAUSER, ALEX SOUZA]
I want to group by code1 and code2 and combine the lists in Names, so that the result looks like this:
code1 code2 Names
0 A k [EUGENIO NETO, JUAN MATIAS SERAGOPIAN, SIMONE FANKHAUSER, ALEX SOUZA]
1 B l [EUGENIO LUPORINI NETO]
I have already checked the answers to this question:
Groupby and append lists and strings
I tried to adapt those answers to my case, but could not get any of them to work:
df['Names']=df[['code1','code2',"Names"]].groupby(['code1','code2'])["Names"].agg('sum')
----> ValueError: Function does not reduce
df['Names']=df[['code1','code2',"Names"]].groupby(['code1','code2'])["Names"].agg('Names')
----> AttributeError: 'SeriesGroupBy' object has no attribute 'Names'
df['Names']=df[['code1','code2',"Names"]].groupby(['code1','code2'])["Names"].transform(lambda x: append(x))
----> NameError: name 'append' is not defined
Am I missing something, or is this the wrong approach?
EDIT
Andrej and NYC Coder did suggest working solutions. But when I ran them on a larger dataset I got the same ValueError: Function does not reduce.
I looked into what could cause that and found this question: Pandas Groupby Agg Function Does Not Reduce
The accepted answer suggests using tuples, since lists are problematic. Another answer explains where this happens in the pandas code. Would tuples be the best way? How would I apply that here?
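For reference, this is a minimal sketch of what I understand the tuple idea to be, using the small example frame from above. The aggregation returns a tuple per group instead of a list, and the tuples are converted back to lists afterwards. The use of itertools.chain and the name combined are my own choices, not something taken from the linked answers, and I have not confirmed that this avoids the error on the larger dataset:

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    "code1": ["A", "B", "A"],
    "code2": ["k", "l", "k"],
    "Names": [
        ["EUGENIO NETO", "JUAN MATIAS SERAGOPIAN"],
        ["EUGENIO LUPORINI NETO"],
        ["SIMONE FANKHAUSER", "ALEX SOUZA"],
    ],
})

# Flatten the lists in each group into a single tuple, so the
# aggregation returns a tuple rather than a list.
combined = (
    df.groupby(["code1", "code2"])["Names"]
      .agg(lambda lists: tuple(chain.from_iterable(lists)))
      .reset_index()
)

# Convert the tuples back to lists to match the desired output.
combined["Names"] = combined["Names"].apply(list)

print(combined)
```

Note that the result has one row per group, so it is kept in a new frame here rather than assigned back to df['Names'] as in my attempts above.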