How do I combine the values in a column with the same column name without having to rename the column names (like what pandas does)

Question

so I have a csv file as below but with 1300 columns

id	Date	A	A	B	B	B	B	C	C	C
A	2009	Y				Y
B	2009	Y		Y	Y	Y
C	2011	Y	Y						Y	Y
D	2013		Y		Y	Y
E	2015						Y	Y		Y
F	2017				Y			Y	Y

I want to combine the column values of all columns that have the same column name to look like below:

id	Date	A	B	C
A	2009	Y	Y
B	2009	Y,Y	Y,Y,Y
C	2011	Y,Y		Y,Y
D	2013	Y	Y,Y
E	2015		Y	Y,Y
F	2017		Y	Y,Y

I've been searching on how to combine columns with same column names in excel, r and python, but most of the samples either involve summing up the numerical values instead of concatenating strings or have duplicate rows instead of columns.

I've tried the following code I saw on Stackoverflow

pd.concat(x for _, x in df.groupby(df.columns.duplicated(), axis=1))

But for some reason, I get this output:

id	Date	A	A.1	B	B.1	B.2	B.3	C	C.1	C.2
A	2009	Y				Y
B	2009	Y		Y	Y	Y
C	2011	Y	Y						Y	Y
D	2013		Y		Y	Y
E	2015						Y	Y		Y
F	2017				Y			Y	Y

I feel like this should be easy, but I can't seem to find a solution around this.

Can you please provide the input data in a form that's easier to create a df from? Setting up the df takes longer than to answer the question. — timgeb, Mar 24 '21 at 11:13
Also, does [this](https://stackoverflow.com/questions/24390645/python-pandas-merge-samed-name-columns-in-a-dataframe) answer your question? — timgeb, Mar 24 '21 at 11:15
@timgeb - not really - because I just realised that pandas changes the column name (hence the A.1, A.2, B.1, B.2) to make it unique - which I then have to revert the column names back to old names (that's 1300 of them). Is there a more efficient way to do this? — cms72, Mar 24 '21 at 13:17
Thanks @Pygirl for making them into tables. I didn't know how to do that. — cms72, Mar 24 '21 at 13:17
Use `df.columns = df.columns.str.split('.').str[0]` first for remove values after `.` — jezrael, Mar 24 '21 at 13:19
@jezrael that worked. I think I was going about it the long way. Thank you both! — cms72, Mar 24 '21 at 13:22
Why is this tagged with R and Excel if it's only about Python? — camille, Mar 24 '21 at 21:02
Hi @camille, because I was also trying to filter the data using R dplyr and excel consolidate. Nothing was working at the time - so I just tagged all 3. — cms72, Mar 25 '21 at 02:34

How do I combine the values in a column with the same column name without having to rename the column names (like what pandas does)

0 Answers0