My input data frame has more than 100 rows and columns. I want to combine the columns that share the same header.
This is my input data frame:
Case.ID HRAS TP53 MAP3K1 MAP3K1 TP53
TCGA_1 MSE; MSE;
TCGA_2 MUT;
TCGA_3
TCGA_4 MUT; AMP;
TCGA_5 MSE;
TCGA_6
TCGA_7 MUT;
TCGA_8 MUT; AMP;
TCGA_9 MUT;
TCGA_10
TCGA_11 FRM; st_gai;
TCGA_12 HDEL;
Expected output:
Case.ID HRAS TP53 MAP3K1
TCGA_1 MSE;
TCGA_2 MUT;
TCGA_3
TCGA_4 MUT;AMP;
TCGA_5 MSE;
TCGA_6
TCGA_7 MUT;
TCGA_8 MUT; AMP;
TCGA_9 MUT;
TCGA_10
TCGA_11 FRM;st_gai;
TCGA_12 HDEL;
In the expected output, columns with the same header are combined. If the same entry appears more than once in a row, it is printed only once. If different entries appear in a row, they are all kept together. The existing question "Combine two or more columns in a dataframe into a new column with a new name" does not cover this, because it only combines selected columns.
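To make the rule concrete, here is a minimal sketch in plain Python, not tied to any data frame library. The function name `combine_duplicate_columns` is hypothetical, and the column each value sits in within the example rows is an assumption made only for illustration.

```python
def combine_duplicate_columns(header, rows):
    """Merge columns that share a header, keeping each distinct entry once per row."""
    # Unique headers in order of first appearance.
    unique = list(dict.fromkeys(header))
    combined = []
    for row in rows:
        merged = {name: [] for name in unique}
        for name, cell in zip(header, row):
            # Skip empty cells and entries already seen for this header in this row.
            if cell and cell not in merged[name]:
                merged[name].append(cell)
        combined.append(["".join(merged[name]) for name in unique])
    return unique, combined


# Hypothetical rows; the column placement of each value is assumed.
header = ["Case.ID", "HRAS", "TP53", "MAP3K1", "MAP3K1", "TP53"]
rows = [
    ["TCGA_1", "", "MSE;", "", "", "MSE;"],  # same entry twice: kept once
    ["TCGA_4", "", "", "MUT;", "AMP;", ""],  # different entries: joined
    ["TCGA_3", "", "", "", "", ""],          # empty row stays empty
]
new_header, new_rows = combine_duplicate_columns(header, rows)
```

With these assumed rows, the `TCGA_1` row should keep a single `MSE;` under `TP53`, and the `TCGA_4` row should get `MUT;AMP;` under `MAP3K1`. I am looking for the equivalent operation on a data frame.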