I have a data frame where some columns have the same data, but different column names. I would like to remove duplicated columns, but merge the column names. An example, where test1 and test4 columns are duplicates:
df
test1 test2 test3 test4
1 1 1 0 1
2 2 2 2 2
3 3 4 4 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
and I would like the result to be something like this:
df
test1+test4 test2 test3
1 1 1 0
2 2 2 2
3 3 4 4
4 4 4 4
5 5 5 5
6 6 6 6
Here is the data:
structure(list(test1 = c(1, 2, 3, 4, 5, 6), test2 = c(1, 2, 4,
4, 5, 6), test3 = c(0, 2, 4, 4, 5, 6), test4 = c(1, 2, 3, 4,
5, 6)), .Names = c("test1", "test2", "test3", "test4"), row.names = c(NA,
-6L), class = "data.frame")
Please note that I do not simply want to remove duplicated columns. I also want to merge the column names of the duplicated columns, after the duplicates are removed.
I could do it manually for the simple table I posted, but I want to use this on large datasets, where I don't know in advance what columns are identical. I do not what to remove and rename columns manually, since I might have over 50 duplicated columns.