I have a data frame, df1:

ID <- c("A","B","C")
Measurement <- c("Length","Height","Breadth")
df1 <- data.frame(ID,Measurement)

I am trying to create every ordering (permutation) of the measurements, joined by underscores, and append each one as a new row with the ID "ALL".

Here is my desired output:

   ID           Measurement
    A                Length
    B                Height
    C               Breadth
  ALL Length_Height_Breadth
  ALL Length_Breadth_Height
  ALL Breadth_Height_Length
  ALL Breadth_Length_Height
  ALL Height_Length_Breadth
  ALL Height_Breadth_Length

Also, when all the values in the "Measurement" column are identical, I want the value to appear once, with no underscore.

For example:

ID <- c("A","B")
Measurement <- c("Length","Length")
df2 <- data.frame(ID,Measurement)

Then the desired output would be:

   ID           Measurement
    A                Length
    B                Length
  ALL                Length

I tried something like this, which is clearly wrong:

df1$ID <- paste(df1$Measurement, df1$Measurement, sep="_")

Can someone point me in the right direction for producing the outputs above?

I would like to see how this is done programmatically rather than by hard-coding the measurement names. I intend to apply the logic to a larger dataset with many measurement names, so a general solution would be much appreciated.


1 Answer

We could use the permn function from the combinat package:

library(combinat)
sol_1 <- sapply(permn(unique(df1$Measurement)), 
                FUN = function(x) paste(x, collapse = '_'))
rbind.data.frame(df1, data.frame('ID' = 'ALL', 'Measurement' = sol_1))

#    ID           Measurement
# 1   A                Length
# 2   B                Height
# 3   C               Breadth
# 4 ALL Length_Height_Breadth
# 5 ALL Length_Breadth_Height
# 6 ALL Breadth_Length_Height
# 7 ALL Breadth_Height_Length
# 8 ALL Height_Breadth_Length
# 9 ALL Height_Length_Breadth

# unique() collapses the repeated "Length", so nothing is left to join
sol_2 <- sapply(permn(unique(df2$Measurement)), 
                FUN = function(x) paste(x, collapse = '_'))
rbind.data.frame(df2, data.frame('ID' = 'ALL', 'Measurement' = sol_2))

#    ID Measurement
# 1   A      Length
# 2   B      Length
# 3 ALL      Length

Credit where credit is due: this builds on the question "Generating all distinct permutations of a list".

We could also use permutations from the gtools package (HT @joel.wilson):

library(gtools)
unique_meas <- as.character(unique(df1$Measurement))
apply(permutations(length(unique_meas), length(unique_meas), unique_meas),
      1, FUN = function(x) paste(x, collapse = '_'))

# "Breadth_Height_Length" "Breadth_Length_Height" 
# "Height_Breadth_Length" "Height_Length_Breadth"
# "Length_Breadth_Height" "Length_Height_Breadth"
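
If installing a package is not an option, the same idea can be sketched in base R. This is only an illustration, not part of either package: the helper names all_perms and add_all_rows are hypothetical, and the sketch assumes the data frame has exactly the columns ID and Measurement used in the question.

```r
# Hypothetical helper: every ordering of the elements of x, base R only.
all_perms <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    # Put x[i] first, then append each ordering of the remaining elements.
    for (p in all_perms(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], p)
    }
  }
  out
}

# Hypothetical wrapper: append one "ALL" row per ordering of the
# distinct measurements. Assumes columns named ID and Measurement.
add_all_rows <- function(df) {
  meas <- unique(as.character(df$Measurement))
  combos <- vapply(all_perms(meas), paste, character(1), collapse = "_")
  rbind(
    data.frame(ID = as.character(df$ID),
               Measurement = as.character(df$Measurement),
               stringsAsFactors = FALSE),
    data.frame(ID = "ALL", Measurement = combos, stringsAsFactors = FALSE)
  )
}

df1 <- data.frame(ID = c("A", "B", "C"),
                  Measurement = c("Length", "Height", "Breadth"))
df2 <- data.frame(ID = c("A", "B"),
                  Measurement = c("Length", "Length"))

res1 <- add_all_rows(df1)  # 3 original rows plus 3! = 6 "ALL" rows
res2 <- add_all_rows(df2)  # 2 original rows plus a single "ALL" row
```

Because the number of orderings grows factorially with the number of distinct measurements, this (like permn and permutations) is only practical when that number is small.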