library(dplyr)
x1 <- tibble(ID = rep(1,4),
Measure = c("Height","Weight","Width","Length"),
AD1= rpois(4,10),
AD2= rpois(4,9))
x2 <- tibble(ID = rep(2,4),
Measure = c("Height","Weight","Width","Length"),
AD1= rpois(4,20),
AD2= rpois(4,19))
xdiff <- x1 %>%
full_join(x2,by="Measure",
suffix=c("_1","_2")) %>%
mutate(diff1 = AD1_1-AD1_2,
diff2 = AD2_1-AD2_2)
ID_1 Measure AD1_1 AD2_1 ID_2 AD1_2 AD2_2 diff1 diff2
<dbl> <chr> <int> <int> <dbl> <int> <int> <int> <int>
1 1 Height 10 12 2 20 15 -10 -3
2 1 Weight 7 10 2 11 16 -4 -6
3 1 Width 12 11 2 22 17 -10 -6
4 1 Length 11 9 2 24 16 -13 -7
xdesire <- xdiff %>%
select(Measure,AD1_1,AD1_2,diff1,AD2_1,AD2_2,diff2)
Measure AD1_1 AD1_2 diff1 AD2_1 AD2_2 diff2
<chr> <int> <int> <int> <int> <int> <int>
1 Height 10 20 -10 12 15 -3
2 Weight 7 11 -4 10 16 -6
3 Width 12 22 -10 11 17 -6
4 Length 11 24 -13 9 16 -7
full_join(), with default arguments, will add suffixes to the duplicate variables in sequential order of x,x,x,y,y,y. I could technically use select() to order the columns as desired, but if the data have large amount of columns, e.g., AD1:AD200, it would be very inefficient.
dply: order columns alphabetically in R shows method of order the columns alphabetically, but in my case, there is "difference" column that is needed between every variables (AD1,diff1,AD2,diff).
I was wondering if full_join() allows duplicate variables to be alternating, and if I could directly add the calculations with them through join() function.
Of course, other methods, such as more advanced select() argument is welcomed too.