4

I want to apply a function to every pair of items within the same group.

Example input:

Group  Item   Value  
A      1       89   
A      2       76  
A      3       2  
B      4       21  
B      5       10  

The desired output is a vector containing the function's result for every pair of items within the same group.

For argument's sake, suppose the function is:

addnums <- function(x, y) {
  x + y
}

Then the desired output would be:

165, 91, 78, 31

I tried to do this with summarize from the dplyr package, but that only works when the output is a single value per group.

Sotos
Helen

2 Answers

4

We can split Value by Group and then use combn to compute the sum of each pair within a group.

sapply(split(df$Value, df$Group), combn, 2, sum)

#$A
#[1] 165  91  78

#$B
#[1] 31

If the result is needed as a single vector, we can use unlist.

unlist(sapply(split(df$Value, df$Group), combn, 2, sum), use.names = FALSE)
#[1] 165  91  78  31

If you are interested in a tidyverse solution using the same logic, we can do:

library(dplyr)
library(purrr)

df %>%
  group_split(Group) %>%
  map(~combn(.x %>% pull(Value), 2, sum)) %>% flatten_dbl

#[1] 165  91  78  31
Ronak Shah
  • I tried to do this by myself, however it doesn't work properly. Why `group_by()` doesn't work in this case? `data %>% group_by(Group) %>% {apply(combn(Value, m=2), 2, sum)}` – Adamm May 30 '19 at 13:35
  • @Adamm hmmm...I am not sure using `apply` would be the right choice here. – Ronak Shah May 30 '19 at 14:02
  • Thanks, that is great! However, I can't quite figure out how to modify it to use with my custom function addnums rather than the built in function sum. – Helen May 30 '19 at 14:20
  • @Helen you could use it like `sapply(split(df$Value, df$Group), function(x) combn(x, 2, function(y) addnums(y[1], y[2])))` – Ronak Shah May 30 '19 at 14:25
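
As a follow-up to the comment above, here is a minimal self-contained sketch of applying a custom two-argument function to each pair within a group. The helper name `pair_apply` and the extra single-item group `C` are hypothetical and added only for illustration. The length check is there because `combn()` treats a single positive number `n` as `seq_len(n)`, so a group with only one item could otherwise produce unexpected pairs.

```r
# Custom pairwise function from the question
addnums <- function(x, y) x + y

# Hypothetical helper: apply f to every pair of elements in `values`.
# Returns an empty vector when there are fewer than two items, because
# combn(21, 2) would otherwise enumerate pairs from 1:21.
pair_apply <- function(values, f) {
  if (length(values) < 2) return(values[0])
  combn(values, 2, FUN = function(p) f(p[1], p[2]))
}

# Example data from the question plus an assumed single-item group "C"
df <- data.frame(
  Group = c("A", "A", "A", "B", "B", "C"),
  Item  = 1:6,
  Value = c(89L, 76L, 2L, 21L, 10L, 7L)
)

# One result vector per group, then flattened into a single vector
res <- lapply(split(df$Value, df$Group), pair_apply, f = addnums)
out <- unlist(res, use.names = FALSE)
out
```

Group `C` contributes nothing to `out` because it has no pairs.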
2

We can group by Group with data.table

library(data.table)
setDT(df1)[, combn(Value, 2, FUN = sum), Group]
#   Group  V1
#1:     A 165
#2:     A  91
#3:     A  78
#4:     B  31

If we want to use addnums from the OP's post

setDT(df1)[, combn(Value, 2, FUN = function(x) addnums(x[1], x[2])), Group]
#   Group  V1
#1:     A 165
#2:     A  91
#3:     A  78
#4:     B  31

Or using tidyverse

library(dplyr)
library(tidyr)
df1 %>% 
  group_by(Group) %>%
  summarise(Sum = list(combn(Value, 2, FUN = sum)))  %>% 
  unnest(Sum)
# A tibble: 4 x 2
#  Group   Sum
#  <chr> <int>
#1 A       165
#2 A        91
#3 A        78
#4 B        31

Using addnums

df1 %>% 
 group_by(Group) %>%
 summarise(Sum = list(combn(Value, 2, FUN = 
         function(x) addnums(x[1], x[2])))) %>% 
 unnest(Sum)

Or using base R with aggregate

aggregate(Value ~ Group, df1, FUN = function(x) combn(x, 2, FUN = sum))
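
If a long-format result with one row per pair is wanted in base R, comparable to the data.table output above, one possible sketch is to combine `split()` with `stack()`. The names `sums` and `long` are placeholders chosen here for illustration.

```r
# Same example data as in the "data" section of this answer
df1 <- data.frame(
  Group = c("A", "A", "A", "B", "B"),
  Item  = 1:5,
  Value = c(89L, 76L, 2L, 21L, 10L)
)

# Pairwise sums per group, kept as a named list.
# An anonymous function is used because passing FUN = sum by name
# would be matched to lapply()'s own FUN argument.
sums <- lapply(split(df1$Value, df1$Group), function(v) combn(v, 2, FUN = sum))

# stack() turns the named list into a two-column data frame
long <- stack(sums)
names(long) <- c("Sum", "Group")
long <- long[, c("Group", "Sum")]
long
```

This assumes every group has at least two items.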

data

df1 <- structure(list(Group = c("A", "A", "A", "B", "B"), Item = 1:5, 
    Value = c(89L, 76L, 2L, 21L, 10L)), class = "data.frame", row.names = c(NA, 
-5L))
akrun
  • Thanks, this works well but it's quite important for me to use the addnums function that I defined rather than sum, is there a way of achieving this? – Helen May 30 '19 at 14:27
  • @Helen Thanks, I updated the post. I would keep the group info column to identify the values correctly – akrun May 30 '19 at 14:30