Let's say I have the following function:

get_answer <- function(condition, dp, rp){
  if(condition == "DD"){
    result <- rbinom(n = 2, size = 1, prob = dp)
  }
  
  if(condition %in% c("DR", "RD")){
    result <- c(rbinom(n = 1, size = 1, prob = dp), 
                rbinom(n = 1, size = 1, prob = rp))
  }
  
  if(condition == "RR"){
    result <- rbinom(n = 2, size = 1, prob = rp)
  }
  
  return(result)
}

I create a data.frame like so:

results_df <- data.frame(condition = c(rep("DD", 10000), rep("DR", 10000), rep("RR", 10000)))

I want to be able to get the vector returned from get_answer, for the condition in the column condition, and split the returned values into two columns -- the first value going into column P1 and the second going into column P2.

Something like this:

results_df %>% mutate(p1 = get_answer(condition, .6, .4)[0], p2 = get_answer(condition, .6, .4)[1])

What's the correct way to do this in dplyr?

Parseltongue

1 Answer

The function is not vectorized: each if statement expects a single condition value, so the function has to be applied to each row with rowwise. Also, indexing in R starts at 1, not 0.

library(dplyr)

# rowwise() makes mutate() call get_answer() once per row
results_df %>%
     rowwise() %>%
     mutate(p1 = get_answer(condition, .6, .4)[1],
            p2 = get_answer(condition, .6, .4)[2])
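As a rough illustration of why the attempt in the question fails, here is a small sketch. The function is copied from the question so the snippet runs on its own; the variable names x, zero_len, first and outcome are made up for this example. Whether the length-2 condition gives a warning or an error depends on the R version, so the snippet only records which one occurred.

```r
# Same function as in the question, repeated so this snippet is self-contained.
get_answer <- function(condition, dp, rp) {
  if (condition == "DD") {
    result <- rbinom(n = 2, size = 1, prob = dp)
  }
  if (condition %in% c("DR", "RD")) {
    result <- c(rbinom(n = 1, size = 1, prob = dp),
                rbinom(n = 1, size = 1, prob = rp))
  }
  if (condition == "RR") {
    result <- rbinom(n = 2, size = 1, prob = rp)
  }
  return(result)
}

# Index 0 selects nothing, so x[0] is an empty vector, not the first element.
x <- c(10, 20)
zero_len <- length(x[0])  # 0
first <- x[1]             # 10

# Without rowwise(), the whole column is passed in, so `condition == "DD"`
# has length > 1. `if` then signals a warning or an error (version dependent).
outcome <- tryCatch(
  {
    get_answer(c("DD", "RR"), 0.6, 0.4)
    "ok"
  },
  warning = function(w) "warning",
  error = function(e) "error"
)
```

With rowwise, each call receives a single condition value, which is what the if statements expect.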

Instead of invoking the function twice, we can store its result in a list column and then spread that into separate columns with unnest_wider from tidyr:

library(tidyr)

out <- results_df %>%
    rowwise() %>%
    # call get_answer once per row and keep both draws as a named list
    mutate(res = list(setNames(as.list(get_answer(condition, .6, .4)),
                               c("p1", "p2")))) %>%
    ungroup() %>%
    # spread the named list elements into columns p1 and p2
    unnest_wider(res)
    
akrun
  • Ah, makes sense. Thank you. I know this is outside the bounds of the question, but what would I need to do to make the function 'vectorized?' I'm not sure I've ever fully understood that. – Parseltongue Dec 06 '20 at 18:10
  • And won't this result in a call to `get_answer` twice? Is it possible to just call it once and split the first value into `p1` and the second value into `p2`? – Parseltongue Dec 06 '20 at 18:11
  • @Parseltongue `if`/`else` expects an input of length 1, while `ifelse`, which is vectorized, can take input of length > 1 – akrun Dec 06 '20 at 18:11
  • Thanks! It actually seems to run faster just calling the function twice – Parseltongue Dec 06 '20 at 18:20
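
On the follow-up about vectorizing: one possible sketch, not taken from the answer, is to let `ifelse` pick a probability for every row and pass those vectors to `rbinom`, which accepts a vector for `prob`. The name `get_answer_vec` is hypothetical. It mirrors the original for "DD", "DR", "RD" and "RR" (including treating "RD" the same as "DR"), but unlike the original it does not fail on other values.

```r
# Hypothetical vectorized variant of get_answer: takes the whole condition
# column at once and returns a data frame with one row per condition.
get_answer_vec <- function(condition, dp, rp) {
  n <- length(condition)
  # First draw uses rp only for "RR"; "DD", "DR" and "RD" use dp.
  prob1 <- ifelse(condition == "RR", rp, dp)
  # Second draw uses dp only for "DD"; "DR", "RD" and "RR" use rp.
  prob2 <- ifelse(condition == "DD", dp, rp)
  data.frame(p1 = rbinom(n, size = 1, prob = prob1),
             p2 = rbinom(n, size = 1, prob = prob2))
}

results_df <- data.frame(condition = c(rep("DD", 10000),
                                       rep("DR", 10000),
                                       rep("RR", 10000)))

# One call for the whole column, no rowwise() needed.
out <- cbind(results_df, get_answer_vec(results_df$condition, 0.6, 0.4))
```

Because nothing is evaluated row by row here, this avoids calling the function once per row.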