I previously asked a question about resampling and looping with dplyr functions. The accepted solution was working fine earlier, but now, instead of giving 8000 values, it produces only one value each for the mean and variance. R has also been throwing errors related to the 'stringi' package and refuses to recognize it even though it is installed. Could the two problems be related? If not, how can I obtain those 8000 values instead of a single value for the mean and variance?

The code I am currently running is:

  library(dplyr)
  fertilizer <- c("N","N","N","N","N","N","N","N","N","N","N","N","P","P","P","P","P","P","P","P","P","P","P","P","N","N","N","N","N","N","N","N","N","N","N","N","P","P","P","P","P","P","P","P","P","P","P","P")

    crop <- c("alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group","alone","group")

    level <- c("low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","high","low","low","high","low")

    growth <- c(0,0,1,2,90,5,2,5,8,55,1,90,2,4,66,80,1,90,2,33,56,70,99,100,66,80,1,90,2,33,0,0,1,2,90,5,2,2,5,8,55,1,90,2,4,66,0,0)

    dat <- data.frame(fertilizer, crop, level, growth)
    dat %>% 
      group_by(fertilizer, crop, level) %>% 
      sample_n(3*1000, replace = T) %>% 
      mutate(sample_id = rep(1:1000, each = 3)) %>% 
      group_by(sample_id, add = TRUE) %>% 
      summarise(
        mean = mean(growth, na.rm = T),
        var = sd(growth)^2
      ) %>% 
      ungroup()
Rspacer
  • Not clear about what you expect. If you add `sample_id` as a grouping variable along with the other groups, check the count for each of those: `dat %>% group_by(fertilizer, crop, level) %>% sample_n(3*1000, replace = T) %>% mutate(sample_id = rep(1:1000, each = 3)) %>% ungroup %>% count(fertilizer, crop, level, sample_id)` returns `# A tibble: 8,000 x 5`, which means you would get `8000` values for mean and sd – akrun Sep 15 '19 at 20:10
  • My question is: when I run the code above, why aren't 8000 values displayed after the `ungroup()`? – Rspacer Sep 15 '19 at 20:20
  • Okay, I get 8000 values though – akrun Sep 15 '19 at 20:21
  • `dat %>% group_by(fertilizer, crop, level) %>% sample_n(3*1000, replace = T) %>% mutate(sample_id = rep(1:1000, each = 3)) %>% group_by(sample_id, add = TRUE) %>% summarise(mean = mean(growth, na.rm = T), var = sd(growth)^2) %>% ungroup()` gives `# A tibble: 8,000 x 6` – akrun Sep 15 '19 at 20:21
  • It could be that you also loaded the `plyr` package along with `dplyr`. Try `dplyr::summarise(` instead of plain `summarise`, in case the function got masked by the function of the same name from `plyr` – akrun Sep 15 '19 at 20:22
  • YES! That worked – Rspacer Sep 15 '19 at 20:24

1 Answer

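This is most likely a masking issue: a function of the same name from another package is being called instead of the dplyr one. It commonly happens when both plyr and dplyr are loaded, typically with plyr attached after dplyr. plyr::summarise ignores the grouping set up by group_by and collapses the whole data frame to a single row. For example, plyr is not loaded here, but we get the same behavior if we explicitly call plyr::summarise: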

library(dplyr)
dat %>% 
  group_by(fertilizer, crop, level) %>% 
  sample_n(3*1000, replace = TRUE) %>% 
  mutate(sample_id = rep(1:1000, each = 3)) %>% 
  group_by(sample_id, add = TRUE) %>% 
  plyr::summarise(
    mean = mean(growth, na.rm = TRUE),
    var = sd(growth)^2
  ) %>% 
  ungroup()
#      mean      var
#1 30.98258 1390.291
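
To confirm the masking in your own session, you can ask R where a bare `summarise` comes from. This is only a minimal diagnostic sketch using base R and utils functions (`environment`, `environmentName`, `search`, `find`); the variable names are illustrative.

```r
library(dplyr)

# Which namespace does a bare `summarise` resolve to right now?
# "dplyr" is what the pipeline needs; "plyr" means dplyr's version is masked.
resolved_from <- environmentName(environment(summarise))
print(resolved_from)

# Positions of plyr and dplyr on the search path (NA = not attached).
# The package with the smaller position is found first.
positions <- match(c("package:plyr", "package:dplyr"), search())
print(positions)

# Every attached environment that defines an object named `summarise`
defined_in <- find("summarise")
print(defined_in)
```

If `resolved_from` is `"plyr"`, the one-row result above is expected, because the grouping is never used.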

The solution is one of the following:

1) Start a fresh R session with only dplyr loaded.

2) Stay in the same session and qualify the function with its package name using ::, i.e. dplyr::summarise() instead of simply summarise().
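
As a sketch of option 2, the pipeline below calls `dplyr::summarise` explicitly, so it should behave the same whether or not plyr is attached. The `toy` data frame is a hypothetical stand-in for `dat` (same three grouping columns, made-up `growth` values), and the regrouping step lists all four columns rather than using `add = TRUE`, an assumption made here only to sidestep the argument's renaming to `.add` in newer dplyr versions.

```r
library(dplyr)

# Hypothetical stand-in for `dat`: 2 x 2 x 2 = 8 groups, 6 rows per group
toy <- expand.grid(
  fertilizer = c("N", "P"),
  crop = c("alone", "group"),
  level = c("low", "high"),
  plot = 1:6,
  stringsAsFactors = FALSE
)
set.seed(1)
toy$growth <- sample(0:100, nrow(toy), replace = TRUE)

res <- toy %>%
  group_by(fertilizer, crop, level) %>%
  # Draw 1000 resamples of size 3 per group, with replacement
  sample_n(3 * 1000, replace = TRUE) %>%
  # Label each consecutive block of 3 rows within a group as one resample
  mutate(sample_id = rep(1:1000, each = 3)) %>%
  # Regroup by the original keys plus the resample id
  group_by(fertilizer, crop, level, sample_id) %>%
  # Package-qualified call: cannot be masked by plyr::summarise
  dplyr::summarise(
    mean = mean(growth, na.rm = TRUE),
    var = var(growth, na.rm = TRUE)
  ) %>%
  ungroup()

nrow(res)  # 8 groups x 1000 resamples = 8000
```

Qualifying only `summarise` is enough here, since it is the one function in this pipeline that the comments identify as masked.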

akrun