I would like to write my own functions that build on the dplyr package. I have seen a few examples, such as the janitor package and Organism.dplyr, but I don't know how to extend or inherit dplyr's features, or whether that is even possible.

For instance, this is what I want:

data %>% group_by(columnX) %>% my_mutate_like_function()

But it doesn't work. I saw a post that suggests using do() as an alternative, but that is not what I want.

Could anyone help me? Thanks.

== Code example (edited) ==

library(dplyr)  # provides %>%, group_by() and mutate()

data <- data.frame(groupname = c('A', 'B', 'A', 'A', 'B', 'B'),
                   value = c(1, 3, 4, 2, 1.4, 5))

my_mutate_like_function <- function(data) {
  data$category <- ifelse(data$value <= mean(data$value), 'In', 'Out')
  data$meanvalue <- mean(data$value)
  data
}

data_works <- data %>% group_by(groupname) %>%
  mutate(category = ifelse(value <= mean(value), 'In', 'Out'), meanvalue = mean(value))
# This is the right output: each "groupname" gets its own mean, which is then used as the threshold

data_fails <- data %>% group_by(groupname) %>%
  my_mutate_like_function()
# The grouping set by group_by() seems to be ignored inside my function
  • Can you show the code that doesn't work along with a data sample and the expected output? – eipi10 Aug 03 '17 at 02:41
  • I added a code example. – andrefonseca Aug 03 '17 at 04:08
  • sorry if this is a stupid comment but what about just making use of dplyr inside your function, i.e. a wrapper? or do you want to use the "baseR approach" inside your function? – friep Aug 03 '17 at 07:26
  • If I understand your suggestion, it's something like just writing a "dplyr::mutate" inside my function, right? That is actually how I'm coding it now. However, the problem is that even when coded like this, the functions still do not "extend" the dplyr::group_by behavior. Could you send me a code snippet of your suggestion? – andrefonseca Aug 03 '17 at 22:28

2 Answers

According to the question "dplyr::mutate to add multiple values", there is no elegant way to get two return values from one function in dplyr. To make use of group_by(), I only managed to get it working by wrapping the function in a mutate() call. That makes sense, as mutate() handles the grouping properly before passing the values to your new function. I added a print(mean(value)) to make this visible.

So a possible solution would be:

my_mutate_like_function1 <- function(value) {
  ifelse(value <= mean(value), 'In', 'Out')
}

my_mutate_like_function2 <- function(value) {
  print(mean(value))  # printed once per group, which shows the grouping is respected
  mean(value)
}

data %>% group_by(groupname) %>%
  mutate(category = my_mutate_like_function1(value),
         meanvalue = my_mutate_like_function2(value))
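Newer dplyr releases (1.0.0 and later) relax the two-return-values limitation mentioned above: when an unnamed argument to mutate() evaluates to a data frame, its columns are unpacked into the result. The following is only a minimal sketch under that version assumption; the helper name in_out_with_mean is hypothetical and not part of dplyr.

```r
library(dplyr)  # assumes dplyr >= 1.0.0

data <- data.frame(groupname = c('A', 'B', 'A', 'A', 'B', 'B'),
                   value = c(1, 3, 4, 2, 1.4, 5))

# Hypothetical helper: returns both new columns at once as a tibble.
# The scalar group mean is recycled to the length of `value`.
in_out_with_mean <- function(value) {
  group_mean <- mean(value)
  tibble(
    category = ifelse(value <= group_mean, 'In', 'Out'),
    meanvalue = group_mean
  )
}

# The helper is called once per group because it runs inside mutate();
# passing its result unnamed lets mutate() unpack it into two columns.
result <- data %>%
  group_by(groupname) %>%
  mutate(in_out_with_mean(value)) %>%
  ungroup()

result
```

On older dplyr versions this unpacking is not available, so the two-function approach above remains the safer choice there.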
Jan

There is a good guide on the dplyr website: https://dplyr.tidyverse.org/articles/programming.html

Your case is fairly simple, but these functions can get trickier, and you might then need quasiquotation.

This seems to work, though.

library(tidyverse)

data <- data.frame(groupname = c('A', 'B', 'A', 'A', 'B', 'B'),
                   value = c(1, 3, 4, 2, 1.4, 5))
data <- as_tibble(data)  # as.tibble() is deprecated in favour of as_tibble()
data

# Uses dplyr verbs internally, so any grouping on the input is respected
my_mutate_like_function <- function(data) {
  data %>%
    mutate(mean.val = mean(value)) %>%
    mutate(category = ifelse(value <= mean.val, "in", "out"))
}

new.df <- my_mutate_like_function(data)
new.df
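To connect this back to the question, here is a minimal sketch of what happens when the same kind of function receives grouped input. Because it calls mutate() internally, it should pick up the grouping set by group_by(). The second function, my_mutate_by_column, is a hypothetical variant that takes the column as an argument using the embrace operator {{ }} described in the programming guide; it assumes a reasonably recent dplyr and rlang.

```r
library(dplyr)

data <- data.frame(groupname = c('A', 'B', 'A', 'A', 'B', 'B'),
                   value = c(1, 3, 4, 2, 1.4, 5))

# Same idea as above: dplyr verbs inside, so grouping on the input is honoured
my_mutate_like_function <- function(data) {
  data %>%
    mutate(mean.val = mean(value),
           category = ifelse(value <= mean.val, "in", "out"))
}

# Hypothetical variant: the column to summarise is passed as an argument
my_mutate_by_column <- function(data, value_col) {
  data %>%
    mutate(mean.val = mean({{ value_col }}),
           category = ifelse({{ value_col }} <= mean.val, "in", "out"))
}

# Grouped input: the mean is computed per groupname
grouped_result <- data %>%
  group_by(groupname) %>%
  my_mutate_like_function()

# Ungrouped input: one overall mean for the whole table
ungrouped_result <- my_mutate_like_function(data)

# The embraced version gives the same grouped result
embraced_result <- data %>%
  group_by(groupname) %>%
  my_mutate_by_column(value)
```

The design choice here is to keep all column work inside dplyr verbs; assigning with data$column inside the function, as in the question, operates on the whole data frame and ignores the groups.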
william3031