
Please advise, I have the following simple function:

mifflin_equation <- function(gender = "M", 
                             w_kg = 50,
                             h_cm = 180, 
                             age = 40,
                             activity_type = "sedentary") {

  activity_types <- c("sedentary", "light", "moderate", "active")

  if (!(tolower(activity_type) %in% activity_types)) {

    activity_type <- "sedentary"

  }

  activity_trans_table <- tibble(type = activity_types,
                                 activity_coeff = c(1.2, 1.375, 
                                                    1.55, 1.725))

  activity_coeff <- activity_trans_table$activity_coeff[activity_trans_table$type == tolower(activity_type)]

  common_equation <- (10 * w_kg) + (6.25 * h_cm) - (5 * age)

  if (gender == "M") {

    return((common_equation + 5) * activity_coeff)

  } else if (gender == "F") {

    return((common_equation - 161) * activity_coeff)

  }

}

I am building a grid of all option combinations:

age <- seq.int(30,90)
h <- seq.int(150, 200)
w <- seq.int(40, 150)
activity <- c("sedentary", "light", "moderate", "active")
gender <- c("M", "F")

all_options <- expand.grid(age = age, h = h, w = w, activity = activity, gender = gender)

But when I try to add a calculated column with dplyr::mutate using the above function, the first value is calculated correctly and all the others are NA:

mifflin_options <- all_options %>%
  dplyr::mutate(mifflin_eq_calories = mifflin_equation(gender = gender, 
                                                       w_kg = w, 
                                                       h_cm = h,
                                                       age = age,
                                                       activity_type = activity))

I am aware that if it were only one variable I could use sapply, but what is the solution here?

SteveS
  • That should be resolved by adding `rowwise` or using `pmap`. Try `all_options %>% rowwise() %>% dplyr::mutate(mifflin_eq_calories = mifflin_equation(gender = gender, w_kg = w, h_cm = h, age = age, activity_type = activity))`. Also, do you really need a data frame with 2762568 rows to reproduce your problem? A 5-10 row data frame would have worked as well. – Ronak Shah May 13 '19 at 09:23
  • @RonakShah yes I already discovered it. Thanks. You can add an answer for others and I will accept, thanks :) – SteveS May 13 '19 at 09:34
  • @RonakShah I just showed what I have done, next time I will use minimal example. – SteveS May 13 '19 at 09:35
  • @RonakShah please add to your answer if it's possible to parallelize this calculation. – SteveS May 13 '19 at 09:36

2 Answers


Here are some options that should give you the expected output:

library(dplyr)
library(purrr)
temp <- head(all_options)

1) rowwise

temp %>%
  rowwise() %>%
  mutate(mifflin_eq_calories = mifflin_equation(gender = gender, 
                                                   w_kg = w, 
                                                   h_cm = h,
                                                   age = age,
                                                   activity_type = activity))

2) pmap

temp %>% mutate(mifflin_eq_calories = pmap_dbl(
            list(gender, w, h, age, activity), mifflin_equation))

3) Base R mapply

mapply(mifflin_equation, temp$gender, temp$w, temp$h, temp$age, temp$activity)

4) Vectorize your function

new_fun <- Vectorize(mifflin_equation)

4a) apply using mutate

temp %>%
 mutate(mifflin_eq_calories = new_fun(gender = gender, 
                                      w_kg = w, 
                                      h_cm = h,
                                      age = age,
                                      activity_type = activity))

4b) Or directly

new_fun(temp$gender, temp$w, temp$h, temp$age, temp$activity)

5) data.table

library(data.table)
setDT(temp)[, ans := mifflin_equation(gender, w, h, age, activity), by = 1:nrow(temp)]
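
A further option, sketched here under assumptions, is to rewrite the function so that it is vectorized itself, which avoids calling it once per row. The name `mifflin_vec` and the named-vector lookup are illustrative and not part of the original function. One behavioural difference: an unknown gender yields NA here, whereas the original returns NULL.

```r
# Hypothetical vectorized rewrite; `mifflin_vec` is an illustrative name.
mifflin_vec <- function(gender = "M", w_kg = 50, h_cm = 180, age = 40,
                        activity_type = "sedentary") {
  # A named lookup vector replaces the tibble translation table.
  coeffs <- c(sedentary = 1.2, light = 1.375, moderate = 1.55, active = 1.725)

  # as.character() guards against factors (expand.grid creates them by default).
  activity_type <- tolower(as.character(activity_type))
  # Unknown activity types fall back to "sedentary", as in the original.
  activity_type[!(activity_type %in% names(coeffs))] <- "sedentary"
  activity_coeff <- unname(coeffs[activity_type])

  # Gender offset: +5 for "M", -161 for "F", NA for anything else.
  gender <- as.character(gender)
  offset <- ifelse(gender == "M", 5, ifelse(gender == "F", -161, NA_real_))

  (10 * w_kg + 6.25 * h_cm - 5 * age + offset) * activity_coeff
}

# Small example grid with the same column names as all_options.
small <- expand.grid(age = 30:32, h = 150, w = 40,
                     activity = c("sedentary", "light"),
                     gender = c("M", "F"))
small$mifflin_eq_calories <- mifflin_vec(small$gender, small$w, small$h,
                                         small$age, small$activity)
```

Because every step operates on whole vectors, this can be used directly inside `mutate` without `rowwise()`.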
Ronak Shah

We can use Map from base R

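temp <- head(all_options)
# Map passes each column by name; h, w and activity reach h_cm, w_kg and
# activity_type through R's partial argument matching.
unlist(do.call(Map, c(f = mifflin_equation, temp)))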
akrun
  • What about parallel computing? – SteveS May 13 '19 at 13:47
  • @SteveS For that, you may need to use `parallel` or `snow` packages. In general, if the computation is not that iterative, you wouldn't get the benefit of parallel because there is a cost to do it – akrun May 13 '19 at 14:59
  • I think I can use `multidplyr` and split by one of the columns. – SteveS May 14 '19 at 08:28
  • @akrun If you can share an example using `snow` and `parallel`, that would be great. – SteveS May 14 '19 at 08:29
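
For the parallel question raised in these comments, a minimal sketch with the base `parallel` package could look like the following. It assumes a scalar stand-in function (`mifflin_scalar`, a hypothetical name) that uses the same formula without the tibble dependency, and an arbitrary choice of two workers. If the original `mifflin_equation` were used instead, the workers would also need tibble loaded, for example with `clusterEvalQ(cl, library(tibble))`. As noted above, for arithmetic this cheap the communication overhead may well outweigh any gain.

```r
library(parallel)

# Hypothetical scalar stand-in for mifflin_equation (same formula,
# no tibble dependency) so that the sketch is self-contained.
mifflin_scalar <- function(gender, w_kg, h_cm, age, activity_type) {
  coeffs <- c(sedentary = 1.2, light = 1.375, moderate = 1.55, active = 1.725)
  offset <- if (gender == "M") 5 else -161
  (10 * w_kg + 6.25 * h_cm - 5 * age + offset) * coeffs[[activity_type]]
}

# Small example data with the same column names as all_options.
temp_par <- data.frame(age = 30:35, h = 150, w = 40,
                       activity = "sedentary", gender = "M",
                       stringsAsFactors = FALSE)

cl <- makeCluster(2)  # two worker processes; adjust to your machine
# clusterMap is the parallel counterpart of mapply: one call per row,
# with the rows distributed across the workers.
res <- clusterMap(cl, mifflin_scalar,
                  temp_par$gender, temp_par$w, temp_par$h,
                  temp_par$age, temp_par$activity,
                  SIMPLIFY = TRUE, USE.NAMES = FALSE)
stopCluster(cl)
```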