As a newbie, I tried to define my own function to calculate the factorial. I have managed to write a function that works perfectly for single numbers.

fact1 = function(x){
    a=1 
    for(i in 1:x){
        a = a*i
    }
    return(a)
}   

factorial = function(x){
    ifelse(x>=0 & round(x) == x , fact1(as.integer(x)),"NA")
}

However, how can I improve it so that it accepts a vector and computes the factorial of each element?

SMA
Lukas Tomek

5 Answers

Adding to the lapply comment above, you can also use vapply or sapply to return a vector rather than a list:

vapply(c(1, 2, 3),
       factorial, 
       FUN.VALUE = numeric(1))

[1] 1 2 6
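
The practical difference between the two is that vapply checks every result against FUN.VALUE, while sapply simplifies whatever it gets back. A small sketch of this, using a hypothetical helper my_fact (built on prod, not the function from the question) so that the base factorial is not masked:

```r
# Hypothetical scalar helper, for illustration only
my_fact <- function(n) prod(seq_len(n))

x <- c(0, 1, 2, 3)

# sapply simplifies the results to a numeric vector
res_sapply <- sapply(x, my_fact)
# vapply returns the same values, but checks each one is a single double
res_vapply <- vapply(x, my_fact, FUN.VALUE = numeric(1))

# vapply raises an error when a result does not match FUN.VALUE
bad <- tryCatch(
  vapply(x, function(n) "NA", FUN.VALUE = numeric(1)),
  error = function(e) "type mismatch caught"
)
```

With sapply, the same character-returning function would silently produce a character vector, which is why vapply is often preferred inside other functions.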
Andrew Royal

This seems to be a perfect case for Vectorize: just use Vectorize around the definition of your factorial function to make it, well, vectorized over its input.

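fact1 = function(x){
  a=1
  # seq_len(x) is empty for x = 0, so fact1(0) returns 1
  for(i in seq_len(x)){
    a = a*i
  }
  return(a)
}

# NA_real_ (not the string "NA") keeps the result numeric
factorial = Vectorize(function(x){
  ifelse(x>=0 & round(x) == x , fact1(as.integer(x)), NA_real_)
})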

factorial(c(1,2,3))
#> [1] 1 2 6
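
One edge case is worth checking for any loop-based helper of this kind: with x = 0, the sequence 1:0 counts down (1, 0) instead of being empty, so the product becomes 0 rather than 1. A minimal sketch of the difference, where fact_colon and fact_seq are hypothetical names used only for this comparison:

```r
# Hypothetical helpers for illustration; only the loop sequence differs
fact_colon <- function(x) {
  a <- 1
  for (i in 1:x) a <- a * i         # 1:0 is c(1, 0), so the loop runs twice for x = 0
  a
}
fact_seq <- function(x) {
  a <- 1
  for (i in seq_len(x)) a <- a * i  # seq_len(0) is empty, so the loop is skipped
  a
}

fact_colon(0)  # 0, because 1 * 1 * 0
fact_seq(0)    # 1, matching 0! = 1
```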
Ramiro Magno

The other answers seem slightly overcomplicated. factorial already exists as a built-in function, and it is vectorized, so if you have some data you can simply pass it to the function. If you want negative numbers to return 0, this can be handled with a logical index. Note that I am using the built-in function factorial below rather than the one in the question.

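dat <- round(runif(1000, -10, 10))
# TRUE for the non-negative values (0! = 1, so 0 is included)
dat_non_negative <- dat >= 0
# start from all zeros, so negative inputs keep the value 0
fact_vector <- numeric(1000)
fact_vector[dat_non_negative] <- factorial(dat[dat_non_negative])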

Now, if you are simply doing this as an exercise to learn, you can vectorize the function yourself using the same idea, avoiding unnecessary for loops. Instead of looping over every element, use one loop over the factors 2, 3, ..., max(x), and at each step update all the elements that still need that factor.

R_factorial <- function(x){
  if(!is.numeric(x) || length(dim(x)))
    stop("x must be a numeric vector!")
  #create an output vector
  output <- numeric(NROW(x))
  #valid inputs are finite, non-negative whole numbers (0! = 1! = 1)
  valid <- is.finite(x) & x >= 0 & x == round(x)
  #set initial value
  output[valid] <- 1
  output[!valid] <- NA
  #Find the max factor (using only integer values, not gamma approximations)
  mx <- if(any(valid)) max(x[valid]) else 0
  #Multiply in each factor 2, ..., mx (only for the elements that still need it)
  for(i in seq_len(mx)[-1]){
    idx <- valid & x >= i
    output[idx] <- output[idx] * i
  }
  #return output
  output
}

A few things to note:

  1. Allocate the entire vector first using output <- numeric(length), where length is the number of outputs (e.g. length(x) here, or more generally NROW(x)).
  2. Use the R constant NA for non-numeric values instead of "NA". The former is a missing value that fits in a numeric vector, while the latter is a string and will turn your vector into a character vector.
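
The second point can be checked directly in the console. A short sketch, where num_vec and chr_vec are just illustrative names:

```r
num_vec <- c(1, 2, NA)     # NA is a missing value; the vector stays numeric
chr_vec <- c(1, 2, "NA")   # "NA" is a string; everything is coerced to character

class(num_vec)   # "numeric"
class(chr_vec)   # "character"

# Arithmetic still works with a real NA (the missing value propagates)
doubled <- num_vec * 2
# but it fails on the character version
res <- tryCatch(chr_vec * 2, error = function(e) "non-numeric argument")
```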

The alternative answers suggest lapply or vapply. This is more or less the same as looping over every value in the vector and calling the function on each one. As such, it is often a slow (but very readable!) way to vectorize a function, and if it can be avoided you can often gain a speed boost. For loops and the apply family are not necessarily bad, but they are in general a lot slower than truly vectorized functions. See this Stack Overflow page, which explains why in a very easily understood manner.

An additional alternative is the Vectorize function, which has also been suggested. This is a quick-and-dirty solution. In my experience it is often slower than a simple loop, and it can have unexpected side effects on functions with multiple arguments. It is not necessarily bad, though, as one often gains in readability of the underlying code.
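
To make the point about Vectorize concrete: it does not rewrite the function's internals, it wraps the function in a call to mapply, so the scalar function is still called once per element. A minimal sketch, with scalar_fact and vec_fact as hypothetical names:

```r
# Hypothetical scalar helper: works on one value at a time
scalar_fact <- function(n) prod(seq_len(n))
vec_fact <- Vectorize(scalar_fact)

x <- c(1, 2, 3, 4)
via_vectorize <- vec_fact(x)           # 1 2 6 24
via_mapply <- mapply(scalar_fact, x)   # same values, one call per element
```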


Speed comparison

The vectorized version is a lot faster than the approaches in the other answers. The microbenchmark function from the microbenchmark package shows exactly how much (note that here I am using the factorial function from the question):

microbenchmark::microbenchmark(R_factorial = R_factorial(x),
                               Vapply = vapply(x,
                                              factorial, 
                                              FUN.VALUE = numeric(1)),
                               Lapply = lapply(x, factorial),
                               Vfactorial = Vfactorial(x))
Unit: microseconds
         expr      min        lq      mean    median       uq      max neval
  R_factorial  186.525   197.287  232.2394  212.9565  241.464  395.706   100
       Vapply 2209.982  2354.596 3004.9264 2428.7905 3842.265 6165.144   100
       Lapply 2182.041  2299.092 2584.3881 2374.9855 2430.867 5061.852   100
Vfactorial(x) 2381.027 2505.4395 2842.9820 2595.3040 2669.310 5920.094   100

As one can see, R_factorial is roughly 11 times faster than vapply or lapply when comparing medians (2428.8 / 212.96 = 11.4 for vapply). This is quite a large speed boost. Further improvements could be made to speed it up even more (e.g. factorial approximation algorithms, Rcpp and other options), but for this example it might suffice.
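
As one example of a further improvement in base R, the explicit loop can be replaced by a lookup table built with cumprod. This is only a sketch under stated assumptions (finite, non-negative whole numbers are valid, everything else becomes NA); lookup_factorial is a hypothetical name, and it has not been benchmarked here.

```r
# Hypothetical alternative: build 0!, 1!, ..., max(x)! once, then index into it
lookup_factorial <- function(x) {
  stopifnot(is.numeric(x))
  output <- rep(NA_real_, length(x))
  valid <- is.finite(x) & x >= 0 & x == round(x)
  if (any(valid)) {
    # fact_table[k + 1] holds k!, for k = 0, 1, ..., max(x)
    fact_table <- c(1, cumprod(seq_len(max(x[valid]))))
    output[valid] <- fact_table[x[valid] + 1]
  }
  output
}

lookup_factorial(c(0, 1, 5, -2, 2.5, NA))  # 1 1 120 NA NA NA
```

The table costs memory proportional to max(x), which is small in practice because factorials overflow to Inf in double precision beyond 170.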

Oliver

Use the lapply function:

lapply(c(1,2,3),factorial)
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 6

R Documentation for lapply function

Stupid_Intern

You can also use the type-safe purrr::map_dbl function:

purrr::map_dbl(c(1,2,3), fact1)

[1] 1 2 6

ChristianL