I am trying to create a function which returns the min & max value of a vector.

Currently I have created 2 separate functions, but I need a single function that returns both values, with output like this:

      min       max
-2.078793  2.041260

Vector

vec <- rnorm(20)

Functions

minmax <- function(x) {
  my_min = Inf
  for (i in seq_along(x)) {
    if (x[i] < my_min) my_min = x[i]
  }
  return(min = my_min)
}


minmax <- function(x) {
  my_max = 0
  for (i in seq_along(x)) {
    if (x[i] > my_max) my_max = x[i]
  }
  return(max = my_max)
}

3 Answers

Try this function

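minmax <- function(x) {
    # Start from sentinels that any finite value will replace
    my_min <- Inf
    my_max <- -Inf
    for (i in seq_along(x)) {
        if (x[i] < my_min) my_min <- x[i]
        if (x[i] > my_max) my_max <- x[i]
    }
    # Print a summary, then return the values invisibly so callers can still use them
    cat("min , max :", my_min, " , ", my_max, "\n")
    invisible(c(min = my_min, max = my_max))
}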
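
As a usage sketch (assuming the minmax definition above; the seed and the name res are only illustrative), the invisibly returned vector can be captured and reused even though it is not auto-printed:

```r
# Same single-pass min/max as above, repeated so the snippet runs on its own
minmax <- function(x) {
  my_min <- Inf
  my_max <- -Inf
  for (i in seq_along(x)) {
    if (x[i] < my_min) my_min <- x[i]
    if (x[i] > my_max) my_max <- x[i]
  }
  cat("min , max :", my_min, " , ", my_max, "\n")
  invisible(c(min = my_min, max = my_max))
}

set.seed(123)        # illustrative seed, not from the question
vec <- rnorm(20)

res <- minmax(vec)   # prints the cat() line; the value itself is not auto-printed
res["min"]           # the named result can still be used by the caller
res["max"]
```

Assigning the result is what makes the function usable from other code; the cat() line is only a convenience for interactive use.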
Mohamed Desouky
  • You had me until ending with a `cat` ... I think a function that operates solely in side-effect by printing something to the console and returning `NULL` has very limited utility. Whether or not you `cat(.)` some output is one thing (that I am strongly opinionated against), but at a minimum don't you think that a function looking for min/max values should `return` those values so that whatever called the function can use it? – r2evans Jul 10 '22 at 02:10
  • Sure, you are right. I added the required values, but returned them invisibly, as if customising a print function. – Mohamed Desouky Jul 10 '22 at 08:58

Use the first element as the starting value for both the minimum and the maximum:

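f <- function(x) {
  # r[1] holds the running minimum, r[2] the running maximum
  r <- x[c(1L, 1L)]
  # seq_along(x)[-1L] is empty for a length-one vector, unlike 2:length(x)
  for (i in seq_along(x)[-1L]) {
    if (x[i] < r[1L]) r[1L] <- x[i]
    if (x[i] > r[2L]) r[2L] <- x[i]
  }
  r
}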
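
One caveat worth sketching for loops that start at the second element: for a length-one vector, 2:length(x) counts down to c(2, 1), so x[2] is NA and the comparison fails. The variant below (f_safe is a hypothetical name, not part of the original answer) iterates over seq_along(x)[-1L], which is empty in that case:

```r
# Hypothetical variant of the first-element approach, safe for length-one input
f_safe <- function(x) {
  r <- x[c(1L, 1L)]                 # running (min, max), both start at x[1]
  for (i in seq_along(x)[-1L]) {    # indices 2..n, or nothing when n == 1
    if (x[i] < r[1L]) r[1L] <- x[i]
    if (x[i] > r[2L]) r[2L] <- x[i]
  }
  r
}

2:length(5)          # 2 1 (counts down, which is the pitfall)
seq_along(5)[-1L]    # integer(0)
f_safe(5)            # 5 5
f_safe(c(3, -1, 7))  # -1 7
```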

However, such loops are slow in R. The same single-pass logic can be implemented in C++ using Rcpp:

rcppfun <- "
Rcpp::NumericVector myrange(Rcpp::NumericVector x) {
  std::vector<double> r(2);
  r[0] = x[0];
  r[1] = x[0];
  for (int i = 1; i < x.size(); ++i) {
    if (x[i] < r[0]) {
      r[0] = x[i];
    }
    if (x[i] > r[1]) {
      r[1] = x[i];
    }
  }
  return Rcpp::wrap(r);    
}
"

library(Rcpp)
f_rcpp <- cppFunction(rcppfun)

set.seed(42)
x <- rnorm(1e7)

stopifnot(all.equal(range(x), f(x)), all.equal(range(x), f_rcpp(x)))

f(x)
# [1] -5.522383  5.537123

f_rcpp(x)
# [1] -5.522383  5.537123

In the benchmark below, f_rcpp appears to be about twice as fast as range(). A likely reason is that base:::range.default concatenates min(x) and max(x), i.e. the data is essentially scanned twice, whereas f_rcpp uses only one loop. Note that f_rcpp also works with matrices, f_rcpp(mat); for a data frame, convert it first: f_rcpp(as.matrix(df)).
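
To make the two-scan point concrete without compiling anything, here is a base-R-only sketch. It uses range() as a stand-in for f_rcpp, and the object names two_pass, mat and df are illustrative assumptions. It checks that range() agrees with c(min(x), max(x)) and that a matrix, or a data frame converted with as.matrix(), gives the same result:

```r
set.seed(42)
x <- rnorm(1e5)                  # smaller than the benchmark vector, for speed

two_pass <- c(min(x), max(x))    # one scan for the minimum, one for the maximum
stopifnot(identical(range(x), two_pass))

mat <- matrix(x, ncol = 10)      # same values, stored as a matrix
stopifnot(identical(range(mat), two_pass))

df <- as.data.frame(mat)         # numeric data frame; convert before scanning
stopifnot(identical(range(as.matrix(df)), two_pass))
```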

microbenchmark::microbenchmark(
  f(x), f_rcpp(x), range(x), minmax(x), times=3L
)
Unit: milliseconds
      expr        min         lq       mean     median         uq        max neval cld
      f(x) 1478.53334 1478.54111 1488.13588 1478.54889 1492.93715 1507.32542     3   b
 f_rcpp(x)   53.66378   53.77902   54.28918   53.89426   54.60187   55.30949     3  a 
  range(x)   97.38360  107.07452  113.62282  116.76545  121.74244  126.71942     3  a 
 minmax(x) 1443.86547 1444.31277 1484.25910 1444.76007 1504.45592 1564.15176     3   b
jay.sf
  • Which version of `minmax`? This [answer](https://stackoverflow.com/a/37238415/1422451) shows `tail` runs slower than `x[length(x)]`! – Parfait Jul 10 '22 at 14:28
  • @Parfait I anticipated that and used the version without `tail`. – jay.sf Jul 10 '22 at 14:33

Consider head() or tail() after sorting:

minmax <- function(x) {
    sorted_vec <- sort(x)
    c(min=head(sorted_vec, 1), max=tail(sorted_vec, 1))
}

Alternatively, by indexing after sorting:

minmax <- function(x) {
  sorted_vec <- sort(x)
  c(min=sorted_vec[1], max=sorted_vec[length(sorted_vec)])
}
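
A quick check of the sorting approach against range(), as a sketch only: minmax_sorted is a hypothetical name and the seed is arbitrary. Because sort() drops NA values by default, the last element is taken with length(sorted_vec) rather than length(x):

```r
# Hypothetical standalone copy of the sort-then-index idea
minmax_sorted <- function(x) {
  sorted_vec <- sort(x)   # ascending order; NA values are removed by default
  c(min = sorted_vec[1], max = sorted_vec[length(sorted_vec)])
}

set.seed(1)               # arbitrary seed, not from the question
vec <- rnorm(20)

res <- minmax_sorted(vec)
print(res)

# Agrees with base R's range() on the same data
stopifnot(all.equal(unname(res), range(vec)))

# With a missing value, the result still reflects the non-missing elements
minmax_sorted(c(3, NA, -1, 7))
```

Sorting does more work than a single scan, so this form is mainly attractive for its brevity.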
Parfait