
In Mathematica there is the command Clip[x, {min, max}], which gives x for min<=x<=max, min for x<min, and max for x>max; see

http://reference.wolfram.com/mathematica/ref/Clip.html (mirror)

What would be the fastest way to achieve this in R? Ideally it should be a listable (vectorized) function that works on a single value, a vector, a matrix, or a data frame.

TylerH
Tom Wenseleers

4 Answers


Rcpp has clamp for this:

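library(Rcpp)  # provides cppFunction()

# clamp(a, x, b) is Rcpp sugar: limits each element of x to [a, b]
cppFunction('NumericVector rcpp_clip( NumericVector x, double a, double b){
    return clamp( a, x, b ) ;
}')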

Here is a quick benchmark showing how it performs against the other methods discussed:

pmin_pmax_clip <- function(x, a, b) pmax(a, pmin(x, b) )
ifelse_clip <- function(x, a, b) {
  ifelse(x <= a,  a, ifelse(x >= b, b, x))
}
operations_clip <- function(x, a, b) {
  a + (x-a > 0)*(x-a) - (x-b > 0)*(x-b)
}
x <- rnorm( 10000 )
library(microbenchmark)

microbenchmark( 
  pmin_pmax_clip( x, -2, 2 ), 
  rcpp_clip( x, -2, 2 ), 
  ifelse_clip( x, -2, 2 ), 
  operations_clip( x, -2, 2 )
)
# Unit: microseconds
#                        expr      min        lq   median        uq       max
# 1     ifelse_clip(x, -2, 2) 2809.211 3812.7350 3911.461 4481.0790 43244.543
# 2 operations_clip(x, -2, 2)  228.282  248.2500  266.605 1120.8855 40703.937
# 3  pmin_pmax_clip(x, -2, 2)  260.630  284.0985  308.426  336.9280  1353.721
# 4       rcpp_clip(x, -2, 2)   65.413   70.7120   84.568   92.2875  1097.039    
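
Before comparing timings, it can be worth confirming that the pure-R variants agree with each other. The following is a minimal sketch, not part of the original benchmark: the seed is arbitrary, and `all.equal()` is used rather than `identical()` on the assumption that the arithmetic in `operations_clip()` can differ from the other results by floating-point rounding.

```r
# Same definitions as in the benchmark, repeated so this runs on its own.
pmin_pmax_clip <- function(x, a, b) pmax(a, pmin(x, b))
ifelse_clip <- function(x, a, b) ifelse(x <= a, a, ifelse(x >= b, b, x))
operations_clip <- function(x, a, b) {
  a + (x - a > 0) * (x - a) - (x - b > 0) * (x - b)
}

set.seed(1)  # arbitrary seed, only for reproducibility of this check
x <- rnorm(10000)
ref <- pmin_pmax_clip(x, -2, 2)

# all.equal() compares within a small numeric tolerance
stopifnot(
  isTRUE(all.equal(ref, ifelse_clip(x, -2, 2))),
  isTRUE(all.equal(ref, operations_clip(x, -2, 2)))
)
```
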
Romain Francois
  • Those times are pretty rockin'. – Dirk Eddelbuettel Dec 13 '12 at 23:52
  • Just pasting the lines for the clamp code in a console session is obviously not what you intended us Rcpp virgins to be doing. – IRTFM Dec 13 '12 at 23:53
  • Almost. See my use of `cppFunction` in my edit. (but you need the current devel version of `Rcpp` because `clamp` has been fixed since the last release). – Romain Francois Dec 13 '12 at 23:56
  • Very cool. I'm shocked and baffled at how bad the `operations_clip()` times are .... *sometimes*. Any ideas why the **max** values are quite so much larger than the **min** values for all of these functions? – Josh O'Brien Dec 14 '12 at 03:42
  • I'm pretty sure this is about memory allocation. `operations_clip` performs a lot of them, so my guess is that sometimes it takes longer. – Romain Francois Dec 14 '12 at 07:39

Here's a method with nested pmin and pmax setting the bounds:

 fenced.var <- pmax( LB, pmin( var, UB))

It will be difficult to find a method that is faster. Wrapped in a function that defaults to a range of 3 and 7:

fence <- function(vec, UB=7, LB=3) pmax( LB, pmin( vec, UB))

> fence(1:10)
 [1] 3 3 3 4 5 6 7 7 7 7
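
If the input is a matrix, the argument order inside `pmin()`/`pmax()` matters, because both functions copy attributes such as `dim` from their first argument. A minimal sketch of a variant that puts the data first in each call; `fence_dim` is a hypothetical name and is not part of the answer above:

```r
# Hypothetical helper (not from the original answer): the data is the first
# argument of both pmax() and pmin(), so its attributes (e.g. dim) carry over.
fence_dim <- function(x, LB = 3, UB = 7) pmin(pmax(x, LB), UB)

m <- matrix(1:10, nrow = 2)
fence_dim(m)      # result keeps the 2 x 5 matrix shape
fence_dim(1:10)   # plain vectors work as before
```
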
IRTFM
  • Very elegant - that's great! – Tom Wenseleers Dec 13 '12 at 22:51
  • I use this one a lot. I have a large dataset with several variables that are not plausibly real below 0 and that should be sensibly constrained at the high end as well. The real trick is remembering to set the max with `pmin` and set the min with `pmax`. – IRTFM Dec 13 '12 at 22:52
  • Your 'It will be difficult to find a method that is faster' obviously motivated me to have a look. – Romain Francois Dec 13 '12 at 23:22
  • Yeah. It still wins the compactness prize ... so far. – IRTFM Dec 13 '12 at 23:57
  • I think the function's arguments UB and LB should be reversed. I suspect `fence <- function(vec, LB=3, UB=7) pmax( LB, pmin( vec, UB))` is really what you're after – Matthew Walker Oct 09 '13 at 02:43
  • @jf328: You might want to present a counter-example since my experiments showed it to be working in what I thought was a perfectly reasonable manner. `pmin( pmax( matrix(1:10,2), 4), 6)` Probably best to do this with a new question. – IRTFM Nov 30 '16 at 17:02
  • @42- interesting. You have to put the scalars (4 and 6) as the second argument. If you put them as the first argument, as in your answer, then it returns a vector, not a matrix. And the pmax/pmin documentation does say it only returns a vector. – jf328 Dec 01 '16 at 09:35

Here's one function that will work for both vectors and matrices.

myClip <- function(x, a, b) {
    ifelse(x <= a,  a, ifelse(x >= b, b, x))
}

myClip(x = 0:10, a = 3,b = 7)
#  [1] 3 3 3 3 4 5 6 7 7 7 7

myClip(x = matrix(1:12/10, ncol=4), a=.2, b=0.7)
#      [,1] [,2] [,3] [,4]
# [1,]  0.2  0.4  0.7  0.7
# [2,]  0.2  0.5  0.7  0.7
# [3,]  0.3  0.6  0.7  0.7

And here's another:

myClip2 <- function(x, a, b) {
    a + (x-a > 0)*(x-a) - (x-b > 0)*(x-b)
}

myClip2(-10:10, 0, 4)
# [1] 0 0 0 0 0 0 0 0 0 0 0 1 2 3 4 4 4 4 4 4 4
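
The question also asks about data frames. One possible approach, sketched here under the assumption that every column is numeric, is to clip column by column and assign back into `df[]` so the data frame structure is kept. `clip_df` is a hypothetical helper name, not something from the answers:

```r
# Hypothetical helper: clip each column of a data frame to [a, b].
# Assumes all columns are numeric; other column types are not handled.
clip_df <- function(df, a, b) {
  # df[] <- ... replaces the columns but keeps the data.frame class and names
  df[] <- lapply(df, function(col) pmin(pmax(col, a), b))
  df
}

df <- data.frame(u = c(-1, 0.5, 2), v = c(0.2, 3, -4))
clipped <- clip_df(df, 0, 1)
clipped
```
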
Josh O'Brien

I believe that would be clamp() from the raster package.

library(raster)
clamp(x, lower=-Inf, upper=Inf, ...)
wordsforthewise