Transforming numbers in R

Question

I have a specific need to "transform" a number in R. As an example,

A "floor" operation behave as:

138  -> 100
1233 -> 1000

A "ceiling" operation behave as:

138  -> 200
1233 -> 2000

Is there an easy way to accomplish this in R? thanks

Are the numbers going to be singular? Or is there a vector of numbers that you plan on doing this to? — Brandon Bertelsen, May 09 '13 at 14:21
what do you do with negative numbers? or smaller than 1, eg 1.2e-3? — baptiste, May 09 '13 at 14:49

sgibb · Accepted Answer · 2013-05-10T03:56:46.297

You could extract the exponent separatly:

floorEx <- function(x) {
  ex <- 10^trunc(log10(x))
  return(trunc(x/ex)*ex)
}

ceilingEx <- function(x) {
  ex <- 10^trunc(log10(x))
  return(ceiling(x/ex)*ex)
}

Examples:

floorEx(123)
# [1] 100

ceilingEx(123)
# [1] 200

ceilingEx(c(123, 1234, 12345))
# [1]   200  2000 20000

EDIT:

using trunc instead of floor and integrate old ex function (ex <- function(x)floor(log10(x))) to speedup the calculation a little bit
add benchmark to compare against @eddi's floorR

benchmark:

## provided by @eddi
floorR <- function(x) {r <- signif(x, 1); r - (r > x) * 10^trunc(log10(x))}

library("microbenchmark")

x <- 123; microbenchmark(floorEx(x), floorR(x), signif(x), times=1e4)
# Unit: nanoseconds
#        expr  min   lq median     uq    max neval
#  floorEx(x) 2182 2414   2521 2683.0 704190 10000
#   floorR(x) 2894 3150   3278 3505.5  22260 10000
#   signif(x)  372  472    507  556.0  10963 10000

x <- 1:1000; microbenchmark(floorEx(x), floorR(x), signif(x), times=1e2)
# Unit: microseconds
#        expr     min       lq   median       uq      max neval
#  floorEx(x) 100.560 101.2460 101.6945 115.6385  818.895   100
#   floorR(x) 354.848 355.4705 356.0420 375.9210 1074.582   100
#   signif(x) 114.608 115.2120 115.4695 119.1805  186.738   100

@G.Grothendieck -- But that would give wrong answers for, e.g., floorEx() with any numbers in `c(750:799, 851:899, 950:999)`, among many others... — Josh O'Brien, May 09 '13 at 16:14
@Josh, Good point. Combine this answer with Matthew's answer instead and replace the first line of each ...Ex function with `expo <- nchar(x) - 1` . — G. Grothendieck, May 09 '13 at 19:15
@G.Grothendieck -- For the record, here's what I'd have posted if there weren't already 4 answers up when I got here :) `lx <- log10(x); floor(10^(lx %% 1)) * 10^(lx %/% 1)`. — Josh O'Brien, May 09 '13 at 19:38
it's interesting that `ifelse` is so slow (it's actually a lot slower than your bench on my pc) — eddi, May 09 '13 at 20:53

score 8 · Answer 2 · answered May 09 '13 at 14:25

8

It does not directly answer your question, but you can also take a look at signif :

R> x <- 138
R> signif(x,1)
[1] 100
R> x <- 1712
R> signif(x,1)
[1] 2000

answered May 09 '13 at 14:25

juba

47,631
14
113
118

score 5 · Answer 3 · answered May 09 '13 at 14:48

5

Another option:

floor2 <- function(x) {
    mag <- 10^(nchar(round(x))-1)
    (x %/% mag) * mag
}

ceil2 <- function(x) {
    mag <- 10^(nchar(round(x))-1)
    ((x + mag) %/% mag) * mag
}

answered May 09 '13 at 14:48

Matthew Plourde

43,932
7
96
113

Nice approach. My favorite thus far +1 – Tyler Rinker May 09 '13 at 14:58

score 1 · Answer 4 · answered May 09 '13 at 14:43

I played with regexing and the ceiling floor functions to get this one:

ceil <- function(x) {
    ceiling(as.numeric(sub("([[:digit:]])", "\\1.", x))) * (10^(nchar(x)-1))
}

flr <- function(x) {
    floor(as.numeric(sub("([[:digit:]])", "\\1.", x))) * (10^(nchar(x)-1))
}


ceil(1233)
ceil(138)
flr(1233)
flr(138)


## > ceil(1233)
## [1] 2000
## > ceil(138)
## [1] 200
## > flr(1233)
## [1] 1000
## > flr(138)
## [1] 100

eddi · Answer 5 · 2013-05-09T21:16:18.043

Here's a different take using @juba's suggestion. To get from a rounded answer to the floor or ceil we simply need to correct it a little bit:

floorR = function(x) {
  rounded = signif(x, 1);

  rounded - (rounded > x) * 10^trunc(log10(x))
}

ceilR = function(x) {
  rounded = signif(x, 1);

  rounded + (rounded < x) * 10^trunc(log10(x))
}

edit2: after vectorizing, the functions are a little bit slower (see edit history for non-vectorized versions). They are still fast for small vectors, but don't scale as well as @sgibb's solution (partly because signif doesn't scale that well):

x = 156; microbenchmark(floorEx(x), flr(x), floor2(x), signif(x), floorR(x), times = 10000)
#Unit: nanoseconds
#       expr   min     lq median     uq     max neval
# floorEx(x)  4008   8348  10018  12021  158934 10000
#     flr(x) 84810 121204 135896 141571 6708248 10000
#  floor2(x) 32055  46078  51086  54091  360606 10000
#  signif(x)     0   1002   1336   1671   86813 10000
#  floorR(x)  3006   6679   8348  10017  207683 10000

x = c(1:1000); microbenchmark(floorEx(x), signif(x), floorR(x), times = 100)
#Unit: microseconds
#       expr     min       lq  median       uq     max neval
# floorEx(x) 125.879 157.4315 158.934 161.4385 243.742   100
#  signif(x) 147.581 216.6975 217.365 220.5375 395.998   100
#  floorR(x) 252.758 360.6055 362.275 366.4485 619.373   100

Sadly your `floorR` isn't vectorized (`floorR(c(123, 156)) # 100, 200`). I add a vectorized version of your `floorR` to my answer (for comparison). — sgibb, May 09 '13 at 20:37
@sgibb true; in your benchmarks you should probably modify `floorR` to use `trunc` as well, to make the comparison fair — eddi, May 09 '13 at 20:48
@sgibb ok - it's weird then that on my pc it's always better than `floorEx` (for a single number), but not so on yours — eddi, May 09 '13 at 21:01
Great improvement. On my laptop (Intel Core i5-2520M CPU @ 2.50GHz; 4 Gb RAM; R 2.15.1) `floorEx` is always faster than `floorR`. — sgibb, May 10 '13 at 03:57

Transforming numbers in R

5 Answers5

Linked