8

Does anyone know how to sort a vector in R by absolute value, so (-2, 3, 1) -> (1, -2, 3) etc?

If I were doing it in python I'd create a pair of every value and its sign, sort the list of pairs by the absolute value then reapply the sign, but I'm very new to R so have no idea how to do this.

Cheers

sds
  • 58,617
  • 29
  • 161
  • 278
kezz_smc
  • 115
  • 1
  • 1
  • 5

2 Answers2

11

@Arun's method is TRT:

v[order(abs(v))]

where v is the vector to be sorted.

Notes:

  • This creates a new vector abs(v) of the same size as v. This is not very memory-efficient, but I don't think this can be avoided in R, like it is done in, e.g., Lisp: (sort #'< v :key #'abs) or Python: v.sort(key=abs).
  • This temp vector allocation is not necessarily a bad thing: you do lose memory, but you win time because the accessor key is called only N times, not N*log(N) times, which is especially important when the key is not cheap (unlike abs or a structure field).
  • To be more precise, the vector abs(v) is garbage collected very soon, but its allocation (and, especially, garbage collection) are expensive for large vectors and may be actually problematic if the memory is tight.

See also:

Community
  • 1
  • 1
sds
  • 58,617
  • 29
  • 161
  • 278
  • But that vector is ephemeral, isn't it? – Carl Witthoft Oct 17 '13 at 19:18
  • @Carl For sorting? Is that even possible? – Konrad Rudolph Oct 17 '13 at 19:20
  • @KonradRudolph I may have used the wrong term-- isn't it true that the vector `abs(v)` is not in the parent environment & will disappear on the next garbage collection? So it may affect peak RAM but won't be permanent. – Carl Witthoft Oct 17 '13 at 19:23
  • @CarlWitthoft: of course it can be GCed right away, but neither allocation nor collection are cheap for huge vectors and may fragment memory for small ones. – sds Oct 17 '13 at 19:25
  • Got it. On to a http://en.wikipedia.org/wiki/Bubble_sort method (no, not really :-) ) – Carl Witthoft Oct 17 '13 at 19:30
  • @Carl Ah, I thought you were referring to a [lazily evaluated vector](https://en.wikipedia.org/wiki/Lazy_evaluation#Working_with_infinite_data_structures), i.e. it not being in memory at all. As sds said, whether it’s garbage collected immediate or sticks around in the parent frame doesn’t matter from a performance point of view (GC’ing is quite expensive, and if no memory is available in RAM it may have to be swapped to disk, which is also slow). – Konrad Rudolph Oct 17 '13 at 19:31
  • We are talking about a *vector* here. If the OP has issues to store a vector (even of size 1e7+) and create a copy of that vector, then probably he should be thinking about data structures to store the vector efficiently in the first place (RLE or what not). With the hardware prices now-a-days, this is almost never an issue IMHO. – Arun Oct 17 '13 at 19:41
  • @Arun: I have been hearing this "HW is cheap" mantra for 15+ years; I have been struggling with RAM scarcity on my box all this time. Yes, HW is _relatively_ cheap, but that is no excuse for wasteful RAM usage, just like it is not excuse for, e.g., using quadratic instead of linearithmic algorithms. – sds Oct 17 '13 at 19:47
  • @sds, I guess my limits for compromise between speed and memory usage and more importantly spending a bit on hardware (especially when working on big-data and using R) are much different from yours (I dint advocate using quadratic time to linear time algorithms at all). – Arun Oct 17 '13 at 19:51
  • @Arun: I don't think we have anything to argue about. – sds Oct 17 '13 at 20:21
2

I found it useful to package this in a function so that I could pass a vector to it, and also so that there was the option of using other options in the order function like decreasing. It is essentially based on the existing answer.

sort_abs <- function(x, na.last = TRUE, decreasing = FALSE) {
    x[order(abs(x), na.last = na.last, decreasing = decreasing)] 
}

For example,

> sort_abs(c(-1,NA,2,-2))
[1] -1  2 -2 NA
> sort_abs(c(-1,NA,2,-2), decreasing = TRUE, na.last = FALSE)
[1] NA  2 -2 -1
Community
  • 1
  • 1
Jeromy Anglim
  • 33,939
  • 30
  • 115
  • 173