I'm just getting started with R. An initial exercise was to print the lynx dataset:

> print(lynx)
Time Series:
Start = 1821 
End = 1934 
Frequency = 1 
  [1]  269  321  585  871 1475 2821 3928 5943 4950 2577  523   98  184  279  409 2285
 [17] 2685 3409 1824  409  151   45   68  213  546 1033 2129 2536  957  361  377  225
 [33]  360  731 1638 2725 2871 2119  684  299  236  245  552 1623 3311 6721 4254  687
 [49]  255  473  358  784 1594 1676 2251 1426  756  299  201  229  469  736 2042 2811
 [65] 4431 2511  389   73   39   49   59  188  377 1292 4031 3495  587  105  153  387
 [81]  758 1307 3465 6991 6313 3794 1836  345  382  808 1388 2713 3800 3091 2985 3790
 [97]  674   81   80  108  229  399 1132 2432 3574 2935 1537  529  485  662 1000 1590
[113] 2657 3396

Next, I should sort the values in increasing order. I quickly found the order() function, which looks like it should do the job (the reference answers also point to it).

So I do this:

> print(order(lynx))
  [1]  69  22  70  71  23  68  99  98  12  78 100  21  79  13  72  59  24  32  60 101
 [21]  41  42  49   1  14  40  58   2  88  51  33  30  31  73  89  80  67 102  15  20
 [41]  61  50 109  11 108  25  43   3  77 110  97  39  48  34  62  57  81  52  90   4
 [61]  29 111  26 103  74  82  91  56   5 107 112  53  44  35  54  19  87  63  38  27
 [81]  55  16 104  66  28  10 113  17  92  36  64   6  37 106  95  94  45 114  18  83
[101]  76 105  96  86  93   7  75  47  65   9   8  85  46  84

This rearranges the numbers somehow, but the result doesn't look sorted at all. Taking it from the start, it goes 69, then down to 22, and then up to 70.

What does order() do, if not sort?

  • You need `lynx[order(lynx)]` – akrun Sep 04 '21 at 21:47
  • I often think the function should be named `order_of` (returning a calculated property of the data) versus `order`, a misleading verb. – r2evans Sep 04 '21 at 22:16
  • These posts might be a useful read https://stackoverflow.com/questions/54017285/difference-between-sort-rank-and-order and https://stackoverflow.com/questions/12289224/rank-and-order-in-r – Ronak Shah Sep 05 '21 at 00:54

1 Answer

`order` returns the positions (indices) of the elements in sorted order, not the sorted values themselves. Use that index to reorder the values, and assign back to the same object with `[]` to keep its structure intact:

lynx[] <- lynx[order(lynx)]

Output:

> lynx
Time Series:
Start = 1821 
End = 1933 
Frequency = 1 
  [1]   80   91   97   99  125  140  157  217  265  358  366  387  389  413  417  450  490  519  527  539  599  624  715  723  775  905  905  911
 [29]  927  950  976  983  985  996 1087 1142 1152 1161 1175 1300 1318 1322 1330 1470 1486 1555 1557 1588 1657 1666 1666 1674 1712 1719 1743 1775
 [57] 1839 1942 1989 1990 1997 2006 2028 2052 2084 2140 2153 2163 2197 2234 2263 2322 2404 2502 2519 2530 2534 2545 2548 2585 2804 2857 2864 2866
 [85] 2870 2941 2949 2976 3058 3129 3136 3139 3154 3175 3192 3197 3201 3224 3294 3438 3452 3463 3504 3556 3623 3647 3661 3712 3770 3817 3870 3896
[113] 3922

Alternatively, use `sort` and then reassign:

lynx[] <- sort(lynx)
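
To make the relationship between `order` and `sort` concrete, here is a minimal sketch on a small vector. The vector `x` and the name `idx` are made up for illustration and are not part of the lynx data.

```r
# Hypothetical toy vector, used only for illustration
x <- c(30, 10, 20)

# order() returns the positions of the smallest, second smallest, ... value
idx <- order(x)
idx        # 2 3 1

# Indexing with those positions gives the sorted values
x[idx]     # 10 20 30

# For a plain vector this is the same result as sort()
identical(x[idx], sort(x))   # TRUE
```

Read `idx` as "the smallest value sits at position 2, the next at position 3, the largest at position 1".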

As for what the position vector returned by `order` means, consider the simulated data created under "data" at the end of this answer. The first element of `order(lynx)` is 70, i.e. the smallest value, which comes first in the sorted data, sits at the 70th position of the original time series:

> order(lynx)
  [1]  70  26  35  38  42  19 103  84 111  75 109  74  10  63  29 112  94  85  80  31  15  49  39  88  17  46 101  69  24  98  54  20   8  27 113  56
 [37]  68 100   3 108   4  72  47  73  93  59  43  34  86  50 104 105  36  51 110  92  78  53  62  16 102  81   7  77  65  21  40  87  44  41  96  76
 [73]  25  83  28   2  95   5  23  99  37  11  57  64   1  45  52  18   6  91  30  13  97 106  67  55  58  14  60  32  48   9  22  82 107  90  89  71
[109]  33  66  79  12  61
> lynx[70] # the value at the 70th position is 80, the smallest
[1] 80
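
The same reading applies to the output shown in the question, which starts with 69. A short sketch, assuming the built-in dataset is reached as `datasets::lynx` so that it is not masked by the simulated `lynx` object defined in this answer (the names `x` and `idx` are only for illustration):

```r
# Built-in lynx data, referenced explicitly so a user-defined `lynx` does not mask it
x <- datasets::lynx

idx <- order(x)

idx[1]         # 69: position of the smallest value
x[idx[1]]      # 39: the smallest value itself
min(x)         # 39
which.min(x)   # 69, the same position order() reports first
```

So 69, 22, 70, ... in the question's output are positions in the original series, not lynx counts.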

Data:

set.seed(24)
lynx <- ts(sample(80:4000, 113, replace = TRUE), start = c(1821), frequency = 1)
akrun
  • It's a bad idea to use `lynx[] <-` in the two examples, because the sorted values are no longer a time series. It makes sense to lose the time series attributes. – user2554330 Sep 05 '21 at 00:07
  • @user2554330 Have you looked at `str(lynx)`? It shows `Time-Series [1:113]`, so it is still a time series after the sort. I don't know what you are talking about – akrun Sep 05 '21 at 18:08
  • I am saying that the values in sorted order don't represent a time series, so you shouldn't tell the user they are by keeping it represented as one. It doesn't make any sense to say there was a value of 80 in 1821 as your sorted version says, because sorting the values doesn't make them happen in a different year. – user2554330 Sep 05 '21 at 18:16
  • @user2554330 that should be a comment to the OP rather than to the answer because I am just following what the OP wanted. – akrun Sep 05 '21 at 18:17
  • Many assumptions could be made, just as you made one, e.g. that the OP's time series had unordered values but correct time stamps, etc. – akrun Sep 05 '21 at 18:18
  • I think you are seeing something in the question that isn't there. It's a question about why `order(lynx)` doesn't put the values in order. As far as I can see, the fact that `lynx` was a time series is incidental. – user2554330 Sep 05 '21 at 18:20
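
The point debated in these comments can be checked directly. A small sketch, assuming the built-in `datasets::lynx` (the names `x`, `sorted_plain`, and `kept_ts` are illustrative only): plain `sort()` on a `ts` object returns an ordinary numeric vector, while assigning through `[]` keeps the time series attributes.

```r
x <- datasets::lynx

# sort() (like x[order(x)]) drops the time series attributes
sorted_plain <- sort(x)
is.ts(sorted_plain)   # FALSE

# Assigning through [] replaces the values but keeps the ts structure
kept_ts <- x
kept_ts[] <- sort(x)
is.ts(kept_ts)        # TRUE
start(kept_ts)        # still 1821 1, although the values no longer match the years
```

Which form is appropriate depends on whether the sorted values are still meant to be read as a series over time.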