7

?sort states that the partial argument may be NULL or a vector of indices for partial sorting.

I tried:

x <- c(1,3,5,2,4,6,7,9,8,10)
sort(x)
## [1]  1  2  3  4  5  6  7  8  9 10
sort(x, partial=5)
## [1]  1  3  4  2  5  6  7  9  8 10
sort(x, partial=2)
## [1]  1  2  5  3  4  6  7  9  8 10
sort(x, partial=4)
## [1]  1  2  3  4  5  6  7  9  8 10

I am not sure what partial means when sorting a vector.

gagolews
  • 12,836
  • 2
  • 50
  • 75
useR
  • 3,062
  • 10
  • 51
  • 66
  • http://en.wikipedia.org/wiki/Partial_sorting – Alex Reynolds May 10 '14 at 08:51
  • so in breif, for sort(x, partial=m) means arrange the vector x such that its first m position contain the m smallest elements (in ascending order.) <<< not necessary in ascending order? – useR May 10 '14 at 09:08
  • The meaning of `partial` in R's `sort` is quite different from the one presented in an Wikipedia article, see below. – gagolews May 10 '14 at 09:45

2 Answers2

10

As ?sort states,

If partial is not NULL, it is taken to contain indices of elements of the result which are to be placed in their correct positions in the sorted array by partial sorting.

In other words, the following assertion is always true:

 stopifnot(sort(x, partial=pt_idx)[pt_idx] == sort(x)[pt_idx])

for any x and pt_idx, e.g.

 x <- sample(100) # input vector
 pt_idx <- sample(1:100, 5) # indices for partial arg

This behavior is different from the one defined in the Wikipedia article on partial sorting. In R sort()'s case we are not necessarily computing k smallest elements.

For example, if

print(x)
##  [1]  91  85  63  80  71  69  20  39  78  67  32  56  27  79   9  66  88  23  61  75  68  81  21  90  36  84  11   3  42  43
##  [31]  17  97  57  76  55  62  24  82  28  72  25  60  14  93   2 100  98  51  29   5  59  87  44  37  16  34  48   4  49  77
##  [61]  13  95  31  15  70  18  52  58  73   1  45  40   8  30  89  99  41   7  94  47  96  12  35  19  38   6  74  50  86  65
##  [91]  54  46  33  22  26  92  53  10  64  83

and

pt_idx
## [1]  5 54 58 95  8

then

sort(x, partial=pt_idx)
##  [1]   1   3   2   4   5   6   7   8  11  12   9  10  13  15  14  16  17  18  23  30  31  27  21  32  36  34  35  19  20  37
## [31]  38  33  29  22  26  25  24  28  39  41  40  42  43  48  46  44  45  47  51  50  52  49  53  54  57  56  55  58  59  60
## [61]  62  64  63  61  65  66  70  72  73  69  68  71  67  79  78  82  75  81  80  77  76  74  89  85  88  87  83  84  86  90
## [91]  92  93  91  94  95  96  97  99 100  98

Here x[5], x[54], ..., x[8] are placed in their correct positions - and we cannot say anything else about the remaining elements. HTH.

EDIT: Partial sorting may reduce the sorting time, of course if you are interested in e.g. finding only some of the order statistics.

require(microbenchmark)
x <- rnorm(100000)
microbenchmark(sort(x, partial=1:10)[1:10], sort(x)[1:10])
## Unit: milliseconds
##                           expr       min        lq    median        uq      max neval
##  sort(x, partial = 1:10)[1:10]  2.342806  2.366383  2.393426  3.631734 44.00128   100
##                  sort(x)[1:10] 16.556525 16.645339 16.745489 17.911789 18.13621   100
gagolews
  • 12,836
  • 2
  • 50
  • 75
  • Whats the reason to employ partial sort? Reduce calculation time? – useR May 10 '14 at 11:54
  • It is difficult (for me) to find any other reason (perhaps others may indicate some arguments?). See the edited answer for a benchmark. – gagolews May 10 '14 at 12:04
  • Thx again for yr answer. The reason trigger me to think this question is that i found that partial sort within median function. – useR May 11 '14 at 01:18
  • 1
    @Acqua has noted a critical correction to this answer, which I'll add as my own answer so I have access more syntax highlighting and formatting – rcorty Jul 04 '17 at 16:41
3

regarding the statement "Here x[5], x[54], ..., x[8] are placed in their correct positions", I don't think it's correct, it should be "in the result, i.e. sorted x, result[5], result[54],.....,result[8], will be placed with right values from x."

quote from R manual:

If partial is not NULL, it is taken to contain indices of elements of the result which are to be placed in their correct positions in the sorted array by partial sorting. For each of the result values in a specified position, any values smaller than that one are guaranteed to have a smaller index in the sorted array and any values which are greater are guaranteed to have a bigger index in the sorted array.

asalimih
  • 298
  • 2
  • 8
Acqua
  • 51
  • 2