I have a vector with varying length, which can sometimes be of length 1.

I would like to sample from this vector such that if its length is 1 it always samples that 1 number.

sample() won't do this: when the vector is a single number x (with x >= 1), it samples from 1:x instead of returning x.

Henrik
user1723765
  • The answer to this question is in the help file for `sample`: see `?sample` and read the 'Details' section carefully. There you can find `If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x.` – Jilber Urbina Dec 21 '12 at 12:06
  • And is there any way of making it sample only that single value? – user1723765 Dec 21 '12 at 12:09

4 Answers


This is a documented feature:

If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x. Note that this convenience feature may lead to undesired behaviour when x is of varying length in calls such as sample(x).
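To see what that passage means in practice, here is a small sketch. The seed and the variable names are arbitrary choices for illustration and are not part of the question.

```r
set.seed(42)        # arbitrary seed, only so the run is repeatable

x <- 10             # a length-1 numeric vector
s1 <- sample(x)     # treated as sample(1:10): a permutation of 1..10
length(s1)          # 10, not 1

y <- c(10, 20)      # a length-2 vector
s2 <- sample(y)     # a permutation of y itself
sort(s2)            # 10 20
```

So the same call returns values from the vector when it has two or more elements, but values from 1:x when it has exactly one.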

An alternative is to write your own function to avoid the feature:

sample.vec <- function(x, ...) x[sample(length(x), ...)]
sample.vec(10)
# [1] 10
sample.vec(10, 3, replace = TRUE)
# [1] 10 10 10

Some functions with similar behavior are listed under the question "seq vs seq_along. When will using seq cause unintended results?"
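As a quick illustration of the seq case, here is a minimal sketch with a made-up length-1 vector:

```r
x <- 10                  # a length-1 numeric vector

by_value <- seq(x)       # counts from 1 up to the value of x
by_index <- seq_along(x) # counts the elements of x

length(by_value)         # 10
length(by_index)         # 1
```

As with sample, the single number is read as an upper bound rather than as the data itself.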

flodel
  • We posted the same answer at the same moment apparently. Hence I moved my answer to the other identical question and voted to close-merge both questions. – Joris Meys Dec 21 '12 at 12:29
  • Thanks @Joris. I thought my answer had a little more than yours so feel free to salvage if you agree. I would rather have closed the newer one but I don't know if there is a policy in place in such cases. – flodel Dec 21 '12 at 12:37
  • I've upvoted yours already :). I flagged the question for merging, so when that happens all answers are added together. I meant to close the other, but apparently I clicked "vote to close" in the wrong window. My mistake, sorry. – Joris Meys Dec 21 '12 at 12:40

When fed only one single number, sample works like sample.int (see ?sample). If you want to make sure it only samples from the vector you give it, you can work with indices and use this construct:

x[sample(length(x))]

This gives you the correct result regardless of the length of x, without having to add an if-condition that checks the length.

Example:

mylist <- list(
  a = 5,
  b = c(2,4),
  d = integer(0)
)

mysample <- lapply(mylist,function(x) x[sample(length(x))])

> mysample
$a
[1] 5

$b
[1] 2 4

$d
integer(0)

Note: you can replace sample with sample.int for a small speed gain.

Joris Meys

You could use this 'bugfree' redefinition of the function:

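# Masks base::sample in the environment where it is defined.
sample <- function(x, size, replace = FALSE, prob = NULL) {
  # A length-1 x is returned as-is; size, replace and prob are ignored in this case.
  if (length(x) == 1) return(x)
  base::sample(x, size = size, replace = replace, prob = prob)
}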

Test it:

> sapply(1:7, base::sample, size = 1)
[1] 1 2 2 4 4 4 4
> sapply(1:7, sample)
[1] 1 2 3 4 5 6 7
CoderGuy123

You can use resample() from the gdata package. This saves you from having to redefine resample in each new script. Because calling library(gdata) masks a few functions and prints several messages, you may prefer the double-colon notation and call it as gdata::resample().

https://www.rdocumentation.org/packages/gdata/versions/2.18.0/topics/resample
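If adding a package dependency is not an option, a one-line helper of the same shape can be defined locally. The sketch below follows the resample helper shown in the examples of ?sample. It is not the gdata implementation, and the test vectors are made up for illustration.

```r
# Local helper: sample by index so a length-1 x is never expanded to 1:x
resample <- function(x, ...) x[sample.int(length(x), ...)]

one   <- resample(10)                     # always 10
three <- resample(10, 3, replace = TRUE)  # 10 10 10
none  <- resample(integer(0))             # integer(0)
both  <- resample(c(2, 4))                # 2 4 or 4 2
```

Because the sampling is done on positions rather than values, size and replace are passed through and still apply to a length-1 vector.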

trickytank