Questions tagged [reservoir-sampling]
17 questions
38
votes
4 answers
Reservoir sampling
To retrieve k random numbers from an array of undetermined size we use a technique called reservoir sampling. Can anybody briefly highlight how it happens with a sample code?

Jony
- 6,694
- 20
- 61
- 71
8
votes
3 answers
re-using random number in reservoir sampling
It was asked in relation to another question recently: Given an unknown length list, return a random item in it by scanning it only 1 time
I know you shouldn't, I just can't put my finger on a canonical explanation of why not.
Look at the example…

Will
- 73,905
- 40
- 169
- 246
8
votes
2 answers
Iterative or Lazy Reservoir Sampling
I'm fairly well acquainted with using Reservoir Sampling to sample from a set of undetermined length in a single pass over the data. One limitation of this approach, in my mind, is that it still requires a pass over the entire data set before any…

Stankalank
- 1,507
- 1
- 12
- 12
4
votes
2 answers
Initialization of Weighted Reservoir Sampling (A-Chao implementation)
I am trying to implement A-Chao version of weighted reservoir sampling as shown in https://en.wikipedia.org/wiki/Reservoir_sampling#Algorithm_A-Chao
But I found that the pseudo-code described in wiki seems to be wrong, especially on the…

zzz
- 975
- 8
- 14
2
votes
1 answer
Infinite/Lazy Reservoir Sampling in Haskell
I tried to implement a simple reservoir sampling in haskell following http://jeremykun.com/2013/07/05/reservoir-sampling/ (note that the algorithm shown is possibly semantically incorrect)
According to this: Iterative or Lazy Reservoir Sampling lazy…

CMCDragonkai
- 6,222
- 12
- 56
- 98
1
vote
1 answer
constant memory reservoir sampling, O(k) possible?
I have an input stream, of size n, and I want to produce an output stream of size k that contains distinct random elements of the input stream, without requiring any additional memory for elements selected by the sample.
The algorithm I was going to…

markt1964
- 2,638
- 2
- 22
- 54
1
vote
1 answer
reservoir sampling understanding probaility
I am having trouble understanding the probabilities involved in reservoir sampling. Below is the sample code which I have seen used in almost everywhere:
1/*
2 S has items to sample, R will contain the result, K number of items to select
…

noman pouigt
- 906
- 11
- 25
1
vote
2 answers
Reservoir Sampling unable to understand probability
To make clear following is the question:
Given an input stream of an indeterminate length, how do you return a random member of that stream (with equal probability for each), given that you're not allowed to store more than a constant number of…

AdityaKapreShrewsburyBoston
- 1,143
- 2
- 16
- 37
1
vote
1 answer
Reservoir Sampling on large Streams
I am trying to implement a reservoir sampling algorithm using java. I have N streams of data ( readings from sensors arriving at a sink node ) of unknown size. For the sake of simplicity lets assume i have one stream of unknown size.
So what one of…

Silas
- 125
- 2
- 11
1
vote
1 answer
Test Case for Weighted Reservoir Sampling
I need to implement Weighted Reservoir Sampling. I have referred to the paper mentioned in this blog. I want to write test-cases for unit testing my implementation and am confused as to how to calculate expected-probability of different element to…

Amit
- 614
- 1
- 8
- 18
1
vote
3 answers
reservoir sampling problem
This MSDN article proves the correctness of Reservoir Sampling algorithm as follows:
Base case is trivial. For the k+1st
case, the probability a given
element i with position <= k is in R
is s/k.
The probability i is replaced is the probability…

Lazer
- 90,700
- 113
- 281
- 364
1
vote
2 answers
Reservoir Sampling Algorithm
I want to understand the reservoir sampling algorithm where we select k elements out of the given set of S elements such that k <= S.
In the algorithm given on wiki:
array R[k]; // result
integer i, j;
// fill the reservoir array
for each i in…

ASingh
- 475
- 1
- 13
- 24
0
votes
0 answers
How to implement reservoir sampling in postgresql c language by uers-defined aggregate
I'd like to create a user-defined aggregate to implement reservoir sampling. Here are my codes. I want to use the bytea to store the address of the struct pointer which stores the state of the reservoir sampling. However, there are some bugs.
For…

Leo
- 1
- 1
0
votes
0 answers
Reservoir Sampling, except I can't hold the sample in memory
Is there an algorithm such as Reservoir Sampling (algorithm that randomly chooses an item from a stream such that each item is equally likely to be selected), but once an item is selected it is yielded (and therefore cannot be overridden)?
i.e, not…

Stack Overflow
- 377
- 4
- 16
0
votes
1 answer
Reservoir Sampling algorithm not wroking
I have a project for my data mining class, in which I have to code the reservoir sampling algorithm for files. The program takes as input a number k, the name of the input file, and the name of the output file to be created. The output file must…

GGre
- 3
- 1