8

I have a list and I want to randomly select a node from it. As it's not an array I don't know in advance its length. Is there a way to select a node randomly (with uniform distribution) without having to scan the whole list (in the worst case) twice (i.e. to get its length and to reach the selected node after having randomly selected its position)?

Here's the code I use for my list:

struct mynode {
    in_addr_t paddr;
    struct mynode *prev, *next;
};

struct mylist {
    struct mynode *first, *last;
    char *name;
};
Robb1
  • 4,587
  • 6
  • 31
  • 60
  • 1
    Easy way: maintain a counter, while adding a(ny) node, update it. – Sourav Ghosh May 04 '17 at 09:11
  • 3
    Reservoir search (aka Knuth's algorithm R, though it was not invented by Knuth) But you still have to adapt it to your list or tree. And it will need exactly one scan over the entire structure. – joop May 04 '17 at 09:11
  • Thanks for your answers. @joop can you send me a link? I can find only "Knuth's algorithm X" on Google. – Robb1 May 04 '17 at 09:14
  • 4
    https://en.wikipedia.org/wiki/Reservoir_sampling – Ilja Everilä May 04 '17 at 09:16
  • 1
    Oops, not *search*, but **sampling** GIYF – joop May 04 '17 at 09:17
  • @IljaEverilä reservoir sampling needs to scan the list to the end, so it isn't quite clear if it answers the question. – n. m. could be an AI May 04 '17 at 09:21
  • 1
    This sounds contradictory; if the random process decides "the last element", you can't get there without traversing the entire list. If it therefore "can't" select that, then the selction is not homogeneous, right? I think the existance of reservoir search is the answer, that's what you need to do. – unwind May 04 '17 at 09:28
  • **Reservoir sampling** was definitely the answer I was looking for! Thanks guys – Robb1 May 04 '17 at 09:30

1 Answers1

1

As advised in the comments by joop and Ilja Everilä, I implemented the reservoir sampling in C.

struct mynode *select_mynode(struct mylist *list) {
    struct mynode *list_iter = list->first; // An iterator to scan the list
    struct mynode *sel_node = list_iter; // The node that will be selected
    int count = 2; // Temporary size of the list
    srand((unsigned int) time(NULL)); // Seed
    // Select a random element in O(n)
    while (list_iter->next != NULL) {
        if (rand() % count == (count - 1))
            sel_node = list_iter->next;
        list_iter = list_iter->next;
        count++;
    }
    return sel_node;
}

Note: for further information about random selection, see here.

Community
  • 1
  • 1
Robb1
  • 4,587
  • 6
  • 31
  • 60
  • 2
    That is about it. Minor nitpick: `if (rand() % count == (count - 1))` is not *unbiased* uniform random. – joop May 04 '17 at 11:15
  • Thanks again for your tip! I just added a note to the answer. Editing the code would have decreased the readability, I think – Robb1 May 04 '17 at 13:28
  • Given how atrocious `rand` generally is, complaining about the frankly trivial bias in the modulo is misplaced. – Veedrac May 07 '17 at 11:18