0

I want to generate random number in sorted order. I wrote below code:

void CreateSortedNode(pNode head)
{
    int size = 10, last = 0;
    pNode temp;
    while(size-- > 0) {
        temp = (pnode)malloc(sizeof(struct node));
        last += (rand()%10);
        temp->data = last;//randomly generate number in sorted order
        list_add(temp);
    }
}

[EDIT:] Expecting number will be generated in increased or decreased order: i.e {2, 5, 9, 23, 45, 68 }

int main()
{
int size = 10, last = 0;
        while(size-- > 0) {
            last += (rand()%10);
            printf("%4d",last);
        }
return 0;
}

Any better idea?

iDebD_gh
  • 364
  • 5
  • 18
  • 3
    do not cast the return of `malloc()`, and you know this is not completely random, you are restricting the next number to be within `+10` of the last.. – Haris Oct 20 '14 at 13:57
  • 2
    What does "generate random number in sorted order" mean? What output do you expect? – interjay Oct 20 '14 at 13:59
  • 7
    First generate random numbers and then sort it. Random means Random :) – ani627 Oct 20 '14 at 13:59
  • list.cpp: In function 'int main(int, char**)': list.cpp:40: error: invalid conversion from 'void*' to 'node*' This error will come if I remove casting... – iDebD_gh Oct 20 '14 at 14:01
  • The reason for that is that your distribution will not be truly random. Sorry, generating and then sorting will take longer, but if you want random numbers, you're going to have to generate them as truly random. – Topological Sort Oct 20 '14 at 14:01
  • I searched alot with this qus but got no better Answer and expecting one good algo for it...Edited this question for better understanding.... – iDebD_gh Oct 20 '14 at 14:03
  • You might want to generate an array of random numbers one at a time then sort that array, this will then result in having an array with ordered numbers that was initially randomly generated. Is this type of solution essentially what you are looking for? – shuttle87 Oct 20 '14 at 14:04
  • @shuttle87 No, if my array size is too long it will take much more time, want to find a algo that will create number increasing order but randomly. – iDebD_gh Oct 20 '14 at 14:09
  • 3
    You *can* draw samples directly from the space of sorted arrays, but to do it yo would need to employ something like Metropolis-Hastings, and it will be much more expensive than just randomly sampling and then sorting. – ely Oct 20 '14 at 14:10
  • Why people are down-voting this question(just for fun :(? ) Is it not clear? is it previously asked? is it not constructive? – iDebD_gh Oct 20 '14 at 14:24
  • 1
    @iDebD_gh, `.cpp` is the extension for C++ code, not C. These are two different languages. Rename your file to `something.c` and run a real C compiler on it. – Jens Gustedt Oct 20 '14 at 14:25
  • Not my downvote, but don't think that people are doing this for fun. It is not a good fit for SO, to "ask for ideas". This is not a concrete technical question. Voting to close. – Jens Gustedt Oct 20 '14 at 14:28
  • @JensGustedt I am't telling you :)...by the way working on EMS's answer. Thanks for your reply RE:Previously it was in a cpp file, later I copied in a c for quick finding(built using GCC compiler). – iDebD_gh Oct 20 '14 at 14:31
  • What do you expect the function to do when `last` has the value `INT_MAX`? or does the function always going to use `size 10` and `%10` to insure the summation never gets that far? Otherwise your code looks fine. (note `rand()%10` imparts a small bias.) – chux - Reinstate Monica Oct 20 '14 at 14:40

3 Answers3

2

Solved back in 1979 (by Bentley and Saxe at Carnegie-Mellon):

https://apps.dtic.mil/dtic/tr/fulltext/u2/a066739.pdf

The solution is ridiculously compact in terms of code too!

Their paper is in Pascal, I converted it to Python so it should work with any language:

from random import random

cur_max=100                       #desired maximum random number
n=100                             #size of the array to fill
x=[0]*(n)                         #generate an array x of size n

for i in range(n,0,-1):
  cur_max=cur_max*random()**(1/i) #the magic formula
  x[i-1]=cur_max                  

print(x)                          #the results

Enjoy your sorted random numbers...

Caleb Fuller
  • 209
  • 2
  • 5
1

Without any information about sample size or sample universe, it's not easy to know if the following is interesting but irrelevant or a solution, but since it is in any case interesting, here goes.

The problem:

In O(1) space, produce an unbiased ordered random sample of size n from an ordered set S of size N: <S1,S2,…SN>, such that the elements in the sample are in the same order as the elements in the ordered set.

The solution:

  1. With probability n/|S|, do the following:

    • add S1 to the sample.

    • decrement n

  2. Remove S1 from S

  3. Repeat steps 1 and 2, each time with the new first element (and size) of S until n is 0, at which point the sample will have the desired number of elements.

The solution in python:

from random import randrange

# select n random integers in order from range(N)
def sample(n, N):
  # insist that 0 <= n <= N
  for i in range(N):
    if randrange(N - i) < n:
      yield i
      n -= 1
      if n <= 0:
        break

The problem with the solution:

It takes O(N) time. We'd really like to take O(n) time, since n is likely to be much smaller than N. On the other hand, we'd like to retain the O(1) space, in case n is also quite large.

A better solution (outline only)

(The following is adapted from a 1987 paper by Jeffrey Scott Vitter, "An Efficient Algorithm for Sequential Random Sampling". See Dr. Vitter's publications page.. Please read the paper for the details.)

Instead of incrementing i and selecting a random number, as in the above python code, it would be cool if we could generate a random number according to some distribution which would be the number of times that i will be incremented without any element being yielded. All we need is the distribution (which will obviously depend on the current values of n and N.)

Of course, we can derive the distribution precisely from an examination of the algorithm. That doesn't help much, though, because the resulting formula requires a lot of time to compute accurately, and the end result is still O(N).

However, we don't always have to compute it accurately. Suppose we have some easily computable reasonably good approximation which consistently underestimates the probabilities (with the consequence that it will sometimes not make a prediction). If that approximation works, we can use it; if not, we'll need to fallback to the accurate computation. If that happens sufficiently rarely, we might be able to achieve O(n) on the average. And indeed, Dr. Vitter's paper shows how to do this. (With code.)

rici
  • 234,347
  • 28
  • 237
  • 341
0

Suppose you wanted to generate just three random numbers, x, y, and z so that they are in sorted order x <= y <= z. You will place these in some C++ container, which I'll just denote as a list like D = [x, y, z], so we can also say that x is component 0 of D, or D_0 and so on.

For any sequential algorithm that first draws a random value for x, let's say it comes up with 2.5, then this tells us some information about what y has to be, Namely, y >= 2.5.

So, conditional on the value of x, your desired random number algorithm has to satisfy the property that p(y >= x | x) = 1. If the distribution you are drawing from is anything like a common distribution, like uniform or Guassian, then it's clear to see that usually p(y >= x) would be some other expression involving the density for that distribution. (In fact, only a pathological distribution like a Dirac Delta at "infinity" could be independent, and would be nonsense for your application.)

So what we can speculate with great confidence is that p(y >= t | x) for various values of t is not equal to p(y >= t). That's the definition for dependent random variables. So now you know that the random variable y (second in your eventual list) is not statistically independent of x.

Another way to state it is that in your output data D, the components of D are not statistically independent observations. And in fact they must be positively correlated since if we learn that x is bigger than we thought, we also automatically learn that y is bigger than or equal to what we thought.

In this sense, a sequential algorithm that provides this kind of output is an example of a Markov Chain. The probability distribution of a given number in the sequence is conditionally dependent on the previous number.

If you really want a Markov Chain like that (I suspect that you don't), then you could instead draw a first number at random (for x) and then draw positive deltas, which you will add to each successive number, like this:

  1. Draw a value for x, say 2.5
  2. Draw a strictly positive value for y-x, say 13.7, so y is 2.5 + 13.7 = 16.2
  3. Draw a strictly positive value for z-y, say 0.001, so z is 16.201
  4. and so on...

You just have to acknowledge that the components of your result are not statistically independent, and so you cannot use them in an application that relies on statistical independence assumptions.

ely
  • 74,674
  • 34
  • 147
  • 228
  • That's a very plausible-sounding argument but it's not correct unless you believe that no sample contains statistically independent values. Otherwise, I could use the algorithm: "Choose a sample then sort it and present it in order", and your argument would suddenly apply to the post-sorted sample; if it didn't also apply to the presorted sample, that would be very weird. Anyway, it is not difficult to generate an unbiased ordered sample; it is only difficult to do it quickly. – rici Oct 21 '14 at 03:39
  • It wouldn't apply to the separate random variables that are the components of the list of values, which would be independent. It *would* apply *after* you have sorted, because then you are treating the *ensemble* of values as *the* random variable. The OP's question is unclear. Do you want to sample from a bunch of independent things, and then present them in order (but not treat that ordered listing itself as one realization of a random thing)? Or, do you want to treat the vector as a random thing whose distribution is partly defined by this ordering property? They are different things! – ely Oct 21 '14 at 04:06
  • It's the witching hour here and my eyes aren't up to continuing this interesting discussion. You can read my answer (and linked paper) and see if you believe it provides an independent sample or not. – rici Oct 21 '14 at 04:35
  • Now that I made it through your comment, my interpretation is that the sample is to be considered a set, so that order is irrelevant and it is a single sample from the set of n-sized subsets of the universe. In this case, presenting the sample in the same order as (some canonical presentation of) the universe does not affect the sample at all. (Alternatively, the sample could be considered a representative from an equivalence class of ordered sets, with similar semantics.) That might not be what OP is asking, but it is what I answered :) – rici Oct 21 '14 at 05:06