0

Say I am generating random numbers using Math.random(), maybe 1000 of them, and I want to have them in ascending order (at all times). Is there an algorithm of some variety that can keep them sorted the whole time, without ever having to call a sort routine? The only thing I can think of is a BST? but there might be a better way.

Some code will help:

const numContainer = {};

for(let i = 0; i < 1000; i++){
   const r = Math.random();  // I generate a new RV
   numContainer[r] = {r};   // I want to store it in order, but this isn't helping :-)
}

clearly the above is not really going to maintain any numerical order for the keys etc. I am looking to have them be sorted as I go.

Update 1: I realize the use-case might be good to know (or interesting). The use-case is a discrete-event-simulation, the smaller the random uniform variable the sooner the event, so I need to read the events in numerical order so it would be nice if they were sorted, naturally, instead of requiring a sort or what not.

Update 2: I realized binary search (with an array), won't really be much good since I will have to insert the new item and that insert will end up being O(n), so back to square one. So my guess is just some sort algorithm that works with an already sorted array, will be fastest?

Alexander Mills
  • 90,741
  • 139
  • 482
  • 817
  • 1
    I would use a binary tree. Every time u have a new number u will add it on the proper leaf and in this way all your numbers are ordered, but not in the traditional array way. – D A Nov 28 '22 at 10:29
  • 1
    B-tree, B+-tree, AVL, red-black, skiplist, ... Take your pick. – trincot Nov 28 '22 at 11:51
  • booof I wonder if there is a linear way – Alexander Mills Nov 28 '22 at 11:55
  • 1
    Can you be more specific what you're doing? What do you want to do with the sorted random numbers? Note that sorting 1000 random numbers 1000 times takes approximately no time, so it's very hard to find an "optimized" solution that has any real-world impact. – Paul Hankin Nov 28 '22 at 12:15
  • Does this answer your question? [Random numbers external sort](https://stackoverflow.com/questions/46058304/random-numbers-external-sort) – Dave Nov 28 '22 at 12:26
  • If by "a linear way" you mean a way to sort n elements in O(n) time, then, no there is not. – kaya3 Nov 28 '22 at 12:35
  • @trincot you can generate n sorted random elements in O(n) time. See Generating Sorted Lists of Random Numbers by Bentley & Saxe (https://pdfs.semanticscholar.org/2dbc/4e3f10b88832fcd5fb88d34b8fb0b0102000.pdf) – Dave Nov 28 '22 at 12:37
  • Why are you telling me? It is not my question. – trincot Nov 28 '22 at 12:41
  • I updated OP to make it a little clearer, hopefully the code example helps – Alexander Mills Nov 29 '22 at 16:28
  • It might help if you elaborated on what result you need - examples of *acceptable* and *not acceptable*? Would `previous + Math.random()/1000` do? – greybeard Dec 06 '22 at 08:18
  • @greybeard I just need a sorted list, I need to read from front/back only (at this time), but I guess I need to "insert in the middle". So far doesn't seem possible, I suppose it might require a special computer architecture to do this, but maybe it's possible (beyond binary search). – Alexander Mills Dec 06 '22 at 08:34

6 Answers6

3

You can generate your numbers in sorted order in linear time. The paper describing how to do this is: Generating Sorted Lists of Random Numbers by Bentley & Saxe

https://pdfs.semanticscholar.org/2dbc/4e3f10b88832fcd5fb88d34b8fb0b0102000.pdf

/**
 * Generate an sorted list of random numbers sorted from 1 to 0, given the size
 * of the list being requested.
 * 
 * This is an implementation of an algorithm developed by Bentley and Sax, and
 * published in in ACM Transactions on Mathematical Software (v6, iss3, 1980) on
 * 'Generating Sorted Lists of Random Numbers'.
 */
public class SortedRandomDoubleGenerator {
    private long       valsFound;
    private double     curMax;
    private final long numVals;

    /**
     * Instantiate a generator of sorted random doubles.
     * 
     * @param numVals the size of the list of sorted random doubles to be
     *        generated
     */
    public SortedRandomDoubleGenerator(long numVals) {
        curMax = 1.0;
        valsFound = 0;
        this.numVals = numVals;
    }

    /**
     * @return the next random number, in descending order.
     */
    public double getNext() {
        curMax = curMax
                * Math.pow(Math.E, Math.log(RandomNumbers.nextDouble())
                        / (numVals - valsFound));
        valsFound++;
        return curMax;
    }
}
Dave
  • 7,460
  • 3
  • 26
  • 39
  • 1
    ^^ The code is for descending order, but you can easily modify. E.g., return (1.0 - curMax) instead of returning curMax. I posted it as an answer to a different but quite similar question here: https://stackoverflow.com/questions/46058304/random-numbers-external-sort. Unsure in this case whether it's better to close as duplicate or post an answer. – Dave Nov 28 '22 at 12:35
  • this looks good, will try to digest it – Alexander Mills Nov 28 '22 at 22:49
  • sadly not sure this does what I want it to do, I get output looking like this: https://gist.github.com/ORESoftware/3be65658851cdd909b5a9fab95f4cf87 I – Alexander Mills Nov 29 '22 at 17:08
  • @AlexanderMills I've used this code and it worked perfectly. Can you post your code? – Dave Nov 29 '22 at 21:31
  • well, I don't think my situation allows me to generate them in sorted order to begin with (because they are random and come from different sources), I need to handle an unordered set, and then put them into an ordered structure. – Alexander Mills Nov 30 '22 at 01:33
  • 1
    @AlexanderMills This should work for you as long as you know how many events you need to generate up front (numVals in the code above). With that, it will generate that many numbers sorted but randomly distributed. If you don't know how many you need up front then this won't help. – Dave Nov 30 '22 at 02:10
  • 1
    @AlexanderMills, you should edit your question to clarify how you get the random input. The code in your question makes us believe that you **generate** the random numbers yourself and know upfront how many of them you are planning to generate. If this is not the actual case, you really need to clarify that in your question. – trincot Dec 06 '22 at 18:30
  • I am am generating random numbers sequentially, it could take on any value high or low, maybe even negative. In OP it just talks about Math.random() but I don't see why it would matter, do you think it would matter? – Alexander Mills Dec 06 '22 at 19:20
2

A heap is only partially ordered: nodes are only ordered with respect to their parents. Perhaps you were thinking of a binary search tree.

A typical implementation of a self-balanced binary search tree is the red-black tree, and it can be implemented using an array with some empty spaces (at most n-2 if you have n elements).

The whole operation will still run in the same amount of time, complexity-wise, as just populating an array and sorting it at the end, although I would bet the latter option is almost always faster.

Nelfeal
  • 12,593
  • 1
  • 20
  • 39
1

You did not specify any constraint on the data structure, so an elementary solution is an array, combined with insertion of the new elements.

Or a linked list.

Or a binary search tree.

But not a heap !

1

As the others who commented have said, there is not really an "automatic" way of doing this in linear time, but i've been using something similar to the following implementation to maintain an ordered realtime orderbook and it has been plenty fast (and contains way more than 1000 values) so far. I just made it a "special" type of array-object, the actual implementation can of course be changed any way you like, i just hijacked the Array.push() method here for a binaryInsert and wanted the other array methods to be available as well, not sure if that's what you're looking for, but maybe it helps somehow:

var data = Array.from({length: 1000}, () => Math.random() * 20);

var sortedArray = Object.create(Array.prototype, {
    compare: {
        value: (a, b) => a - b
    },
    push: {
         value: function(...items) {
            items.forEach((item) => {
                let low = 0;
                let high = this.length;
                
                while (low < high) {
                    const mid = (low + high) >> 1;
                    this.compare(this[mid], item) > 0
                        ? (high = mid)
                        : (low = mid + 1);
                }
            
                this.splice(low, 0, item);
            });

            return this.length;
        }
    }
});

// you can push your random values individually
//const s = performance.now();
//data.forEach((rand) => sortedArray.push(rand));
//const e = performance.now();
//console.log(`individually inserted ${data.length} items sorted in ${e-s}ms`);

// or in bulk, like normal .push()
const s = performance.now();
sortedArray.push(...data);
const e = performance.now();
console.log(`batch inserted ${data.length} items sorted in ${e-s}ms`);

// you can do anything you can with a regular Array, like
console.log('filtered', sortedArray.filter((value) => value >= 10 && value <= 11));

console.log(sortedArray);
exside
  • 3,736
  • 1
  • 12
  • 19
1

The short answer is that, no, you can't end up with a continuously sorted dense array in constant time.

The closest thing that comes to mind is kind of a cross between a Radix sort and a hash table.

Start with an array of 10000 null entries, say, and slot each value into slot # floor(10000*value), where value is a random number from [0-1).

Inserting is O(1), just like a hash table. And everything is sorted... But it's sparse.

You have to expect hash collisions, of course, so each slot would either be a linked list of values or you could "bubble" consecutive values forward in the array to keep it sorted.

That would give you roughly O(n) for inserts, though O(m) for iterating through the list, where m is the total hash table size.

Not sure if this is useful for your goal here, but it seemed like a different approach that someone might find useful.

SomeCallMeTim
  • 4,503
  • 2
  • 28
  • 27
  • what is n in comparison to m? – Alexander Mills Dec 14 '22 at 07:06
  • The n value is the number of floats in the array. The m value is the number of slots in the hash table. If you really want exactly 1000 values, you could set the size of the hash table to 1000, but you'd almost certainly want to use a linked list at each location in the table at that point, and then you're losing speed because of cache misses. Probably better to have the hash table have, say, 1400 floats or so, using the "next slot overflow" method, with "null" being -1 or Nan. – SomeCallMeTim Dec 14 '22 at 16:41
0

@Alexander Mills. If I well understand your statement "So my guess is just some sort algorithm that works with an already sorted array, will be fastest?", I think you're close to the solution, which is an array or table of queues sorted in priority order. One can't avoid prioritizing the discrete events. If the discrete events are not prioritized, it's a typical "Last In, First Serve" situation! It looks like you're trying to simulate a multithreaded system. Am I right? To answer your question "Is there an algorithm of some variety that can keep them sorted the whole time, without ever having to call a sort routine?": if you're looking for a background sorting, the only way is a Thread!

SudoKoach
  • 366
  • 1
  • 6
  • `a typical "Last In, First Serve" situation`? [`I need to read from front/back only (at this time)`](https://stackoverflow.com/questions/74599147/algorithm-to-maintain-a-sorted-numerically-list-of-numbers-doubles/74702700#comment131842016_74599147) – greybeard Dec 06 '22 at 13:13