1

Using ColdFusion, my goal is to output all the elements of an array in a totally random order. My array has 6,000 items which are pulled from an XML file.

Usually, if I didn't require them to be output in a random order, I would do this:

<cfoutput>
<cfloop from="1" to="#arraylen(thestring)#" index="x">
#thestring[x]#<br/>
</cfloop>
</cfoutput>

The result is that my strings are displayed in the order they were when pulled from the XML file:

thestring[1]
thestring[2]
thestring[3]
thestring[4]

Etc. etc. all the way up to thestring[6000].

What I would like to see is all items output in totally random order:

thestring[5]
thestring[22]
thestring[4301]
thestring[201]
thestring[2041]
JPass
  • 33
  • 1
  • 8
  • 1
    Is `thestring` a regular CF array? Can you give an example of the actual versus expected results? – Leigh Nov 03 '11 at 16:01
  • I just updated it to clarify what I'm doing. My original question was not worded very good at all. Hopefully the edit helps. – JPass Nov 03 '11 at 20:16
  • In that case, Raymond's suggestion should would work perfectly. (As long as you track the used indices). Are you saying you tried it and it did not work? Perhaps your code is off. – Leigh Nov 03 '11 at 21:16

5 Answers5

2

I have used stuff like this many times with great success. I have NEVER tried this on 6,000 items though. This is one way to skin a cat.

<!--- create list of unusused indexes --->
<cfset ListOfUnusedIndexes = "">
<cfset ItemsRemaining = "#arrayLen(YourArray)#">

<!--- populate list from 1 to lenght of array --->
<cfloop from="1" to="#arrayLen(YourArray)#" index="i">
   <cfset ListOfUnusedIndexes = listAppend(ListOfUnusedIndexes , i)>
</cfloop>

<!--- loop through ---->
<cfloop from="1" to="#arrayLen(YourArray)#">
  <!--- pick a random index from 1 to the number of items remaining --->
  <cfset RandomNumber = randRange(1, ItemsRemaining)>
  <!--- get that index from the list of unused numbers --->
  <cfset IndexToGet = listGetAt(ListOfUnusedIndexes, RandomNumber)>
  <!--- output the item from your array --->
  <cfset ArrayItemToOutput = thestring[IndexToGet]>
  <!--- remove the index from your list of unusued indexes --->
  <cfset ListOfUnusedIndexes = listDeleteAt(ListOfUnusedIndexes, IndexToGet)>
  <!--- decrement the number of items remaining --->
  <cfset ItemsRemaining = ItemsRemaining - 1>
</cfloop>
Evik James
  • 10,335
  • 18
  • 71
  • 122
  • Although I like your algorithm, Leigh showed that list operations are highly inefficient. – Henry Nov 03 '11 at 22:47
  • Everyone knows that list operations are less efficient, hopefully. The question asked for a solution, not the most processor or code efficient solution. I can't even fathom a real life scenario in which I would need to equally randomize an array with 6k elements. Thanks for admiring the algorithm. – Evik James Nov 04 '11 at 02:44
  • Evik, for some reason this code outputs the same items multiple times. Sometimes the same item was ouptut 2 or 3 times, sometimes just 1 time – JPass Nov 04 '11 at 15:22
  • This is only pseudo-code. I didn't test it. I believe the algorithm would work well if you tweaked it a little, but would be still be extremely inefficient for 6k records. I would highly recommend using the other solutions and not mine. I provided mine for the sole purpose of amusement. :) – Evik James Nov 04 '11 at 15:33
1

Update: This was more of a quick demo to demonstrate an earlier suggestion would work . But a few of the other suggestions are more streamlined. So I would recommend using one of them instead.

Based on your latest update, I see no reason Raymond's suggestion would not work. As long as you keep track of the elements already displayed. Here is a simple example

<cfscript>
    // generate sample array
    yourArray = [];
    for (x = 1; x <= 20; x++) {
            arrayAppend(yourArray, "item "& x);
    }

    // stores used indices
    visited = {};
    for (x = 1; x <= arrayLen(yourArray); x++) {

            // get a random index we have not seen before
            do {
                    index = randRange(1, arrayLen(yourArray));
            }
            while (structKeyExists(visited, index));

            // display it
            WriteOutput(yourArray[index] &"<br>");

            // mark are "visited" so we do not display it again
            visited[index] = true;
    }
</cfscript>
Leigh
  • 28,765
  • 10
  • 55
  • 103
  • No it is actually quite efficient in comparison to list functions. An array of 6000 items took approximately ~187ms versus ~15000ms using list functions. – Leigh Nov 03 '11 at 21:57
  • I can see your algorithm would be great if the code does not need to randomize all 6000 items in a go – Henry Nov 03 '11 at 22:09
  • Not sure I follow. That is exactly what it *does* do. Output the array in a random order. – Leigh Nov 03 '11 at 23:50
  • Side note, try not to delete comments after they have related responses. It makes it hard to follow the thread of the conversation. – Leigh Nov 04 '11 at 00:04
  • maybe he just needs to show 100 per page or so from `thestring` array, then your algorithm would work better then shuffling everything at one go. – Henry Nov 04 '11 at 00:34
  • Ah, gotcha. (More characters to force SO to accept this comment). – Leigh Nov 04 '11 at 01:23
  • Hi Leigh, I changed my chosen answer from yours to Evik's. Not because your solution did not work, it did work perfectly. I'm just not used to using the CFSCRIPT tag so I went with Evik's answer below in order to make it easier for me to do some additional processing during the output. – JPass Nov 04 '11 at 14:58
  • After doing some testing I decided your answer works best and I will use it after all. With a little tinkering I was able to figure out how to add some html that I needed appended to the output. Thanks. – JPass Nov 04 '11 at 15:19
  • @JPass, have you tried my solution? It's not in CFSCRIPT. :) Also, you should cache `arrayLen(yourArray)` if you're using CFSCRIPT. – Henry Nov 04 '11 at 17:14
  • @JPass, are you going to display 6000 items like that? Imagine after you've displayed 5999 items, you are still relying on `randRange(1,6000)` to find the last item that has not been `visited`. This is not a very efficient algorithm. – Henry Nov 04 '11 at 17:42
  • @JPass - You can always rewrite in cfml if you prefer ;) Keep in mind it is just an outline. There are several examples to choose from, with different approaches, code styles and varying levels of efficiency. (Personally, I think Scott's rewrite is better than mine). But do not fall into the "premature optimization pit" ;) Try them all and see which approach is the best fit for your application. – Leigh Nov 04 '11 at 18:29
1

based on Non repeating random numbers , Fisher-yates shuffle algorithm is what you want and it has already been implemented in java.util.Collections's shuffle(), see: Java's Collections.shuffle is doing what?

<cfset indices = []>
<cfset arrayResize(indices, 6000)>
<cfloop from="1" to="6000" index="i">
    <cfset arrayAppend(indices, i)>
</cfloop>

<cfset collections = createObject("java","java.util.Collections")>
<cfset collections.shuffle(indices)>

<cfloop array="#indices#" index="x">
    #thestring[x]#<br/>
</cfloop>

Very efficient. 6000 items only takes ~3ms using a quick getTickCount() test on my PC.

Community
  • 1
  • 1
Henry
  • 32,689
  • 19
  • 120
  • 221
  • I agree it should work. Coincidentally that is the approach they were originally using, but said it did not work. I do not know why .. @JPass - could you elaborate on what was not working? – Leigh Nov 03 '11 at 23:51
  • maybe it did not work because the 'array' is actually a list in CF XML object and although it behaves like an array, it is not mutable by `java.util.Collections` – Henry Nov 04 '11 at 00:37
  • Yep, that is what I was thinking earlier when I asked him/her if the source array was a *CF array*. But did not get a clear response yet. – Leigh Nov 04 '11 at 01:12
  • This works for me, but the code can be streamlined a little bit. 1. No Need for the arrayResize(). 2. You could simply call collections.shuffle() and pass in the original array. 3. Change the cfloop to loop over the original array and simply output the index of each element. Will Post example code. – Scott Stroz Nov 04 '11 at 12:40
1

The potential problem with tracking the 'used' items is that there may be many more iterations than needed to output all the items. For example, lets say element 10 was the first item displayed, by tracking it as used, its possible that 10 will come up in randrange() again before all items have been displayed. The more items that are 'used' the higher the likelihood that we will have to add an iteration to get an element that was not used.

One way we can make sure we only make 6000 iterations is to reduce the size of the array with each iteration and do randRange( 1, arrayLen({array name}).

//set up array
test = [];
for( x=1; x<=6000; x++ ){
    arrayAppend( test, "Item " & x );

}

//here is the meat of the process
done = false;
count = 1;
while( !done ){
    index = randRange( 1, arrayLen( test ) );
    writeOutput( count & " : " & test[ index ] & "<br/>" );
    arrayDeleteAt( test, index );
    count++;
    done = !arrayLen( test );
}

The code above was averaging under 100ms to execute on my development machine.

Also, after testing out Henry's code, I noticed that it could be tweaked a little, here is the slightly streamlined code.

<!--- set up the array --->
<cfset theString = []>
<cfloop from="1" to="6000" index="i">
    <cfset arrayAppend(theString, "Item : " & i) />
</cfloop>

<cfset collections = createObject("java","java.util.Collections") />
<cfset collections.shuffle( theString ) />
<cfoutput>

<cfloop array="#theString#" index="x">
    #x#<br/>
</cfloop>
</cfoutput>

Notice I got rid of the arrayresize() and that you can simply pass the original array into collections.shuffle() and then loop over that.

Scott Stroz
  • 7,510
  • 2
  • 21
  • 25
  • :) why get rid of `arrayresize()`? It is there for a reason, look up `ensureCapacity` in java's Vector. Also, he could not pass the original array into `collections.shuffle()` 'cause it couldn't be manipulated by `Collections` 'cause it is an array in CF's XML object – Henry Nov 04 '11 at 17:11
  • @Scott - Yep, I agree my original code could be optimized even further. At the time JPas said neither `shuffle()` or `randRange(...)` with an array worked - without explaining *why*. So it was more of a quick demo to demonstrate one way that it could work. Still, even with a few extra iterations it performed pretty well ~186ms. – Leigh Nov 04 '11 at 17:22
  • .. and +1 because I prefer yours over mine. – Leigh Nov 04 '11 at 20:05
  • I removed the arrayResize() because the code works fine without it and I loathe including code that is unnecessary. Note, all I did was take your example and try to make it more efficient. – Scott Stroz Nov 06 '11 at 13:43
  • Yep. I like your end product better and would use it over mine. – Leigh Nov 06 '11 at 21:44
0

Loop from 1 to N, with N being the number of times you want to loop, and then simply pick a random element using randRange(1, arraylen(thestring)).

Raymond Camden
  • 10,661
  • 3
  • 34
  • 68
  • Your solution would work for my poorly worded original question. I updated the question to clarify what I'm looking for. Thanks. – JPass Nov 03 '11 at 20:19