0

note: I may be using the word recursively wrong

I'm trying to create a script that runs through a long string of text, searching for the recurrence of substrings. For this I essentially do a for-loop from 12 to 3 (lengths of strings I want to check for recurrence), and for each of those lengths I use another for loop to define if the string of that length is present multiple times. A dumbed down example (untested, just for demonstration):

for(var len=12; len>3; len--) {
    var recurring = [];
    for(var pos = 0; pos < (inputString.length - len); pos++) {
        var mystring = inputString.substr(pos, len);
        // do a regex to check if mystring occurs more than once
        // if so, push it into recurring and remove from inputString to prevent partial duplicates when mystring gets shorter
    }
    // output recurring strings
 }

This all works, but obviously when I run this on a very long inputString it gets very slow. In fact, it freezes the browser, and any progress output that I want to show while the script runs is paused and shows up all at once when the script is done.

To prevent this, I created a function that is supposed to run through the for-loop in small batches at a time, using a callback function to proceed to the next batch. By using a basic version of this, I was able to output progress onto the page during what used to be a for-loop, even if it's going through thousands of loops:

var fragmentedFor = function($aFragmentLength, $aTotal, $aFunction, $aCallback) 
{
    var $lStart = 0, 
        $lFragmentLength = $aFragmentLength,
        $lTotal = $aTotal,
        $lFunction = $aFunction,
        $lCallback = $aCallback;

    // simulate a for loop, but in fragments to prevent the browser from freezing
    (function forLoop ($aStart)
    {
        // run x loops at a time
        var $lEnd = $lStart + $lFragmentLength;

        // can't run the loop past the total length
        if($lEnd > $lTotal ) 
            $lEnd = $lTotal;

        // the fragmented for loop
        for(var $i = $lStart; $i < $lEnd; $i++) {
            // run the function here    
            $lFunction({count: $i});
            }

            // next time, start from where we just left
        $lStart += $lFragmentLength;

        // is there room for more loops?
        if($lStart < $aTotal) {
            setTimeout(forLoop, 10);
        } else { 
            // if not, run the callback
            $lCallback();
        }
    })();
}

When I run this just once it seems to do what I want:

function myLoop(x) {
    new fragmentedFor(
        10,  // amount of loops per run
        x,  // total amount of loops
        function($aData) { console.log($aData.count); },  
        function() { console.log('finished') }
    );
}
myLoop(1000);

However, when I want to call this one nested within another instance of such a loop, I run into trouble:

new fragmentedFor(
    1,
    5,
    function($aData) { myLoop($aData.count * 100) },
    function() { console.log('finished all') }
)

(I have copied my current files here: http://files.litso.com/vigenere/so/, which includes the fragmentedFor function so you can essentially paste these two functions in the console and see the result. This was a lot easier than putting it on jsfiddle and making that work, if you guys don't mind)

What appears to be happening is that the first run of myLoop is started, but since it's trying to run 0 * 100 times it ends safely, going to the second run. The second run of myLoop should go up to 100, but it runs to 19 before suddenly run 3 of myLoop kicks in.

From there on, the output seems to be totally mixed up, I figure because my callbacks are implemented incorrectly and the loops are not really waiting for eachother to finish.

What am I doing wrong here? How can I even start debugging where the problem lies, and how can I make sure that the independent runs of the for loop actually wait for eachother to finish?

-edit-

Here's a jsfiddle of an older working copy: http://jsfiddle.net/u2aKX/ This did not incorporate the fragmentedFor loop, it does have some callback functions which seemed to improve the performance compared to regular nested for-loops by 100%.

Stephan Muller
  • 27,018
  • 16
  • 85
  • 126
  • another note: I've been thinking of replacing the outer fragmentedFor() by a function that relies on custom events, starting the next round of myLoop by firing an event within when it's finished. This is easy with jQuery but honestly I just want this to work like I'm trying to do – Stephan Muller Jun 01 '13 at 19:18
  • i suggest you improve the original nested loop and not complicate it beyond necessary. can you publish the original on jsfiddle? – muratgu Jun 01 '13 at 19:24
  • I'll try to put it on jsfiddle, thanks. The problem with the original loop is that there's no way to prevent the browser from freezing other than this solution, which I actually took from another SO question: http://stackoverflow.com/questions/2995805/show-javascript-execution-progress/2995900 (answer by Juriy) – Stephan Muller Jun 01 '13 at 19:27
  • linked solution suggests using `setTimeout`. have you tried it? – muratgu Jun 01 '13 at 19:33
  • setTimeout is actually present in my fragmentedFor. I've tried multiple ways to use it, but this construction seemed the only way that really did what I need – Stephan Muller Jun 01 '13 at 19:39
  • You should remove all those dollar signs for readabiltiy… and there's no reason to distinguish between local vars and arguments - they're both variables in the same scope and don't need initialisation. – Bergi Jun 01 '13 at 19:43
  • Sorry, these are the coding standards I'm used to from my job, whether or not they are useful is an ongoing discussion but I can't really help but write this way. Regarding the local vars, I assumed that they might collide, but it's good to know that porting from arguments ($aData) to local ($lData) won't be necessary. – Stephan Muller Jun 01 '13 at 19:43
  • Why don't you tokenize the long string-to-be-searched and see if each token appears in your list and keep track of the number of hits? – Tim Jun 01 '13 at 19:56
  • I'm not sure what you mean by tokens? – Stephan Muller Jun 01 '13 at 20:32

1 Answers1

1

I figure because my callbacks are implemented incorrectly and the loops are not really waiting for eachother to finish.

What am I doing wrong here?

Your fragmentedFor expects the $aFunction to be synchronous, and as soon as it returns it schedules the next loop turn.

Yet, by using a nested fragmentedFor in that function, you're making it asynchronous - and the iterations of the inner loops will start to crossfire.

As you recogniced, you will have to let them wait to finish - which means hooking the next iteration on the callback of the previous one. You will have to write another fragmentedFor that deals with asynchronous iteration steps:

function fragmentedForAsync(from, to, body, callback) {
    var i = from;
    (function loop() {
        if (i < to)
            body(i++, loop); // pass itself as callback
        else
            callback();
    })();
}

and then use it like this:

fragmentedForAsync(1, 5, function(i, callback) {
    fragmentedFor(10, i*100, function(j) {
        console.log(j);
    }, function() {
        console.log('finished '+1);
        callback(); // next turn of outer loop
    });
}, function() {
    console.log('finished all')
});
Bergi
  • 630,263
  • 148
  • 957
  • 1,375
  • Ah, i had a hunch that something like this was the case. I'm currently unable to try this but your logic is sound and it looks like a good solution. I'll try it out as soon as I'm at my desktop pc, thanks! – Stephan Muller Jun 01 '13 at 20:35