I'm writing a userscript that needs to import/read from a large text file.

The text file is several GB in size, and calling split('\n') on it would create an array with millions of elements.

You're going to say 'Oh but why do you need to process so much data'. I do need to, but I don't need to store it in an array. I accept that.

When running the userscript under Firefox, it hangs if the array that stores the text file has a length over about 50,000. That's fair enough.

Clearly I need a way to iterate over a large text file line by line (delimited by '\n') without storing the entire file in a string and then in an array. I suspect JavaScript isn't going to be great for this, but as this is a userscript I'm not sure what my other options are.

Not that my code really matters, but here's what I'm doing in a nutshell. This is a research project and nothing sinister. The code below is more or less pseudocode.

var wordListString = GM_getResourceText("wordlist");
var wordListArray = wordListString.split('\n');

var username = "TestUsername";
$userField.val(username);

for (var i = 0; i < wordListArray.length; i++)
{
    (function(o) {
        setTimeout(function()
        {
            if ($userField.val() != username)
                $userField.val(username);
            $passwordField.val(wordListArray[o]);
            setTimeout(function() {$loginButton.click()}, 50);
        }, o * 1000);
    })(i);
}
Edge
  • You should do it in batches: instead of splitting the whole text, split a substring, process it, then get the next set of lines, process it, and so on. I can make a fiddle to illustrate. – juvian Mar 30 '14 at 00:48
  • Is it just me, or is this a bruteforcer? – Biduleohm Mar 30 '14 at 00:49
  • Hahaha yeah, just read the code – juvian Mar 30 '14 at 00:50
  • I'm not sure how splitting the resource file within the script would work without physically splitting the resource files. – Edge Mar 30 '14 at 01:00
  • Yes, it's a dictionary attack, but it's targeted at a small website I set up and own. I mean honestly, most proper sites have lockouts after a few wrong tries anyway. I'm doing this on the side of a CS project. – Edge Mar 30 '14 at 01:06
  • True. How are you planning to store GBs of data in a single wordListString variable? – juvian Mar 30 '14 at 01:09
  • That's what I wish to get around. How can one process a text file line by line without storing the whole thing? – Edge Mar 30 '14 at 01:12
  • The best I found was this; check if it suits your needs: http://stackoverflow.com/questions/17472313/using-javascript-filereader-with-huge-files – juvian Mar 30 '14 at 01:23
  • That may be a good idea. Thank you, juvian. – Edge Mar 30 '14 at 01:31
  • Make a simple webpage, on your server, that serves up the results a page at a time, based on a URL parameter. Your script would then: (1) use `GM_xmlhttpRequest()` to get each page, (2) use those values to assault your test site, (3) repeat as necessary. ... You'd be better off doing this in Python (or similar); a userscript is a crude, slow, brittle tool for this kind of thing. – Brock Adams Mar 30 '14 at 05:42
  • How would I go about serving up results a part at a time with URL discrimination? – Edge Mar 30 '14 at 07:32
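
The batching idea from the comments (take a piece of the text, process its complete lines, then move on to the next piece) could look roughly like the sketch below. It only covers the line-splitting step: `createLineSplitter` is a hypothetical helper name, and how the chunks are obtained (slices of a file, or pages fetched from a server) is an assumption left outside the sketch. The one detail that needs care is that a chunk boundary can fall in the middle of a line, so the trailing partial line is held back until the next chunk arrives.

```javascript
// Hypothetical helper (not part of any library): accepts text chunks one at
// a time and calls onLine for each complete line. Only the trailing partial
// line is kept in memory between chunks, never the whole file.
function createLineSplitter(onLine) {
    var remainder = "";
    return {
        // Feed the next chunk of text.
        push: function (chunk) {
            var parts = (remainder + chunk).split("\n");
            // The last piece may be an incomplete line; keep it for later.
            remainder = parts.pop();
            for (var i = 0; i < parts.length; i++) {
                onLine(parts[i]);
            }
        },
        // Call once after the final chunk to emit a last unterminated line.
        flush: function () {
            if (remainder !== "") {
                onLine(remainder);
                remainder = "";
            }
        }
    };
}

// Usage with made-up chunks standing in for pieces of the real file.
var seen = [];
var splitter = createLineSplitter(function (line) { seen.push(line); });
splitter.push("alpha\nbr");       // "br" is held back as a partial line
splitter.push("avo\ncharlie\n");  // completes "bravo", then "charlie"
splitter.push("delta");           // no newline yet, so nothing is emitted
splitter.flush();                 // emits "delta"
```

With this shape, memory use is bounded by the chunk size plus the longest line, rather than by the size of the file.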

0 Answers