
I'm using a while loop to read a list of usernames from a CSV file. For each username, I have to open a URL and dump the page into a file.

However, the casper.thenOpen callback always runs only once. I understood from "Asynchronous Process inside a javascript for loop" that this happens because it is an asynchronous process. I need to apply the same idea to my code below:

casper.then(function(){
    stream = fs.open('usernames.csv', 'r');
    targetusername = stream.readLine();         
    i = 0;

    while(targetusername) {                 
        var url = "http://blablalb" + targetusername;       
        console.log("current url is " + url);

        casper.thenOpen(url, function() {
            console.log ("I am here");
            fs.write(targetusername,this.getTitle() + "\n",'w');        
            fs.write(targetusername,this.page.plainText,'a');       
        });

        targetusername = stream.readLine();
        i++;
    }

});

The casper.thenOpen callback always runs only once, giving me this output:

current url is first_url
current url is second_url
current url is third_url
I am here

What I need is this:

current url is first_url
I am here
current url is second_url
I am here
current url is third_url
I am here

I'm pulling my hair out trying to get that while loop to run correctly!

Ahmed Tawfik

2 Answers


I don't think there is anything wrong with that code. I wrote this code to test it (basically, it's the same as your code):

var casper = require('casper').create();

var url_list = [
    'http://phantomjs.org/',
    'https://github.com/',
    'https://nodejs.org/'
]

casper.start()

casper.then(function () {
        for (var i = 0; i < url_list.length; i++) {
            casper.echo('assign a then step for ' + url_list[i])
            casper.thenOpen(url_list[i], function () {
                casper.echo("current url is " + casper.getCurrentUrl());
            })
        }
    }
)

casper.run()

Output:

assign a then step for http://phantomjs.org/
assign a then step for https://github.com/
assign a then step for https://nodejs.org/
current url is http://phantomjs.org/
current url is https://github.com/
current url is https://nodejs.org/en/

As you can see, it opened every URL.


So let's answer your questions:

Q1: Why doesn't it produce this output?

current url is first_url
I am here
current url is second_url
I am here
current url is third_url
I am here

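A1: Because CasperJS schedules steps first and runs them later. More precisely, each then or thenOpen call adds a step to a queue, and CasperJS runs the queued steps in order only after the current step function has returned. So the whole loop finishes (printing every "current url is" line) before any thenOpen callback runs. Take a look at that great answer for further information.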
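
To make the ordering concrete, here is a minimal sketch in plain JavaScript that mimics this behaviour. `MiniRunner` is a hypothetical stand-in written for this illustration, not part of CasperJS, and it simplifies things by appending new steps to the end of the list; it only shows that scheduling a step and running it are two separate moments.

```javascript
// Hypothetical stand-in for a step scheduler (NOT the CasperJS implementation).
function MiniRunner() {
    this.steps = [];   // scheduled step functions, run first-in first-out
    this.log = [];     // messages in the order they were produced
}

// Schedule a step; nothing runs yet.
MiniRunner.prototype.then = function (fn) {
    this.steps.push(fn);
    return this;
};

// Run scheduled steps in order. Steps added while running are picked up too.
MiniRunner.prototype.run = function () {
    for (var i = 0; i < this.steps.length; i++) {
        this.steps[i].call(this);
    }
    return this.log;
};

var runner = new MiniRunner();
var urls = ['first_url', 'second_url', 'third_url'];

runner.then(function () {
    urls.forEach(function (url) {
        // Runs immediately, while the loop is still going.
        runner.log.push('current url is ' + url);
        // Only scheduled here; the body runs after the outer step returns.
        runner.then(function () {
            runner.log.push('I am here');
        });
    });
});

var output = runner.run();
console.log(output.join('\n'));
```

All three "current url is" lines are logged before the first "I am here", which matches the order described in A1.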

Q2: Why doesn't it produce this output (that is, why does the callback run only once)?

current url is first_url
current url is second_url
current url is third_url
I am here
I am here
I am here

A2: An exception may be thrown while opening the second URL, causing PhantomJS to crash. This code may help you see what happens:

var casper = require('casper').create({
    verbose: true,
    logLevel: "debug",
}); //see more logs

casper.on('error', function (msg, backtrace) {
    var msgStack = ['PHANTOM ERROR: ' + msg];
    if (backtrace && backtrace.length) {
        msgStack.push('TRACE:');
        backtrace.forEach(function(t) {
            msgStack.push(' -> ' + (t.file || t.sourceURL) + ': ' + t.line + (t.function ? ' (in function ' + t.function +')' : ''));
        });
    }
    this.log(msgStack.join('\n'), "error");
});// watch the error event which PhantomJS emits
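
One possible source of such an exception, offered as an assumption rather than a confirmed diagnosis: in the question's code, `targetusername` is a single shared variable, and the while loop only ends once it is falsy. The thenOpen callbacks run after the loop, so they would all read that final falsy value when calling `fs.write`. The sketch below reproduces the effect in plain JavaScript; `namesSeenByCallbacks` is a hypothetical helper written for this illustration.

```javascript
// Hypothetical helper: collects callbacks in a loop, then runs them afterwards,
// the way step callbacks run only after the scheduling loop has finished.
function namesSeenByCallbacks(lines, captureEach) {
    var pending = [];
    var seen = [];
    var index = 0;
    var current = lines[index];          // shared variable, like targetusername

    while (current) {
        if (captureEach) {
            // Freeze the current value in its own function scope.
            (function (name) {
                pending.push(function () { seen.push(name); });
            })(current);
        } else {
            // The callback reads the shared variable when it finally runs.
            pending.push(function () { seen.push(current); });
        }
        index++;
        current = lines[index];          // undefined past the end, which ends the loop
    }

    pending.forEach(function (callback) { callback(); });
    return seen;
}

var shared = namesSeenByCallbacks(['alice', 'bob', 'carol'], false);
var captured = namesSeenByCallbacks(['alice', 'bob', 'carol'], true);
console.log(shared);
console.log(captured);
```

With the shared variable, every callback sees the value left over when the loop ended; wrapping each iteration in its own function keeps the right name for each callback.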
Sayakiss
  • Thank you for your reply, but do you have any idea why my code is giving that output?? I even replace the while loop with a for loop and same output was shown – Ahmed Tawfik Jun 02 '16 at 08:04
  • @SoCRaT As I said: `You may meet some exceptions in opening second url and PhantomJS crashes`, so did you try my code for collecting my error logs? There is nothing to do with your loops (`while` or `for` doesn't matter). – Sayakiss Jun 02 '16 at 09:25
  • @SoCRaT Do you still have problems about my answer? – Sayakiss Jun 03 '16 at 04:13
  • Hi, I could solve it to produce exactly the result I need using the repeat function. I will post it right away. Thank you very much for your help. – Ahmed Tawfik Jun 03 '16 at 21:04

I managed to get exactly the output I needed:

current url is first_url
I am here
current url is second_url
I am here
current url is third_url
I am here

using the repeat function, as follows:

casper.then(function(){
    stream = fs.open('usernames.csv', 'r');        

    casper.repeat(3, function() {

        targetusername = stream.readLine(); 
        var url = "http://blablalb" + targetusername;       
        console.log("current url is " + url);

        casper.thenOpen(url, function() {
            console.log ("I am here");
            fs.write(targetusername,this.getTitle() + "\n",'w');        
            fs.write(targetusername,this.page.plainText,'a');       
        });

    });
});
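
The repeat count of 3 is hard-coded above. A possible refinement, sketched here under the assumption that the file holds one username per line, is to parse the whole file into an array first and use its length as the count. `parseUsernames` is a hypothetical helper, and the sample text is made up for illustration.

```javascript
// Hypothetical helper: turn the raw file text into a clean list of usernames.
function parseUsernames(text) {
    return text
        .split(/\r?\n/)                                        // handle LF and CRLF line endings
        .map(function (line) { return line.trim(); })          // drop stray whitespace
        .filter(function (line) { return line.length > 0; });  // skip blank lines
}

// Made-up sample input standing in for the contents of usernames.csv.
var usernames = parseUsernames('alice\nbob\r\n\n  carol  \n');
console.log(usernames.length);
console.log(usernames.join(', '));
```

In the script above, the text could come from PhantomJS's `fs.read('usernames.csv')`, and `usernames.length` could then replace the literal 3 passed to `casper.repeat`.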
Ahmed Tawfik