
I have a CasperJS script that submits a form and scrapes the response.

I'm trying to set up a "scraping on demand" environment: I POST the form values to a URL served by the PhantomJS web server, use that data in my CasperJS script to scrape the page, and then print the result in the HTTP response. I don't see how to pass the POST variable into Casper and then pass the scraped result back to Phantom.

Here's my basic PhantomJS/CasperJS structure:

var server = require('webserver').create();

server.listen(8080, function(request, response) {

    phantom.casperPath = '/source/casper/casperjs';
    phantom.injectJs('/source/casper/casperjs/bin/bootstrap.js');

    var address = request.post.address;

    var casper = require('casper').create();

    casper.start();

    casper.then(function() {
        address = // want to access global address here
        result = begin(this, address);  // contains the Casper scrape code
    });

    casper.run(function() {
        this.exit();
    });

    response.statusCode = 200;
    response.write(result);  // from casper
    response.close();
});

Is there any way to access the PhantomJS variables from CasperJS, and then pass data back once I finish scraping?

Jeff Ryan

1 Answer


Unless you are doing something in PhantomJS that cannot be done in CasperJS, you are probably better off firing up a server in CasperJS and responding with the results of your Casper functions there.

Based on: https://stackoverflow.com/a/16489950/1096083

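// Run this with the casperjs executable so that require('casper') resolves.

// load the web server module
var server = require('webserver').create();

// Address and port to listen on. This was not defined in the original
// snippet, so the value here is only an example.
var ip_server = '127.0.0.1:8080';

// start the web server
var service = server.listen(ip_server, function(request, response) {

    var results;

    // This may not work the way you would expect and can need a little help:
    // as far as I know, request.post only holds parsed form fields when the
    // body was sent as application/x-www-form-urlencoded.
    var address = request.post.address;
    var casper = require('casper').create();

    casper.start(address, function() {
        // do some stuff with casper
        // store it in results
    });

    casper.then(function() {
        // do some more stuff with casper
        // store that in results too
    });

    // Passing a callback to run() stops Casper from exiting on its own,
    // so the response is written only after every step has finished.
    casper.run(function() {
        response.statusCode = 200;
        response.write(results);
        response.close();
    });

});

console.log('Server running at http://' + ip_server + '/');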
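
The comment on `request.post.address` above hints that the POST data may need some help before it is usable. Here is a minimal sketch of one way to handle that, assuming the body can arrive either as an already-parsed object or as a raw URL-encoded string. `parseFormBody` and `getPostField` are hypothetical helpers written for illustration; they are not part of PhantomJS or CasperJS.

```javascript
// Hypothetical helper: turns a raw url-encoded body such as
// "address=http%3A%2F%2Fexample.com" into a plain object of fields.
function parseFormBody(body) {
    var fields = {};
    if (typeof body !== 'string' || body.length === 0) {
        return fields;
    }
    body.split('&').forEach(function (pair) {
        if (pair === '') {
            return;
        }
        var idx = pair.indexOf('=');
        var rawKey = idx === -1 ? pair : pair.slice(0, idx);
        var rawValue = idx === -1 ? '' : pair.slice(idx + 1);
        // Form encoding uses "+" for spaces, so swap those before decoding.
        var key = decodeURIComponent(rawKey.replace(/\+/g, ' '));
        fields[key] = decodeURIComponent(rawValue.replace(/\+/g, ' '));
    });
    return fields;
}

// Hypothetical helper: accepts either an already-parsed object or a raw
// string body, and returns the named field (undefined if it is missing).
function getPostField(post, name) {
    var fields = (post && typeof post === 'object') ? post : parseFormBody(post);
    return Object.prototype.hasOwnProperty.call(fields, name) ? fields[name] : undefined;
}
```

Inside the `server.listen` callback this could be used as `var address = getPostField(request.post, 'address');`, and the handler could answer with an error status when the result is `undefined` instead of starting Casper with no URL.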
TJ Nicolaides
  • There is a comment below the answer on that page that raises memory issues with this method. Have you experienced this at all? – Jeff Ryan May 27 '14 at 21:16
  • The project I was using this for never really made it past the experimental stage - so I couldn't verify it. – TJ Nicolaides May 29 '14 at 02:10