Since Skyscanner only provides its API to large commercial websites, I wanted to build a small application of my own that retrieves flight search results for multiple destinations, for my own non-commercial use.

Getting the result of a flight search turned out to be fairly difficult, because the page takes a few seconds to complete the search and display the results.

Using wget, lynx, links2 or edbrowse didn't work for me: the returned page said that JavaScript is not enabled in my browser, even when links2 was compiled with JavaScript support. Maybe I did something wrong, I don't know.

PhantomJS has come closest so far, and I tried several code fragments with it to retrieve the flight search results.

The fragments came from these sources:

- [Stackoverflow#1][1]
- [Stackoverflow#2][2]
- [Techslides][3]
- [Stackoverflow#3][4]
- [Stackoverflow#4][5]

  [1]: http://stackoverflow.com/questions/18526140/how-to-get-html-generated-from-javascript-using-phantomjs
  [2]: http://stackoverflow.com/questions/28209509/get-javascript-rendered-html-source-using-phantomjs
  [3]: http://techslides.com/grabbing-html-source-code-with-phantomjs-or-casperjs
  [4]: http://stackoverflow.com/questions/12450868/how-to-print-html-source-to-console-with-phantomjs
  [5]: http://stackoverflow.com/questions/8692038/phantomjs-page-dump-script-issue

Even with the time lag described in [Stackoverflow#4][5], it did not work. When the scripts returned successfully at all, the output was only a Skyscanner error page saying that they had a problem.

My last attempt, which produced the error page described above, was:

    var page = require('webpage').create();
    var fs = require('fs');

    var url = 'http://www.skyscanner.at/transport/fluge/nyca/lax/150626/150627/flugpreise-von-new-york-nach-los-angeles-international-im-juni-2015.html?adults=1&children=0&infants=0&cabinclass=economy&rtn=1&preferdirects=false&outboundaltsenabled=false&inboundaltsenabled=false';

    var address = encodeURI(url);

    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('FAIL to load the address');
        } else {
            // page.content is read as soon as the initial page load has finished
            var markup = page.content;
            console.log(markup);
            try {
                // Also save the markup to a file for later inspection
                var f = fs.open('htmlcode.txt', 'w');
                f.write(markup);
                f.close();
            } catch (e) {
                console.log(e);
            }
        }
        phantom.exit();
    });
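
For comparison, here is a minimal sketch of the "wait before reading `page.content`" idea, written as a polling loop instead of a fixed delay. The `waitFor` helper, the `RESULT_SELECTOR` value and passing the search URL as the first command-line argument are my own assumptions for illustration; the selector in particular is a placeholder and not taken from the real Skyscanner page. Since the scripts above ended on an error page, waiting longer may well not change the outcome.

```javascript
// Poll `check` every `intervalMs` milliseconds until it returns true
// (then call onReady) or until `timeoutMs` has elapsed (then call onTimeout).
function waitFor(check, onReady, onTimeout, timeoutMs, intervalMs) {
    var start = new Date().getTime();
    var timer = setInterval(function () {
        if (check()) {
            clearInterval(timer);
            onReady();
        } else if (new Date().getTime() - start >= timeoutMs) {
            clearInterval(timer);
            onTimeout();
        }
    }, intervalMs);
}

// This part only runs under PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var fs = require('fs');
    var url = require('system').args[1]; // search URL given on the command line

    // Hypothetical placeholder: replace with a selector that only matches
    // once the search results have been rendered.
    var RESULT_SELECTOR = '.hypothetical-result-row';

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('FAIL to load the address');
            phantom.exit(1);
            return;
        }
        waitFor(
            function () {
                // Runs inside the page: is a matching element present yet?
                return page.evaluate(function (selector) {
                    return document.querySelector(selector) !== null;
                }, RESULT_SELECTOR);
            },
            function () {
                fs.write('htmlcode.txt', page.content, 'w');
                phantom.exit(0);
            },
            function () {
                console.log('Timed out waiting for results');
                phantom.exit(1);
            },
            15000, // give up after 15 seconds
            250    // check four times per second
        );
    });
}
```

Saved as, say, `waitfor.js` (a made-up file name), it would be started with `phantomjs waitfor.js '<search url>'`, which also fits a shell-script or PHP wrapper.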

Has anyone tried something like this before and succeeded? How did you get it working? I am trying to build a PHP-based and/or shell-script-based solution on a GUI-less Debian Linux system.

Kevin

1 Answer

I work in engineering at Skyscanner. This isn't an answer to your question, but if you end up on that error page (or a captcha page), it is likely that our bot-blocker is catching you, which is kind of "by design" :)

I can get you an API key, with a conservative rate limit. Would that be of interest?

Cheers,

Iain

iain
  • Hello iain, an API key would of course be of interest. I also sent a mail to Skyscanner in which I described my project and what I want to do with the API key. As described, it's for private use only. If you could get me an API key, that would be the very best solution. Thank you in advance. Cheers, Kevin – Kevin Jul 03 '15 at 12:12
  • Cool, can you drop me an email at iain.wilson[at]skyscanner.net and I will send you a key. – iain Jul 08 '15 at 07:29