Hi developers, I am back again with a question.

I am trying to get some data from this website: https://www.iamsterdam.com/nl/uit-in-amsterdam/uit/agenda. At first I crawled the website, but while doing that it occurred to me that they have an API, which would be a lot faster. So I tried to get the data from the API like this:

get-website.js:

var webPage = require('webpage');
var page = webPage.create();
var settings = {
  operation: "POST",
  encoding: "utf8",
  headers: {
    "Content-Type": "application/json"
  },
  data: JSON.stringify({
    DateFilter: 03112016,
    PageId: "3418a37d-b907-4c80-9d67-9fec68d96568",
    Take: 2,
    Skip: 12,
    ViewMode: 1
  })
};

page.open('https://www.iamsterdam.com/api/AgendaApi/', settings, function(status) {
  console.log(page.content);
  phantom.exit();
});

get-website.php:

$phantom_script = 'get-website.js';

// Note: exec() returns only the last line of the command's output.
$response = exec('phantomjs ' . escapeshellarg($phantom_script));

echo $response;

But what I get back doesn't look good:

Message":"An error has occurred.","ExceptionMessage":"Page could not be found","ExceptionType":"System.ApplicationException","StackTrace":" at Axendo.SC.AM.Iamsterdam.Controllers.Api.AgendaApiController.GetResultsInternal(RequestModel requestModel)\r\n at lambda_method(Closure , Object , Object[] )\r\n
etc.

Here is a Firebug screenshot of the request:

[image: Firebug screenshot]

I hope someone can help me.

N. Smeding

1 Answer

Interesting question. I was a bit surprised that the site would honor the AJAX request in a browser and even in cURL, but not in PhantomJS. In such cases you have to study and replicate the request very carefully, because a single small detail can greatly affect the server's response.

It turned out that a cookie and the form content type had to be set accordingly.

var webPage = require('webpage');
var page = webPage.create();

// courtesy of http://stackoverflow.com/a/1714899/2715393
var serialize = function(obj) {
  var str = [];
  for (var p in obj) {
    if (obj.hasOwnProperty(p)) {
      str.push(encodeURIComponent(p) + "=" + encodeURIComponent(obj[p]));
    }
  }
  return str.join("&");
};

var settings = {
    operation: "POST",
    encoding: "utf8",
    headers: {
        "accept-encoding" : "identity", // https://github.com/ariya/phantomjs/issues/10930#issuecomment-81541618
        "x-requested-with" : "XMLHttpRequest",
        "accept-language" : "en;q=0.8,en-US;q=0.6",
        "authority" : "www.iamsterdam.com",
        "accept":"application/json, text/javascript, */*; q=0.01",
        "content-type" : "application/x-www-form-urlencoded; charset=UTF-8",
        "cookie" : "website#lang=nl"        
    },
    data: serialize({
        Genre: '',
        DateFilter: '03112016',
        DayPart: '',
        SearchTerm: '', 
        Neighbourhoud: '',
        DayRange: '',
        ViewMode: 1, 
        LastMinuteTickets : '',
        PageId: '3418a37d-b907-4c80-9d67-9fec68d96568',
        Skip: 0,
        Take: 12
    }) 
};

page.open('https://www.iamsterdam.com/api/AgendaApi/', settings, function(status) {
    console.log(page.content);
    phantom.exit();
});
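
To see what the `serialize` helper actually sends as the request body, and why `DateFilter` is quoted here, a small standalone sketch may help. It should run in plain Node.js as well as in PhantomJS. The `params` object is a shortened, illustrative subset of the fields above, not the full request. In non-strict JavaScript a bare `03112016` is read as a legacy octal literal, which `parseInt(..., 8)` imitates below.

```javascript
// Same helper as in the answer: flat object -> form-urlencoded string.
var serialize = function (obj) {
  var str = [];
  for (var p in obj) {
    if (obj.hasOwnProperty(p)) {
      str.push(encodeURIComponent(p) + "=" + encodeURIComponent(obj[p]));
    }
  }
  return str.join("&");
};

// Illustrative subset of the request fields (assumption: shortened for the demo).
var params = { DateFilter: "03112016", Genre: "", Skip: 0, Take: 12 };

// Body for "application/x-www-form-urlencoded".
var formBody = serialize(params);

// Body for "application/json", shown only for comparison.
var jsonBody = JSON.stringify(params);

// The number an unquoted 03112016 would become as a legacy octal literal.
var octalValue = parseInt("03112016", 8);

console.log(formBody);
console.log(jsonBody);
console.log(octalValue);
```

With the date kept as a string, the leading zero survives in both body formats. Whether the octal conversion contributed to the original error is not established here; the sketch only shows that the unquoted literal does not carry the intended digits.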
Vaviloff
  • Thx man, you are a legend. All I need to do now is encode it, right? #mvp – N. Smeding Nov 03 '16 at 13:23
  • You're very kind :) what do you mean by "encode it"? – Vaviloff Nov 04 '16 at 01:44
  • I meant decode instead of encode, my bad. I get the response as plain text and not as code, so I expect that the response is JSON, right? – N. Smeding Nov 04 '16 at 08:20
  • No, the response is actually prerendered HTML, so you will have to parse it somehow. Actually you're not gaining much by connecting to that API endpoint directly. – Vaviloff Nov 04 '16 at 11:38
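
Following up on the comments above: since the body might be JSON, raw HTML, or a JSON string that wraps HTML, a defensive decode step before any parsing could look roughly like this. It is only a sketch. `decodeBody` is a hypothetical helper, and the sample strings are made up for the demo; they are not real responses from the iamsterdam.com endpoint. In PhantomJS it would probably make sense to pass it `page.plainText` rather than `page.content`, since `page.content` returns the full document markup around the body.

```javascript
// Hypothetical helper: try JSON first, fall back to the raw text.
var decodeBody = function (text) {
  try {
    return { kind: "json", value: JSON.parse(text) };
  } catch (e) {
    return { kind: "text", value: text };
  }
};

// Made-up sample bodies, for illustration only.
var a = decodeBody('{"Total": 2}');                   // a JSON object
var b = decodeBody('<li class="item">Concert</li>');  // raw HTML
var c = decodeBody('"<li>Concert</li>"');             // HTML wrapped in a JSON string

console.log(a.kind, b.kind, c.kind);
```

Whatever comes out as HTML still has to be parsed separately, which is the point of the last comment: calling the endpoint directly saves less work than it first appears.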