I'm trying to open a web page which requires HTTP authentication, in PhantomJS. My script is based off the loadspeed.js example:

var page = require('webpage').create(),
    t, address;
page.settings.userName = "user";
page.settings.password = "password";
if (phantom.args.length === 0) {
  console.log('Usage: loadspeed.js <some URL>');
  phantom.exit();
} else {
  t = Date.now();
  address = phantom.args[0];
  page.open(address, function (status) {
      if (status !== 'success') {
          console.log('FAIL to load the address');
      } else {
          t = Date.now() - t;
          console.log('Loading time ' + t + ' msec');
          page.render('page.jpg');
      }
      phantom.exit();
  });
}

I can see from the rendered page.jpg that I'm getting a 401 every time. I've also traced the HTTP session using Wireshark, which reveals that no authentication header is sent in the GET request to the given URL.

What am I doing wrong here? I'm just getting started with PhantomJS but I've been searching all evening and not gotten far...

oberlies
Karl Barker
  • What browser? Chrome 19 just does not allow you to make an XHR that sets the username and password. This started when they disallowed the username:password@ portion of URLs. HTTP Auth against a different website is a tricky business. I guess I'll write a blog about this topic next weekend or so. – panzi May 29 '12 at 00:18
  • Not to do directly with this question, but I want to point out that as of PhantomJS 1.9.2 and SlimerJS 0.8.4, your authentication information (whether done with `page.settings` or `page.customHeaders`) gets sent to all 3rd party servers referenced on that page. (E.g. if the page you are logging in to uses a CDN for their jQuery then that CDN server gets your username and password; similarly for ad servers.) SlimerJS, at least, is working on a solution. – Darren Cook Oct 31 '13 at 00:57
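
One possible way to limit the exposure described in the comment above is to attach the header per request, and only when the request goes to the same origin as the page being opened. The sketch below assumes a PhantomJS version whose `onResourceRequested` callback receives a `networkRequest` object with `setHeader` (1.9 or later, as far as I know). `originOf` and `sameOrigin` are hypothetical helpers written for this example, the credentials are placeholders, and I have not verified this against every PhantomJS release.

```javascript
// Return scheme://host[:port] of an absolute URL, or null otherwise.
// originOf and sameOrigin are hypothetical helpers, not PhantomJS APIs.
// Note: this is a plain string comparison, so an explicit default port
// (http://host:80) is treated as different from http://host.
function originOf(url) {
    var m = /^([a-z][a-z0-9+.\-]*:\/\/[^\/?#]+)/i.exec(url);
    return m ? m[1].toLowerCase() : null;
}

function sameOrigin(a, b) {
    var oa = originOf(a);
    return oa !== null && oa === originOf(b);
}

// Only run the PhantomJS part when the phantom global exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create(),
        system = require('system');

    if (system.args.length < 2) {
        console.log('Usage: loadspeed.js <some URL>');
        phantom.exit(1);
    } else {
        var address = system.args[1],
            authHeader = 'Basic ' + btoa('username:password'); // placeholders

        // Add the header only to requests for the target's own origin,
        // so CDNs and ad servers referenced by the page never see it.
        page.onResourceRequested = function (requestData, networkRequest) {
            if (sameOrigin(requestData.url, address)) {
                networkRequest.setHeader('Authorization', authHeader);
            }
        };

        page.open(address, function (status) {
            console.log('Status: ' + status);
            phantom.exit();
        });
    }
}
```

The trade-off is that any resource on another host that genuinely needs the same credentials will no longer receive them.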

2 Answers

PhantomJS (at least as of 1.9.0) has a bug with auth: it sends the request without the auth headers, and then only after it gets the 401 back does it do the request again but this time with the headers. (That is for GET; with POST it doesn't work at all.)

The workaround is simple. Instead of:

page.settings.userName = 'username';
page.settings.password = 'password';

you can use:

page.customHeaders = {'Authorization': 'Basic ' + btoa('username:password')};

(I just covered this in a blog post: http://darrendev.blogspot.jp/2013/04/phantomjs-post-auth-and-timeouts.html, and learnt that workaround on the PhantomJS mailing list from Igor Semenko.)
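
Applied to a script like the one in the question, the workaround might look like the sketch below. `basicAuthHeader` is a hypothetical helper name and the credentials are placeholders. The `Buffer` fallback is an assumption added only so the helper can also be tried under Node.js, where `btoa` may be missing; PhantomJS itself uses the `btoa` branch.

```javascript
// Build the value of a Basic "Authorization" header.
// basicAuthHeader is a hypothetical helper, not a PhantomJS API.
function basicAuthHeader(user, password) {
    var raw = user + ':' + password;
    // PhantomJS provides btoa; fall back to Buffer when run under Node.js.
    var encoded = (typeof btoa === 'function')
        ? btoa(raw)
        : Buffer.from(raw, 'binary').toString('base64');
    return 'Basic ' + encoded;
}

// Only run the PhantomJS part when the phantom global exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create(),
        system = require('system');

    if (system.args.length < 2) {
        console.log('Usage: loadspeed.js <some URL>');
        phantom.exit(1);
    } else {
        // Send the credentials with the first request instead of
        // waiting for the server to answer with a 401.
        page.customHeaders = {
            'Authorization': basicAuthHeader('username', 'password')
        };

        page.open(system.args[1], function (status) {
            if (status !== 'success') {
                console.log('FAIL to load the address');
            } else {
                console.log('Loaded with the Authorization header set');
            }
            phantom.exit();
        });
    }
}
```

Keep in mind that `btoa` only accepts characters in the Latin-1 range, so credentials containing other characters would need to be encoded differently.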

Darren Cook
  • PhantomJS 1.9.2 on my machine behaves erratically. `page.settings.userName` and `password` sometimes work and sometimes do not. It's probably not related to the missing 401 stage, because I work all the time with the same remote server. With `customHeaders` it seems to work always. – quetzalcoatl Apr 01 '14 at 12:08
  • This question is so old that I have no idea which phantomJS version I was using, but presumably it had some variation on this bug. – Karl Barker Aug 21 '14 at 08:15
  • I had to use the workaround with 1.9.12 and don't forget to include the btoa module. – Pier-Luc Gendreau Dec 08 '14 at 20:44
  • @Pier-LucGendreau I thought btoa was built-in? If something has changed, which module needs to be included for it? (P.S. Thanks for confirming it is still needed as of 1.9.12) – Darren Cook Dec 08 '14 at 20:57
  • @DarrenCook I used this module: https://www.npmjs.org/package/btoa but there's another method that doesn't require an additional dependency: http://stackoverflow.com/questions/23097928/node-js-btoa-is-not-defined-error – Pier-Luc Gendreau Dec 08 '14 at 21:41
  • @Pier-LucGendreau Both those links are for Node.js, not PhantomJS. (I haven't actually tested this script with Phantom 1.9.12, but `btoa` definitely did not need any extra modules as of 1.9.0.) – Darren Cook Dec 08 '14 at 22:50
  • Right. I assumed this question was Node! – Pier-Luc Gendreau Dec 09 '14 at 00:20

I don't think there is anything wrong with the script you're using or with PhantomJS (at least in v1.5).

If you try this script:

var page = require('webpage').create(),
    system = require('system'),
    t, address;

page.settings.userName = 'test';
page.settings.password = 'test';

if (system.args.length === 1) {
    console.log('Usage: loadspeed.js <some URL>');
    phantom.exit();
} else {
    t = Date.now();
    address = system.args[1];
    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('FAIL to load the address');
        } else {
            t = Date.now() - t;
            console.log('Page title is ' + page.evaluate(function () {
                return document.title;
            }));
            console.log('Loading time ' + t + ' msec');
        }
        phantom.exit();
    });
}

phantomjs loadspeed.js http://browserspy.dk/password-ok.php

The auth is successful.
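
To check the outcome without rendering a screenshot or running Wireshark, the HTTP status of each response can be logged from inside the script. This is a minimal sketch using the `page.onResourceReceived` callback. `describeResponse` is a hypothetical helper, and whether an initial 401 shows up here before PhantomJS retries with credentials is an assumption that may vary by version.

```javascript
// Format one response for logging; describeResponse is a hypothetical helper.
function describeResponse(response) {
    var note = response.status === 401
        ? ' (credentials missing or rejected)'
        : '';
    return response.status + ' ' + response.url + note;
}

// Only run the PhantomJS part when the phantom global exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create(),
        system = require('system');

    page.settings.userName = 'test';
    page.settings.password = 'test';

    // The callback fires more than once per resource;
    // log only when a response has finished.
    page.onResourceReceived = function (response) {
        if (response.stage === 'end') {
            console.log(describeResponse(response));
        }
    };

    if (system.args.length < 2) {
        console.log('Usage: loadspeed.js <some URL>');
        phantom.exit(1);
    } else {
        page.open(system.args[1], function (status) {
            console.log('Load status: ' + status);
            phantom.exit();
        });
    }
}
```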

David