Below is my code, which currently works fine, but I want to optimize it so that it does not load or download certain resources (fonts, images, CSS, JS). I've read the API docs but I can't find the related configuration options. I'm using WebdriverIO with PhantomJS as the browser.

'use strict';

var _ = require('lodash');
var webdriverio = require('webdriverio');
var cheerio = require('cheerio');

/**
 * Base class for browser based crawler.
 * To run this crawler you need to first run phantomJS with webdriver on localhost
 * ```
 * ./phantomjs --webdriver 4444
 * ```
 */
class BaseWebdriverIO {
  /**
   * Constructor
   * @param opts - webdriverio config http://webdriver.io/guide/getstarted/configuration.html
   */
  constructor(opts) {
    this.opts = _.defaults(opts || {}, {
      desiredCapabilities: {
        browserName: 'phantomjs'
      }
    });
  }

  /**
   * webdriver and parse url func
   * @param parseUrl
   * @returns {Promise}
   */
  parse(parseUrl) {
    console.log("getting url", parseUrl);

    return webdriverio.remote(this.opts)
      .init()
      .url(parseUrl)
      .waitForVisible('body')
      .getHTML('body', false, function(err, html) {
        if (err) {
          throw new Error(err);
        }

        this.end();
        return cheerio.load(html);
      });
  }
}

module.exports = BaseWebdriverIO;

I can't find any documentation on this. Can anyone tell me how to do it?

Edit/Update: I found a working example that prevents images from loading by setting phantomjs.cli.args, from here: https://github.com/angular/protractor/issues/150#issuecomment-128109354 With some basic settings configured it works fine. This is the modified desiredCapabilities object:

desiredCapabilities: {
  'browserName': 'phantomjs',
  'phantomjs.binary.path': require('phantomjs').path,
  'phantomjs.cli.args': [
    '--ignore-ssl-errors=true',
    '--ssl-protocol=any', // tlsv1
    '--web-security=false',
    '--load-images=false',
    //'--debug=false',
    //'--webdriver-logfile=webdriver.log',
    //'--webdriver-loglevel=DEBUG',
  ],
  javascriptEnabled: false,
  logLevel: 'verbose'
}
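
To make the flag list easier to maintain, the capabilities object above could be generated from a plain options map. This is only a minimal sketch: `toCliArgs` and `buildPhantomCapabilities` are hypothetical helper names, not part of WebdriverIO or PhantomJS, and the flags are the ones already used in this question.

```javascript
'use strict';

// Hypothetical helper (not a WebdriverIO API): turns a plain options
// object into PhantomJS command-line flags of the form --name=value.
function toCliArgs(flags) {
  return Object.keys(flags).map(function (name) {
    return '--' + name + '=' + String(flags[name]);
  });
}

// Hypothetical helper: builds a desiredCapabilities object shaped like
// the one in the question. binaryPath is optional.
function buildPhantomCapabilities(flags, binaryPath) {
  var caps = {
    browserName: 'phantomjs',
    'phantomjs.cli.args': toCliArgs(flags)
  };
  if (binaryPath) {
    caps['phantomjs.binary.path'] = binaryPath;
  }
  return caps;
}

var capabilities = buildPhantomCapabilities({
  'ignore-ssl-errors': true,
  'ssl-protocol': 'any',
  'web-security': false,
  'load-images': false
});

console.log(capabilities['phantomjs.cli.args']);
```

The resulting object can be passed as `desiredCapabilities` in the options given to `webdriverio.remote()`, exactly as the hand-written version is.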

For CSS and fonts, I found a Stack Overflow question, "How can I control PhantomJS to skip download some kind of resource?", and the solution discussed there looks like this:

page.onResourceRequested = function(requestData, request) {
    if ((/http:\/\/.+?\.css/gi).test(requestData['url']) || requestData['Content-Type'] == 'text/css') {
        console.log('The url of the request is matching. Aborting: ' + requestData['url']);
        // request.abort(); 
        request.cancel(); 
    }
};
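
The matching logic in that handler can be separated from the PhantomJS wiring so it is easy to check on its own under Node. The following is a sketch under stated assumptions: the extension list and the names `shouldBlock` and `BLOCKED_EXTENSIONS` are illustrative, and it calls `abort()` on the request object, which is the method the PhantomJS documentation describes for cancelling a request.

```javascript
'use strict';

// Extensions treated as skippable. This list is an assumption; adjust as needed.
var BLOCKED_EXTENSIONS = [
  'css', 'woff', 'woff2', 'ttf', 'otf', 'eot', 'png', 'jpg', 'jpeg', 'gif', 'svg'
];

// Matches a blocked extension at the end of the URL, optionally followed by
// a query string or fragment. No `g` flag, so test() keeps no state between calls.
var BLOCKED_PATTERN = new RegExp(
  '\\.(' + BLOCKED_EXTENSIONS.join('|') + ')([?#].*)?$',
  'i'
);

function shouldBlock(url) {
  return BLOCKED_PATTERN.test(url);
}

// Same shape as a PhantomJS onResourceRequested callback:
// requestData carries the URL, networkRequest exposes abort().
function onResourceRequested(requestData, networkRequest) {
  if (shouldBlock(requestData.url)) {
    networkRequest.abort();
  }
}

// Quick check with a stand-in request object instead of a real PhantomJS one.
var fakeRequest = { aborted: false, abort: function () { this.aborted = true; } };
onResourceRequested({ url: 'https://example.com/style.css?v=3' }, fakeRequest);
console.log(fakeRequest.aborted);
```

Keeping the predicate separate means the list of blocked types can be changed and verified without starting a browser.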

But I'm not able to register this function, i.e. onResourceRequested(), through the desiredCapabilities object in WebdriverIO's config.

Can anyone tell me how I can define or call this function from my WebdriverIO script, through the capabilities or in any other way? Thanks.
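
One direction that might be worth trying, offered as an untested sketch rather than a confirmed answer: PhantomJS's built-in WebDriver (GhostDriver) is understood to expose an extra endpoint, `POST /session/:sessionId/phantom/execute`, which evaluates a script inside PhantomJS with `this` bound to the current page. The endpoint path, the payload shape, and the `this` binding are assumptions here, and `buildPhantomExecuteRequest` and `sendPhantomScript` are hypothetical helper names. How to obtain the session id from WebdriverIO is left to the caller.

```javascript
'use strict';

var http = require('http');

// Script text to be evaluated inside PhantomJS. Assumption: `this` is the
// current page object when run through the phantom/execute endpoint.
var BLOCK_SCRIPT = [
  'var page = this;',
  'page.onResourceRequested = function (requestData, networkRequest) {',
  '  if (/\\.(css|woff2?|ttf|otf|eot)([?#].*)?$/i.test(requestData.url)) {',
  '    networkRequest.abort();',
  '  }',
  '};'
].join('\n');

// Hypothetical helper: describes the HTTP request without sending it.
function buildPhantomExecuteRequest(sessionId, script) {
  return {
    method: 'POST',
    path: '/session/' + encodeURIComponent(sessionId) + '/phantom/execute',
    body: JSON.stringify({ script: script, args: [] })
  };
}

// Hypothetical helper: sends the request to a PhantomJS instance started
// with --webdriver (for example host 'localhost', port 4444).
function sendPhantomScript(host, port, sessionId, script, callback) {
  var spec = buildPhantomExecuteRequest(sessionId, script);
  var req = http.request({
    host: host,
    port: port,
    method: spec.method,
    path: spec.path,
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(spec.body)
    }
  }, function (res) {
    var chunks = [];
    res.on('data', function (chunk) { chunks.push(chunk); });
    res.on('end', function () {
      callback(null, res.statusCode, Buffer.concat(chunks).toString());
    });
  });
  req.on('error', callback);
  req.end(spec.body);
}
```

If this works at all, the script would presumably need to be sent after `init()` and before `url()`, so that the handler is in place before the page starts loading. Overwriting `onResourceRequested` may also interfere with whatever handler the driver itself installs, so treat it as an experiment.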

narainsagar
  • I don't see that you have indicated here why the `onResourceRequested` event handler did not work. It looks like the `cancel()` method should prevent resources from loading without any further configuration of PhantomJS. – halfer Feb 21 '16 at 21:00
  • What object type is `page`? I can't see that you have mentioned it elsewhere in your code, perhaps you have to do something to get the PhantomJS instance to use it? – halfer Feb 21 '16 at 21:02
  • Thanks for the reply. I've tried with just `.abort()` and also with a simple `console.log('Hello World')`; nothing happens. – narainsagar Feb 21 '16 at 21:04
  • `page` comes from PhantomJS's webpage module, i.e. `page = require('webpage')`, as described in the PhantomJS documentation. – narainsagar Feb 21 '16 at 21:05
  • Yes, you're right, but how do I use it in WebdriverIO? I can't find this anywhere in their docs. Thanks. – narainsagar Feb 21 '16 at 21:06
  • Perhaps you could edit this question to indicate how you are initiating a PhantomJS operation, including your `page = require('webpage')` call. – halfer Feb 21 '16 at 21:07
  • I don't understand. What do you mean? – narainsagar Feb 21 '16 at 21:12
  • At present you have set `page.onResourceRequested` but I don't see evidence that `page` is in scope or used anywhere in this project at all. If you are using it, and this object is used to initiate the PhantomJS operation, then it may be helpful to readers to see the way in which this is being constructed and run. – halfer Feb 21 '16 at 21:13
  • First, I'm sorry: my English is not very good, which is why this is a bit confusing. That code is taken from another Stack Overflow post (also mentioned above), which says it should solve the problem, but when I tried it I saw no effect. That is what I am asking in the question, i.e. `How would I trigger/call this func in webdriverIO`. It is also described here: http://phantomjs.org/api/webpage/handler/on-resource-requested.html – narainsagar Feb 21 '16 at 21:23
  • May we see the code you have that contains the `page = require('webpage')`? – halfer Feb 21 '16 at 21:25
  • No, I don't have that, because I don't need it in my code. Requiring `webpage` means the script has to be run via `phantomjs`, like `$ phantomjs script.js`, and as the PhantomJS docs say, `phantom` isn't a Node module (meaning we can't use it directly in Node). That's why I'm using WebDriver, which acts as a bridge and lets PhantomJS be driven from Node. [That is my understanding.] – narainsagar Feb 21 '16 at 21:38
  • What I think is that `page` would be configured in the `desiredCapabilities` object, like what I did above for `--load-images=false` in `phantomjs.cli.args`, with something like `'phantomjs.page = onResponseRequested=callbackFunc'`. I tried that as well, but it doesn't work either. I want to know how this can be done via WebdriverIO. – narainsagar Feb 21 '16 at 21:39
