<canvas class="word-cloud-canvas" id="word-cloud-canvas-1892" height="270" width="320"></canvas>

How to get the URL source of an image from HTML5 canvas using Selenium Python?

I tried to use

driver.execute_script("return arguments[0].toDataURL('image/png');", canvasElement)

But it only returns the binary (?) data of the image. I don't want to save the image; I want to get the URL of the image. Is that possible?
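For context, a canvas that is drawn by script has no source URL of its own; what `toDataURL` returns is a `data:` URL, meaning the image bytes base64-encoded inside the string. Below is a minimal sketch of how that string could be inspected in Python. The helper name `parse_data_url` and the `sample` value are made up for illustration; in practice the string would come from the `execute_script` call above.

```python
import base64


def parse_data_url(data_url):
    """Split a base64 data: URL into (mime_type, raw_bytes).

    Hypothetical helper, not part of Selenium.
    """
    header, _, payload = data_url.partition(',')
    if not header.startswith('data:') or not header.endswith(';base64'):
        raise ValueError('not a base64 data URL')
    mime_type = header[len('data:'):-len(';base64')]
    return mime_type, base64.b64decode(payload)


# Stand-in for the value returned by toDataURL('image/png'):
# only the 8-byte PNG signature, not a complete image.
sample = 'data:image/png;base64,' + base64.b64encode(b'\x89PNG\r\n\x1a\n').decode('ascii')

mime_type, raw = parse_data_url(sample)
print(mime_type, len(raw))
```

If the decoded bytes are all that is needed, nothing has to be written to disk.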

masbro
  • Can you please clarify if other options (besides **selenium**) are acceptable and, if yes, review the solution provided? – Ionut Ticus Jun 20 '17 at 12:32

1 Answer


I faced a similar issue, and the only alternative I could find was to call PhantomJS through subprocess.

Here is the Python code:

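import json
import subprocess

main_url = 'http://example.com'  # placeholder: replace with the page to inspect
# Run the PhantomJS script; it prints a JSON array of requested URLs
output = subprocess.check_output(['phantomjs', 'getResources.js', main_url])
urls = json.loads(output)
for url in urls:
    print(url)  # filter and process URLs here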

and the contents of the JavaScript file:

// getResources.js
// Usage: 
// phantomjs getResources.js your_url

var page = require('webpage').create();
var system = require('system');
var urls = [];

// Record the URL of every resource the page requests.
page.onResourceRequested = function(request, networkRequest) {
    urls.push(request.url);
};

// Once the page has loaded, wait 16 seconds for late requests,
// then print the collected URLs as JSON and exit.
page.onLoadFinished = function(status) {
    setTimeout(function() {
        console.log(JSON.stringify(urls));
        phantom.exit();
    }, 16000);
};

// Swallow resource and script errors.
page.onResourceError = function() {
    return false;
};
page.onError = function() {
    return false;
};

page.open(system.args[1]);
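As an illustration of the "filter and process URLs" step on the Python side, here is a rough sketch that keeps only URLs that look like images. The extension list, the `image_urls` helper and the sample URLs are assumptions for the example; matching on the file extension is only a heuristic and will miss images served without one.

```python
from urllib.parse import urlparse

# Assumed set of extensions; adjust to the site being inspected.
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp')


def image_urls(urls):
    """Keep only URLs whose path ends with a common image extension."""
    return [u for u in urls
            if urlparse(u).path.lower().endswith(IMAGE_EXTENSIONS)]


# Made-up stand-in for the list decoded from the PhantomJS output.
sample_urls = [
    'http://example.com/index.html',
    'http://example.com/img/cloud.PNG?v=2',
    'http://example.com/app.js',
]

filtered = image_urls(sample_urls)
print(filtered)
```

Parsing the URL first means a query string such as `?v=2` does not hide the extension.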

PhantomJS supports various other options as well; for example, to change the user agent you can use something like this:

page.settings.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) ...';

This is a simplified version of this answer which I used for my issue.

Ionut Ticus