Taking a whole page screenshot with Selenium Marionette in Python

Question

After the recent Firefox upgrade to version 47, we were forced to install the Marionette extension to keep using Selenium WebDriver. In my case this also meant upgrading Selenium from 2.52 to 2.53.

I use the Python version of Selenium WebDriver to acquire high-resolution images of maps rendered in HTML and JavaScript. Previously this worked fine in Firefox, and screenshots covered the whole page, far beyond the dimensions of my own screen. Since the recent changes, however, the screenshot covers only the area visible on screen. I use the following code:

import time
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

caps = DesiredCapabilities.FIREFOX
caps["marionette"] = True

browser = webdriver.Firefox(capabilities=caps)
browser.get(html_file)
time.sleep(15)

browser.save_screenshot(image_name)
browser.quit()

I have already considered downgrading, stitching together several screenshots, or switching to QGIS. However, I would prefer a more elegant solution that lets me keep using the latest version of Firefox and roughly the same methodology. Does anyone know a solution to this? Perhaps by tricking Selenium into thinking the viewport is larger, or by using another Linux-supported browser that does allow full-page screenshots?
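
On the idea of making the viewport larger: below is a minimal sketch, not a confirmed fix for Firefox 47. It asks the page for its scroll dimensions and resizes the browser window to match, so an ordinary `save_screenshot` has more to capture. `execute_script` and `set_window_size` are standard WebDriver methods; the helper name `resize_to_page` and the parameters `extra_height` and `max_side` are made up for illustration. The approach assumes the browser actually honours a window larger than the physical screen, which tends to hold only in headless mode.

```python
def resize_to_page(driver, extra_height=100, max_side=16000):
    """Resize the browser window to roughly the size of the loaded page.

    extra_height (assumed value) compensates for browser chrome, because
    set_window_size sets the outer window size, not the viewport size.
    max_side (assumed value) is an arbitrary safety cap on both dimensions.
    """
    width, height = driver.execute_script(
        "var d = document.documentElement, b = document.body;"
        "return [Math.max(d.scrollWidth, b.scrollWidth),"
        "        Math.max(d.scrollHeight, b.scrollHeight)];"
    )
    width = min(int(width), max_side)
    height = min(int(height) + extra_height, max_side)
    driver.set_window_size(width, height)
    return width, height
```

Calling `resize_to_page(browser)` after `browser.get(html_file)` and before `browser.save_screenshot(image_name)` would be the intended usage. Whether the result covers the whole map depends on the browser and window manager.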

I think this is related: http://stackoverflow.com/questions/34607359/how-can-i-screenshot-the-full-height-of-a-mobile-form-factor. — alecxe, Jun 19 '16 at 12:38
Thanks. Some of the solutions given in the thread will still need the old Firefox versions though. Or they use the pan zoom method. For now I decided to switch back to Firefox 45 (extended support). — Stef Verdonk, Jun 21 '16 at 12:51
Looks like this feature was added to [python driver](https://github.com/SeleniumHQ/selenium/pull/7182/files) — Murali KG, Dec 15 '20 at 08:38
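
To illustrate the last comment: Selenium 4's Python bindings expose `save_full_page_screenshot` on the Firefox driver. The sketch below uses it when available and falls back to the ordinary viewport screenshot otherwise. The helper name `save_page_screenshot` and its return values are illustrative only.

```python
def save_page_screenshot(driver, path):
    """Save a full-page screenshot if the driver offers one.

    Returns "full" when the Firefox-only full-page method was used and
    "viewport" when falling back to the regular save_screenshot.
    """
    full_page = getattr(driver, "save_full_page_screenshot", None)
    if callable(full_page):
        full_page(path)
        return "full"
    driver.save_screenshot(path)
    return "viewport"
```

With a recent Selenium and geckodriver, `save_page_screenshot(browser, image_name)` could replace the `browser.save_screenshot(image_name)` call in the question.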

Martin Krung · Answer 1 · 2017-11-28T11:00:17.900

3

This is what I use, just stitch it:

#!/usr/bin/python
from selenium import webdriver
from PIL import Image
from cStringIO import StringIO

verbose = 1

browser = webdriver.Firefox()
browser.get('http://stackoverflow.com/questions/37906704/taking-a-whole-page-screenshot-with-selenium-marionette-in-python')

# from here http://stackoverflow.com/questions/1145850/how-to-get-height-of-entire-document-with-javascript
js = 'return Math.max( document.body.scrollHeight, document.body.offsetHeight,  document.documentElement.clientHeight,  document.documentElement.scrollHeight,  document.documentElement.offsetHeight);'

scrollheight = browser.execute_script(js)

if verbose > 0: 
    print scrollheight

slices = []
offset = 0
while offset < scrollheight:
    if verbose > 0: 
        print offset

    browser.execute_script("window.scrollTo(0, %s);" % offset)
    img = Image.open(StringIO(browser.get_screenshot_as_png()))
    offset += img.size[1]
    slices.append(img)

    if verbose > 0:
        browser.get_screenshot_as_file('%s/screen_%s.png' % ('/tmp', offset))
        print scrollheight


screenshot = Image.new('RGB', (slices[0].size[0], scrollheight))
offset = 0
for img in slices:
    screenshot.paste(img, (0, offset))
    offset += img.size[1]

screenshot.save('/tmp/test.png')

The code is also available here: https://gist.github.com/fabtho/13e4a2e7cfbfde671b8fa81bbe9359fb

The problem with scrolling and stitching is that HTML nodes set to "position: fixed" repeat in every shot you take.
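
For Python 3, where `cStringIO` no longer exists, the same scroll-and-stitch idea could be sketched as below. This is an untested adaptation, not the answer author's code. It uses `io.BytesIO` and also clamps the last scroll position, since the browser cannot scroll past the bottom of the page and the final slice would otherwise be pasted at the wrong height. It assumes Pillow is installed and a device pixel ratio of 1 (one CSS pixel per screenshot pixel). The names `slice_offsets` and `stitch_full_page` are made up.

```python
import io


def slice_offsets(total_height, viewport_height):
    """Return scroll positions that cover the page from top to bottom.

    The last position is clamped to total_height - viewport_height, which
    is as far as the browser will scroll anyway.
    """
    if viewport_height <= 0:
        raise ValueError("viewport_height must be positive")
    last = max(0, total_height - viewport_height)
    offsets = list(range(0, last, viewport_height))
    offsets.append(last)
    return offsets


def stitch_full_page(driver, path):
    """Scroll through the page and paste viewport screenshots into one image."""
    from PIL import Image  # Pillow; imported here so slice_offsets has no dependencies

    total = int(driver.execute_script(
        "return Math.max(document.body.scrollHeight,"
        " document.documentElement.scrollHeight,"
        " document.documentElement.clientHeight);"
    ))
    viewport = int(driver.execute_script(
        "return document.documentElement.clientHeight;"
    ))

    canvas = None
    for y in slice_offsets(total, viewport):
        driver.execute_script("window.scrollTo(0, arguments[0]);", y)
        img = Image.open(io.BytesIO(driver.get_screenshot_as_png()))
        if canvas is None:
            # Size the canvas from the first slice (assumes pixel ratio 1).
            canvas = Image.new("RGB", (img.size[0], max(total, img.size[1])))
        # Overlapping slices simply overwrite each other with identical pixels.
        canvas.paste(img, (0, y))
    canvas.save(path)
```

Fixed elements still repeat in every slice with this version. It only addresses the Python 3 port and the position of the last slice.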

edited Nov 28 '17 at 11:00

answered Mar 01 '17 at 12:20

Martin Krung


For headers and footers, why not get the size of the element, scroll by the height of the viewport minus the height of the fixed element, then crop each image from the second one onward by about the size of the repeating element, and finally stitch them together? Thinking about it, I should correct the "just": it's not hard, but you have to write a considerable amount of code. – dmb Mar 09 '18 at 15:27
And how should I find the height of the fixed elements? Not only headers and footers can be fixed! So which height should I use? A possible fix is to set all fixed elements to "display: none" so that they no longer repeat, but the page will look quite different. – Martin Krung Jun 21 '18 at 11:35
@FabianThommen - It seems cStringIO is not supported nowadays. Can you please update your answer? – Helping Hands Jan 29 '20 at 12:48
@HelpingHands Sorry, no time to do this and test it too. But you are welcome to edit my original answer, some help here: https://stackoverflow.com/questions/11914472/stringio-in-python3 – Martin Krung Jan 29 '20 at 15:06
Tried it, but it does not work on Confluence pages – user3754136 Oct 17 '22 at 20:12
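
The "display: none" idea from the comments above could look roughly like this. It is a sketch only: the helper name `hide_fixed_elements` is made up, and hiding elements changes how the page looks, as the comment already warns. It walks every element, checks the computed `position`, and hides the fixed ones before any slices are taken.

```python
HIDE_FIXED_JS = """
var hidden = 0;
var nodes = document.querySelectorAll('*');
for (var i = 0; i < nodes.length; i++) {
    if (window.getComputedStyle(nodes[i]).position === 'fixed') {
        nodes[i].style.setProperty('display', 'none', 'important');
        hidden++;
    }
}
return hidden;
"""


def hide_fixed_elements(driver):
    """Hide every element whose computed position is 'fixed'.

    Returns the number of elements that were hidden.
    """
    return int(driver.execute_script(HIDE_FIXED_JS))
```

A variant that hides the fixed elements only after the first slice would keep a header visible once at the top of the stitched image.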

score 3 · Answer 2 · answered Mar 28 '18 at 13:20

I got good results with this. It runs headless, but normal mode will probably give the same result.

from selenium import webdriver

firefox_options = webdriver.FirefoxOptions()
firefox_options.set_headless() 

firefox_driver = webdriver.Firefox(executable_path=<path_to_gecko_driver>, firefox_options=firefox_options)
firefox_driver.get(<some_url>)

firefox_elem = firefox_driver.find_element_by_tag_name('html')
firefox_elem.screenshot(<png_screenshot_file_path>)
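
With Selenium 4, `set_headless`, `firefox_options=` and `find_element_by_tag_name` are no longer available, so the same approach might be written as below. This is a sketch assuming geckodriver can be found without an explicit path; the helper names `make_headless_firefox` and `screenshot_root_element` are made up.

```python
def make_headless_firefox():
    """Start Firefox in headless mode (assumes geckodriver is discoverable)."""
    from selenium import webdriver  # imported lazily; only needed for a real browser

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    return webdriver.Firefox(options=options)


def screenshot_root_element(driver, path):
    """Take an element screenshot of the <html> root and save it to path.

    "tag name" is the locator strategy string behind By.TAG_NAME.
    Returns whatever WebElement.screenshot returns (True on success).
    """
    root = driver.find_element("tag name", "html")
    return root.screenshot(path)
```

Usage would be `driver = make_headless_firefox()`, then `driver.get(...)` with your URL, then `screenshot_root_element(driver, ...)` with the output path, followed by `driver.quit()`.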

Pavel Cisar


Without the `headless` mode, it does not work as expected. See this post: https://stackoverflow.com/a/57338909/2943191. – Klaidonis Aug 03 '19 at 14:00
