I would like to take a screenshot of a full web page with Selenium in Python. To do that I used this answer.

But now I would like to divide the screenshot and limit the height of the resulting screenshot to 30 000px. For example, if a webpage height is 40 000px, I would like a first 30 000px screenshot and then a 10 000 px screenshot.

The solution should not be to save the full page and then crop the image, because I need it to work for very long webpages. It is not possible to take a screenshot that is 120 000 px high; you get this kind of error:

    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: [Exception... "Failure"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://marionette/content/capture.js :: capture.canvas :: line 154"  data: no]

I tried this but it does not work at all:

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from time import sleep

def save_screenshot(d, path):
    max_size = 30000
    original_size = d.get_window_size()
    required_width = d.execute_script('return document.body.parentNode.scrollWidth')+74
    required_height = d.execute_script('return document.body.parentNode.scrollHeight')+74
    if required_height <= max_size:
        d.set_window_size(required_width, required_height)
        d.find_element_by_tag_name('body').screenshot(path)  # avoids scrollbar
    else :
        for i in range(0,int(required_height/max_size)):
            d.set_window_position(0,max_size*i)
            d.set_window_size(required_width, max_size)
            d.find_element_by_tag_name('body').screenshot(path.replace(".png",str(i+1)+".png"))
        d.set_window_position(0,max_size*int(required_height/max_size))
        d.set_window_size(required_width, required_height%max_size)
        d.find_element_by_tag_name('body').screenshot(path.replace(".png",str(int(required_height/max_size)+1)+".png"))
        d.set_window_position(0,0)
    d.set_window_size(original_size['width'], original_size['height'])

if __name__ == '__main__':
    options = Options()
    options.headless = True
    driver = webdriver.Firefox(options=options,executable_path=r"C:/Users/path/geckodriver.exe")
    driver.get("http://www.lwb.com/")
    sleep(3)
    save_screenshot(driver,"C:/Users/path/test.png")
    driver.close()

Can someone help me here please ?

Thank you,

Mylha

mylha
  • In Chrome you can use `screenshot()` to save only the selected tag to a file. In Firefox it doesn't work and it always saves the full page. – furas Apr 27 '19 at 21:36

1 Answer

You can crop the saved screenshot using Pillow (`pip install pillow`):

def crop_image(img, img_limit):
    img_width, img_height = img.size
    crop_dim = (0, 0, img_width, img_limit) # left top right bottom
    cropped_img = img.crop(crop_dim)
    return cropped_img

After calling save_screenshot(driver, "path/to/img"), do the following:

from PIL import Image

img_limit = 30000 # your image size limit
img = Image.open("path/to/img")
img = crop_image(img, img_limit)
img.save("path/to/img")

If you don't want to save the image before you manipulate it, you can use get_screenshot_as_png, which returns the PNG as binary data instead of saving it:

from PIL import Image
from io import BytesIO

img_limit = 30000 # your image size limit
img_binary = driver.get_screenshot_as_png()
img = Image.open(BytesIO(img_binary))
img = crop_image(img, img_limit)
img.save("path/to/save")

Make sure to run del img and del img_binary when you're done, to release the image data from memory.

To take one screenshot of the entire page, do this:

from selenium import webdriver

DRIVER_PATH = "path/to/chrome/driver"
URL = "site.url"

options = webdriver.ChromeOptions()
options.add_argument("headless")
driver = webdriver.Chrome(executable_path=DRIVER_PATH, options=options)

# setting a long window height to take one full screenshot
driver.set_window_size(1920, 90000) # width , height
driver.get(URL)
driver.maximize_window()

img_binary = driver.get_screenshot_as_png()

PS: if you use this method to take the screenshot, you won't need Pillow. Simply use set_window_size to set the height and width of the window you want, which gives you a screenshot of that same size.

Mohamed Benkedadra
  • Thank you for your answer. But this is exactly what I don't want, because when the web page is very long (for example 120 000 px), I cannot do save_screenshot in the first place... I described the error I would get in this case in my initial post. – mylha Apr 27 '19 at 21:07
  • @mylha check my answer , i have updated it with what you need – Mohamed Benkedadra Apr 27 '19 at 21:15
  • That's helpful but still the second screenshot is not working. I posted my new function in an answer if you have any idea. – mylha Apr 28 '19 at 14:23
  • @mylha what do you mean by the first and second screenshots ? – Mohamed Benkedadra Apr 28 '19 at 14:32
  • The webpage http://www.lwb.com/ is 30093 pixels long, I would like 2 screenshots : a first screenshot of 30 000 pixels long and a second one of 93 pixels long. This one is just an example : I need it for a webpage of 120 000 pixels for example, and I would get 4 screenshots for the whole page, since I cannot get the whole page with one screenshot. – mylha Apr 28 '19 at 14:41
  • @mylha you shouldn't be taking multiple screenshots, look at my answer again, i've updated it with a way to take one screenshot of the entire page ! – Mohamed Benkedadra Apr 28 '19 at 15:00
  • Thank you. So I cannot get a screenshot of a whole webpage when it is more than 40 000 px high with Firefox ? Is it working only with Chrome ? – mylha Apr 28 '19 at 15:48
  • why can't you ? ... just change the driver to firefox, i'm using chrome in this example since i already had it installed. the solution here has nothing to do with the driver itself: it is basically setting the height of the window manually, in headless mode, to as long a height as you need in order to take your screenshot – Mohamed Benkedadra Apr 28 '19 at 15:51
  • With Firefox, it is not working with the webpage I need, which is 111 000 pixels long. With your last proposal, either I keep maximize_window and I get no error but the screenshot is not full (only the visible portion), or I remove maximize_window and I get this error : raise exception_class(message, screen, stacktrace) selenium.common.exceptions.WebDriverException: Message: [Exception... "Failure" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: chrome://marionette/content/capture.js :: capture.canvas :: line 154" data: no]. – mylha Apr 28 '19 at 16:40
  • It works for webpages from 0 to 40 000 pixels long, but not for longer webpages. That's why I wanted to do several screenshots for long webpages. I did not try it with Chrome since I need it with Firefox. – mylha Apr 28 '19 at 16:40
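
For reference, the split asked for in the question and comments (a 30 093 px page becoming a 30 000 px image plus a 93 px image) is plain arithmetic. Below is a minimal sketch of that part only. `split_height` and `numbered_path` are hypothetical helper names, not part of Selenium, and the 30 000 px default is just the limit taken from the question.

```python
from typing import List, Tuple


def split_height(total_height: int, max_height: int = 30000) -> List[Tuple[int, int]]:
    """Return (top_offset, segment_height) pairs that together cover total_height."""
    if total_height <= 0 or max_height <= 0:
        raise ValueError("heights must be positive")
    segments = []
    top = 0
    while top < total_height:
        # The last segment is whatever remains, never more than max_height.
        height = min(max_height, total_height - top)
        segments.append((top, height))
        top += height
    return segments


def numbered_path(path: str, index: int) -> str:
    """Insert an index before the file extension, e.g. test.png -> test1.png."""
    stem, dot, ext = path.rpartition(".")
    return f"{stem}{index}{dot}{ext}" if dot else f"{path}{index}"


if __name__ == "__main__":
    # Page height from the comments: http://www.lwb.com/ was reported as 30093 px.
    for i, (top, height) in enumerate(split_height(30093), start=1):
        print(numbered_path("test.png", i), top, height)
```

Unlike the loop in the question, this does not produce a zero-height final segment when the page height is an exact multiple of the limit. Getting Firefox to capture each (top, height) segment is a separate problem that this sketch does not address.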