
I need to take a screenshot of an entire web page. The important part is that the screenshot must include all of the page's content, including the part that doesn't fit on the screen.

The page contains multiple rows of data, and because the data is long, it has a scroll bar. The number of rows varies each time, and the screenshot should adapt accordingly.

For long web pages that scroll, this task is trivial. But how can it be accomplished when the data is long and part of it is hidden behind the scroll bar?

I would like to accomplish this using Python. I am using the code below to capture the screenshot of the web page.

import os
import time

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('headless')
# Chrome expects the window size as "width,height" (comma, not "x")
options.add_argument('window-size=1440,1440')
driver = webdriver.Chrome(executable_path=os.path.abspath('C:/Program Files (x86)/Python36/selenium/chromedriver/build/scripts-3.6/chromedriver.exe'), chrome_options=options)
driver.get("https://www.test.com")  # updated as a random test URL
time.sleep(60)  # wait for the page to finish loading
driver.save_screenshot('C:/Users/Dev/Desktop/Maxx/Snapshots/test.png')
driver.quit()  # the parentheses are required, otherwise the browser is never closed
print("captured snapshot")

This is how the data looks in the browser, with the scroll bar:

[image: screenshot of the data with a scroll bar]

sdgd
  • Just keep scrolling. Don't forget Selenium is just a browser simulator: `driver.execute_script("window.scrollTo(0,{0})".format(scrollHeight))` – user1767754 Dec 05 '17 at 08:00
  • The screenshot attached to the question shows only part of the web page. The first half of the page contains some graphs, and the lower half contains the table with the data related to the graphs. – sdgd Dec 05 '17 at 11:37
  • @user1767754 I tried the command you mentioned, but I still see the scroll bar in the snapshot rather than the complete list of data. – sdgd Dec 05 '17 at 11:37
  • That's probably the least of your problems... you could either override the CSS or just use OpenCV to crop that part. – user1767754 Dec 05 '17 at 16:45
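
The "override the CSS" suggestion in the last comment could look roughly like the sketch below. It is only an illustration under assumptions: `expand_scrollable` and `EXPAND_JS` are hypothetical names, the selector passed in is a placeholder for whatever element wraps the table on the real page, and whether lifting these three style properties is enough depends on how that page is built. The only Selenium call used is `driver.execute_script`.

```python
# JavaScript that lifts the height and overflow limits on every element
# matching a CSS selector, so its content is laid out at full size.
EXPAND_JS = """
const nodes = document.querySelectorAll(arguments[0]);
nodes.forEach(function (el) {
    el.style.overflow = 'visible';
    el.style.height = 'auto';
    el.style.maxHeight = 'none';
});
return nodes.length;
"""


def expand_scrollable(driver, css_selector):
    """Remove scroll limits from matching elements; return how many matched."""
    return driver.execute_script(EXPAND_JS, css_selector)
```

It would be called after the page has loaded and before the screenshot is taken, for example `expand_scrollable(driver, "div.table-wrapper")`, where `div.table-wrapper` is a made-up selector. A return value of 0 means the selector matched nothing.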

1 Answer

import time  # needed for time.sleep() in scroll_down
from io import BytesIO

from PIL import Image

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def open_url(url):
    options = Options()

    options.headless = True

    # `options=` replaces the deprecated `chrome_options=` keyword
    driver = webdriver.Chrome(options=options)

    try:
        driver.maximize_window()
        driver.get(url)
        save_screenshot(driver, 'screen.png')
    finally:
        driver.quit()  # always release the browser process

def save_screenshot(driver, file_name):
    height, width = scroll_down(driver)
    driver.set_window_size(width, height)
    img_binary = driver.get_screenshot_as_png()
    img = Image.open(BytesIO(img_binary))
    img.save(file_name)
    # print(file_name)
    print(" screenshot saved ")


def scroll_down(driver):
    total_width = driver.execute_script("return document.body.offsetWidth")
    total_height = driver.execute_script("return document.body.parentNode.scrollHeight")
    viewport_width = driver.execute_script("return document.body.clientWidth")
    viewport_height = driver.execute_script("return window.innerHeight")

    rectangles = []

    i = 0
    while i < total_height:
        ii = 0
        top_height = i + viewport_height

        if top_height > total_height:
            top_height = total_height

        while ii < total_width:
            top_width = ii + viewport_width

            if top_width > total_width:
                top_width = total_width

            rectangles.append((ii, i, top_width, top_height))

            ii = ii + viewport_width

        i = i + viewport_height

    previous = None
    part = 0

    for rectangle in rectangles:
        if previous is not None:
            driver.execute_script("window.scrollTo({0}, {1})".format(rectangle[0], rectangle[1]))
            time.sleep(0.5)  # let the page settle after scrolling

        if rectangle[1] + viewport_height > total_height:
            offset = (rectangle[0], total_height - viewport_height)
        else:
            offset = (rectangle[0], rectangle[1])

        previous = rectangle

    return (total_height, total_width)

open_url("https://www.medium.com")

The scroll_down function scrolls to the bottom of the page and returns the total height and width of the web page.

The save_screenshot function sets the window size to those dimensions and saves the screenshot using Pillow.
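
The grid of viewport-sized rectangles that scroll_down steps through can be worked out without a browser, which makes the scrolling logic easier to check. Below is a minimal sketch; `tile_rectangles` is a hypothetical helper name rather than part of the code above, and it assumes all four arguments are positive integers in pixels.

```python
def tile_rectangles(total_width, total_height, viewport_width, viewport_height):
    """Return (left, top, right, bottom) tiles that together cover the page."""
    rectangles = []
    for top in range(0, total_height, viewport_height):
        # Clamp the last row so it does not run past the bottom of the page
        bottom = min(top + viewport_height, total_height)
        for left in range(0, total_width, viewport_width):
            # Clamp the last column so it does not run past the right edge
            right = min(left + viewport_width, total_width)
            rectangles.append((left, top, right, bottom))
    return rectangles


# A 1000 x 2500 page seen through a 1000 x 1000 viewport needs three tiles
print(tile_rectangles(1000, 2500, 1000, 1000))
# [(0, 0, 1000, 1000), (0, 1000, 1000, 2000), (0, 2000, 1000, 2500)]
```

Each tile's first two values are the coordinates to pass to `window.scrollTo`, which is how the loop in scroll_down uses them.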

abhay kumar
  • I guess it will work for a page, scrolling down to get the total height and width; for an iframe, it needs to be changed accordingly. If you can provide an example of the kind of page you are trying to screenshot, that might help in giving a correct explanation. – abhay kumar Feb 05 '20 at 17:50
  • Sure. If you visit this page: [sampleIframe](https://www.dyn-web.com/tutorials/iframes/basics/demo.php), you will see an iframe with a scroll bar. I have tried a lot of code, but none of it can take a full screenshot of the iframe by scrolling it. Every attempt produces a screenshot like this: [screenshot](https://i.stack.imgur.com/IkodE.png) – Helping Hands Feb 05 '20 at 17:53
  • Great solution. Do you know how I can use this code for a password-protected website? – msr_003 Mar 16 '20 at 16:06
  • Yes, you can log in to the website first and then proceed as above. I will add a snippet later for taking a screenshot of a password-protected site. – abhay kumar Mar 17 '20 at 06:46