1

I have a webpage that I need to take the screen shot first and then use OCR to parse out the texts inside. The performance of OCR could be dramatically improved if I zoom in(Mac: command + '='). So I am wondering how could I zoom in/out using selenium in Python.

There is a similar post but they only have the implementations in Java and C#, but the goal is the same as mine.

Zoom in/out in selenium is just one of my thoughts. To improve the performance. I know there might be several ways to implement. Below are just my thoughts and I never successfully implement them. If you could prove them work and change the font size, I would also accept as answer.

  1. Maybe change the settings of the browser and then save as a Chrome profile, so next time, I could just call the profile and the 'ZOOM' settings should be preserved throughout the whole process without touching anything. However, seems like python selenium package doesn't support load chrome profile, however, it could load firefox profile. Link

  2. Maybe take the screen shot as a vector image, so use PIL etc. to amplify the font size separately.

...

Thanks a lot for the post and the a sample code to get you started!

#!/usr/bin/python

from selenium import webdriver

def main():
    browser = webdriver.Chrome()  # Sorry, I have to use Chrome, [chromedriver][3] is required
    browser.set_window_size(1000, 1000)    
    browser.get("https://stackoverflow.com/users/1953475/b-mr-w")

    # Fill in Your Magic Here to Make the Font Size Big!

    browser.get_screenshot_as_file('/tmp/screenshot.png')

if __name__ == '__main__':
    main()
Community
  • 1
  • 1
B.Mr.W.
  • 18,910
  • 35
  • 114
  • 178

1 Answers1

3

You can zoom the content div via execute_script("$('#content').css('zoom', 5);"):

#!/usr/bin/python

from selenium import webdriver
import time


def main():
    browser = webdriver.Chrome()
    browser.set_window_size(1000, 1000)
    browser.get("http://stackoverflow.com/users/1953475/b-mr-w")

    browser.execute_script("$('#content').css('zoom', 5);")
    time.sleep(5)

    browser.get_screenshot_as_file('screenshot.png')

if __name__ == '__main__':
    main()

But, there is a problem with zooming: get_screenshot_as_file won't show you the whole page - it'll make an image from what it sees (with scrolls).

Why do you need an OCR here? What about getting the text using html2text module?

Hope that helps.

alecxe
  • 462,703
  • 120
  • 1,088
  • 1,195
  • The web that I am working on is pretty anti-scraping. Correct me if I was wrong but the whole web is built by javascript. When you were on the page, you 'INSPECT ELEMENT', you could see the source code of each element, which is perfect. However, if you do print browser.page_source, it only has a few lines of javascript function. That is why I am using OCR. I have never used the HTML2TEXT module before, and I would assume it only works for the web where you could get the complete html right? – B.Mr.W. Aug 18 '13 at 22:03
  • Ok, selenium usually helps to deal with this js-driven web. What url are you trying to scrape from and what kind of data? – alecxe Aug 18 '13 at 22:08
  • sorry that I cannot give those information, however your solution is very inspiring and I would accept yours as answer if no one else give a better solution. Could you explain a little bit the the browser.execute_script part.... seems like you could control the css layout of specific elements... – B.Mr.W. Aug 18 '13 at 22:38
  • Thanks. Yup, selenium is pretty powerful and you can execute arbitrary js code using `execute_script`. I'm still not sure, why just selenium cannot help with extracting the data from a page - it's a real web browser and you can get from the page everything you see.. – alecxe Aug 18 '13 at 22:42
  • execute customized javascript in selenium is very interesting. When I open up the page in Chrome, using JavaScript Console, if I typed in document in the console, I could see the complete populated html by javascript. Is it possible to use selenium to store those data, maybe to a flat file. I have no problem parsing it afterwards. Something like result=browser.execute_script("document") print result... thanks – B.Mr.W. Aug 19 '13 at 17:40