EDIT

Ok, so here is my whole code so far:

from selenium import webdriver
from bs4 import BeautifulSoup as bs
import requests
import time
import os
import Tkinter as tk




def get_page():
    global driver
    driver = webdriver.Chrome()
    driver.get(url)
    last_height = driver.execute_script('return document.body.scrollHeight')
    while True:
        driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')
        new_height = driver.execute_script('return document.body.scrollHeight')
        if new_height == last_height:
            break
        else:
            last_height = new_height


# This function uses BeautifulSoup to parse the page source and find images.
def get_img():
    sp = bs(driver.page_source, 'html.parser')
    for image in sp.find_all('img'):
        images.append(image)


# Create the folder which will contain the downloaded images.
def make_dir():
    if not os.path.exists('Downloaded images'):
        os.mkdir('Downloaded images')
    os.chdir('Downloaded images')


# Function which saves images.
def save_img():
    x = 0
    for image in images:
        try:
            url = image['src']
            source = requests.get(url)
            with open('img-{}.jpg'.format(x), 'wb') as f:
                # Reuse the response instead of downloading the image twice
                f.write(source.content)
            x += 1
        except:
            print 'Error while saving image.'

root = tk.Tk()
root.title('Image Scraper 1.0')
tk.Label(root, text = 'Enter URL:').grid(row=0)
e1 = tk.Entry(root)
e1.grid(row=0, column=1)
e1.insert(driver.get(url))
button1 = tk.Button(root, text = 'SCRAPE', command =scrape_site).grid(row=3, column=1, sticky=tk.W, pady=4)
button1.pack()

root.mainloop()

I tried to put the whole scrape_site function in tkinter's button command=, which is stupid, I see that now, and obviously it doesn't work. As you can see, I copied the whole tkinter code into the main scraper file. Any thoughts? I will appreciate any input :)


I recently posted a question about a web scraper that downloaded images of cats. This time I decided to take another step forward: I want to make a GUI web scraper that downloads images from a website whose URL the user enters in a tkinter Entry widget. Is this even possible? I have also created two .py files: one for the scraper script and a second one for the GUI. Can it be kept this way, or should it be one file? Here is the scraper code, which opens and scrolls the page (using selenium); it works fine. My only question is: how do I put it in tkinter? :)

def get_page():
    global driver
    driver = webdriver.Chrome()
    driver.get(url)
    last_height = driver.execute_script('return document.body.scrollHeight')
    while True:
        driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')
        new_height = driver.execute_script('return document.body.scrollHeight')
        if new_height == last_height:
            break
        else:
            last_height = new_height
get_page()
piotrulu
  • You haven't shared any tkinter code. As for how to 'put it in to tkinter' you could change your get_page function to accept `url` as an argument. The tkinter code can then just call this function and pass the URL from the entry widget. – scotty3785 Mar 19 '18 at 11:39

1 Answer

As mentioned in my comment, you should modify get_page to take url as an argument. The simple example below shows how this might work but with the get_page function replaced (I don't have selenium).

try:
    import tkinter as tk  # Python 3
except ImportError:
    import Tkinter as tk  # Python 2

def get_page(url):
    print("Getting cats from {}".format(url))

class App(tk.Frame):
    def __init__(self,master=None,**kw):
        tk.Frame.__init__(self,master=master,**kw)
        self.txtURL = tk.StringVar()
        self.entryURL = tk.Entry(self,textvariable=self.txtURL)
        self.entryURL.grid(row=0,column=0)
        self.btnGet = tk.Button(self,text="Get Some Cats!",command=self.getCats)
        self.btnGet.grid(row=0,column=1)

    def getCats(self):
        get_page(self.txtURL.get())


if __name__ == '__main__':
    root = tk.Tk()
    App(root).grid()
    root.mainloop()

You can enter the URL into the Entry widget, press the button, and the URL is sent to the function.

If your get_page function is in a separate file, just import it using `from my_other_file import get_page`, where `my_other_file` is the name of the Python file (without the .py extension) that contains the get_page function.
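For reference, the Selenium version of `get_page` from the question could be adapted to take the URL as an argument roughly as follows. This is a minimal sketch, not tested against a real browser here: the `driver`, `pause` and `max_scrolls` parameters are illustrative additions that do not appear in the original code, and the pause between scrolls rests on the assumption that the page loads more content lazily.

```python
import time


def get_page(url, driver=None, pause=1.0, max_scrolls=100):
    """Open url and scroll down until the page height stops growing.

    driver, pause and max_scrolls are hypothetical parameters added for
    this sketch; the original function used a global driver instead.
    """
    if driver is None:
        # Imported here so the rest of the module loads without Selenium
        from selenium import webdriver
        driver = webdriver.Chrome()

    driver.get(url)
    last_height = driver.execute_script('return document.body.scrollHeight')
    for _ in range(max_scrolls):
        driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')
        # Assumption: lazily loaded content needs a moment to arrive
        time.sleep(pause)
        new_height = driver.execute_script('return document.body.scrollHeight')
        if new_height == last_height:
            break
        last_height = new_height
    return driver
```

Returning the driver lets the caller read `driver.page_source` afterwards without relying on a global, so the button callback could call `get_page(self.txtURL.get())` and pass the result on to the parsing step.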

scotty3785
  • scotty3785, may I ask what the last `if __name__ == '__main__':` does? I'm seeing it for the second time in my life and I would like to know what it is :P – piotrulu Mar 19 '18 at 22:08
  • There is already an [answer for that](https://stackoverflow.com/questions/419163/what-does-if-name-main-do) – scotty3785 Mar 20 '18 at 09:23
  • Does my response answer your question? If it does you should mark it as answered. – scotty3785 Mar 20 '18 at 09:24
  • Partly yes. Thank you. – piotrulu Mar 20 '18 at 10:21
  • Which part is still unanswered? Happy to update my answer if I can. – scotty3785 Mar 20 '18 at 11:32
  • It's probably more my inability to comprehend and wrap my mind around the whole class thing. I should probably revise this part of Python syntax ;) But I was also wondering whether my whole code could be put into class form? I have now put my scraper function into your version of the tkinter app and it works fine, except that it scrapes and saves only a few images. It's probably because the interpreter loops through the program only once. I don't know how to put it into words without sounding stupid (how I deliver messages about technical issues should also be revised :P ) – piotrulu Mar 20 '18 at 14:03
  • Much of your code doesn't need converting to 'class form'. They are standalone functions. You could convert it to a class and then you could avoid using a global for the `driver` variable but at this point not worth it. – scotty3785 Mar 20 '18 at 14:55
  • I'd suggest adding some print statement to your functions to see why it performs differently when you add the GUI code to it. There is nothing obvious that I can see without having to run it myself. – scotty3785 Mar 20 '18 at 14:57
  • Thanks again mate ;) – piotrulu Mar 21 '18 at 08:53
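In the spirit of the print-statement suggestion above, one thing worth checking is whether every `src` value is actually downloadable. This is only a guess, and nothing in this thread confirms it: `requests.get` rejects relative URLs, an `<img>` tag without a `src` raises a `KeyError`, and the bare `except` in `save_img` hides both behind the same message. A minimal sketch, where `resolve_image_urls` is a hypothetical helper name that is not part of the original code:

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin      # Python 2


def resolve_image_urls(page_url, srcs):
    """Split raw src values into absolute URLs and values that were skipped."""
    usable, skipped = [], []
    for src in srcs:
        # Missing src attributes and inline data: URIs cannot be fetched
        if not src or src.startswith('data:'):
            skipped.append(src)
            continue
        # Relative paths are resolved against the URL of the page
        usable.append(urljoin(page_url, src))
    return usable, skipped
```

Calling it with `[image.get('src') for image in images]` and printing `len(usable)` and `len(skipped)` would show how many images were dropped before any download was attempted.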