10

I have a Python script that uses Selenium WebDriver to loop through some historical baseball odds. The first part of the code is meant to collect all the individual game URLs from the schedule table (around 57 pages that need to be looped through) and store them in a list.

The first time I tested this it worked just fine. Now, for whatever reason, driver.get() does not seem to be working. The webdriver performs the first .get() call in the PageRange loop (page 2), but in the next iteration of the loop it gets stuck and doesn't navigate to page 3. There is no error message and no crash.

Some manual checking with print() indicates that all other parts of the code work as expected. What could be causing this?

EDIT 1: The code actually gets stuck immediately after the first .get() call, not before the second as stated above. I also noticed that .get() works fine later in the code when looping through the game URLs. For some reason it is specifically "http://www.oddsportal.com/baseball/usa/mlb-2017/results/#/page/2/", "http://www.oddsportal.com/baseball/usa/mlb-2017/results/#/page/3/", etc. that it gets stuck on.

import time
from string import digits

import pandas as pd
from selenium import webdriver

season = str(2017)

URL = "http://www.oddsportal.com/baseball/usa/mlb-" + season + "/results/#/"
chrome_path = r"C:\Users\dansl110\Dropbox\Betting Project/chromedriver.exe"

OddsList = pd.DataFrame(columns=["Date", "HomeTeam", "AwayTeam", "HomeOdds", 
"AwayOdds", "Accuracy"])

GameURLs = []
StartURL = 2

#Gets GameURLs and EndPage from Page 1
driver = webdriver.Chrome(chrome_path)
driver.get(URL)
elems = driver.find_elements_by_xpath("//a[@href]")
for elem in elems:
    link = elem.get_attribute("href")
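    if "/results/#/page/" in link:
        # Use only the digits after "/page/"; otherwise the season in "mlb-2017" is included
        EndURL = int(''.join(c for c in link.split("/page/")[-1] if c in digits))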
    elif "/mlb" in link and len(str(link)) > 58 and "results" not in link:
        GameURLs.append(link)

PageRange = range(StartURL, EndURL - 5)

#Gets remaining GameURLs
for page in PageRange:
    oldURL = URL
    # Parentheses let the string concatenation span two lines
    URL = ("http://www.oddsportal.com/baseball/usa/mlb-" + season +
           "/results/#/page/" + str(page) + "/")
    #This .get() works only during the first iteration of the range loop
    driver.get(URL)
    time.sleep(3)
    elems = driver.find_elements_by_xpath("//a[@href]")
    for elem in elems:
        link = elem.get_attribute("href")
        if "/nhl" in link and len(str(link)) > 65 and "results" not in link:
            GameURLs.append(link)
Daniel Slätt
  • Can you include the debugging output of your script? Also, it would be good to see the `URL` for each call to `driver.get()` as well as each `link` you find. – Ian Lesperance Feb 07 '18 at 15:53
  • I have updated the description with the URLs fed into driver.get(). There is no output in the command prompt showing any form of error description. I have put a print("Done") between the driver.get() and time.sleep(3) to see if it gets past this stage, and it doesn't. It navigates to the page fed to the get() call, but doesn't proceed with the code after doing so. – Daniel Slätt Feb 07 '18 at 16:13
  • If you try to `print()` something before and after the first call to `find_elements_by_xpath()`, what do you see? – Ian Lesperance Feb 07 '18 at 16:20
  • The first one works as it should, I can see prints both before and after. For the second one, nothing shows up. The code clearly doesn't get that far in the first place. – Daniel Slätt Feb 07 '18 at 16:33
  • You’ll have to add more debugging to figure out where exactly it’s stalling. E.g., try adding a print at the start of every `for` loop iteration. (You may just be looping over an enormous number of elements.) If you do, it would help if you include the script *with* debugging statements as well as the *full*, raw shell output. – Ian Lesperance Feb 07 '18 at 16:47
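The step-by-step debugging suggested in the comments above could be sketched roughly as follows. This is only an illustration: the `step` helper and the `scrape_debug` logger name are hypothetical and not part of the original script. Each step logs a timestamped START and END line, so a START with no matching END shows where the script stalls.

```python
import logging
import time
from contextlib import contextmanager

# Timestamped output makes it easy to see how long each step blocks.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("scrape_debug")  # hypothetical logger name
log.setLevel(logging.INFO)


@contextmanager
def step(name):
    """Log when a step starts and how long it took, even if it raises."""
    log.info("START %s", name)
    started = time.monotonic()
    try:
        yield
    finally:
        log.info("END   %s (%.2fs)", name, time.monotonic() - started)


# Stand-in for a real call such as driver.get(URL)
with step("example step"):
    time.sleep(0.01)
```

In the script from the question, the same wrapper would go around `driver.get(URL)` and around `driver.find_elements_by_xpath(...)` inside the page loop.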

3 Answers

5

I had this same problem starting today. What I found was that the machines running Chrome version 64.x had an intermittent hanging issue, while the machines running 63.x did not. Go to chrome://settings/help to check which version you have.

If you are running that version, try downloading Chromedriver 2.35 from here: https://sites.google.com/a/chromium.org/chromedriver/downloads

I tried this and it seemed to help a little with the hanging, but it still seems to be occurring.

The only thing that fixed it was going back to build 63.x of Chrome.

Hope it helps you.

EDIT:

I found this thread that will help! Add this to your script before you create the driver:

from selenium import webdriver

ChromeOptions = webdriver.ChromeOptions()
ChromeOptions.add_argument('--disable-browser-side-navigation')
driver = webdriver.Chrome('your/path/to/chromedriver.exe', chrome_options=ChromeOptions)

Once Chrome version 65 comes out, it will be fixed. In the meantime, use the above if you are still on 64.x.
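If the same script runs on machines with different Chrome versions, the flag could be applied only where it is needed. The following is a minimal sketch under that assumption: both helper functions are hypothetical, and the version strings are illustrative inputs in the dotted format shown on chrome://settings/help.

```python
def chrome_major_version(version):
    """Return the major version from a dotted string such as '64.0.3282.140'."""
    return int(version.split(".")[0])


def workaround_arguments(version):
    """Return the extra Chrome arguments to use for the given browser version."""
    # Only version 64 is reported above as hanging, so only it gets the flag.
    if chrome_major_version(version) == 64:
        return ["--disable-browser-side-navigation"]
    return []


# Illustrative version strings, not read from a real browser
for version in ("63.0.3239.132", "64.0.3282.140"):
    print(version, workaround_arguments(version))
```

Each returned argument would then be passed to `ChromeOptions.add_argument()` before the driver is created.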

PixelEinstein
2

Try moving your driver definition into the loop. I had the same issue and this worked for me. It slows the code down a little, but at least it works.
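A rough sketch of that idea, assuming a fresh browser is created and closed for every page. The `collect_links` function and its `make_driver` and `extract_links` parameters are hypothetical names introduced here for illustration:

```python
def collect_links(page_urls, make_driver, extract_links):
    """Visit each URL with a brand-new driver and gather the links found."""
    links = []
    for url in page_urls:
        driver = make_driver()  # fresh browser for this page only
        try:
            driver.get(url)
            links.extend(extract_links(driver))
        finally:
            driver.quit()  # always close the browser, even on errors
    return links
```

With the script from the question, `make_driver` could be `lambda: webdriver.Chrome(chrome_path)` and `extract_links` a function that runs the `find_elements_by_xpath` loop and returns the matching hrefs.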

Stephane
-2

Have you tried using driver = webdriver.Firefox()? I believe it is more reliable and you can even use Selenium IDE.

Setti7
  • Hi! Forgive me for perhaps asking a dumb follow-up question, but don't I need to download the Firefox driver, as I did with the Chrome driver, and then refer to its path in order for it to work? – Daniel Slätt Feb 07 '18 at 15:11
  • I tried just changing that line of code and the error message "can't find the file" was the result. – Daniel Slätt Feb 07 '18 at 15:11
  • You need to download [GeckoDriver](https://github.com/mozilla/geckodriver/releases) and put it into PATH – Setti7 Feb 07 '18 at 15:20
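Whether GeckoDriver is actually reachable through PATH can be checked from Python before creating the driver. This is a minimal sketch using only the standard library; the `find_on_path` helper is a hypothetical name:

```python
import shutil


def find_on_path(executable):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(executable)


location = find_on_path("geckodriver")
if location is None:
    # Likely cause of a "can't find the file" style error from webdriver.Firefox()
    print("geckodriver was not found on PATH")
else:
    print("geckodriver found at", location)
```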