
I've seen solutions to this in other posts (mostly suggesting a longer wait time), but I've tried that without success.

Here's the error I'm getting:

Traceback (most recent call last):
  File "LobbyistsPrep.py", line 126, in <module>
    the_download = get_file(year, report, download_dir)
  File "LobbyistsPrep.py", line 28, in get_file
    Year.select_by_visible_text(year_text)
  File "C:\Python27\lib\site-packages\selenium\webdriver\support\select.py", line 120, in select_by_visible_text
    self._setSelected(opt)
  File "C:\Python27\lib\site-packages\selenium\webdriver\support\select.py", line 212, in _setSelected
    option.click()
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webelement.py", line 80, in click
    self._execute(Command.CLICK_ELEMENT)
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webelement.py", line 501, in _execute
    return self._parent.execute(command, params)
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 308, in execute
    self.error_handler.check_response(response)
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
  (Session info: chrome=65.0.3325.181)
  (Driver info: chromedriver=2.33.506120 (e3e53437346286c0bc2d2dc9aa4915ba81d9023f),platform=Windows NT 6.1.7601 SP1 x86_64)

Here's the relevant code:

def get_file(year_text, category, download_dir):
    # Store a list of files in the Downloads directory.
    # We will use this later to determine the filename of the CSV we downloaded.
    downloads_before = os.listdir( download_dir )

    # Change the Year dropdown
    Year = Select(driver.find_element_by_name('ctl00$ctl00$ContentPlaceHolder$ContentPlaceHolder1$ddYear'))
    Year.select_by_visible_text(year_text)
    time.sleep(30)

    # Change the Expenditure Type dropdown
    Type = Select(driver.find_element_by_name('ctl00$ctl00$ContentPlaceHolder$ContentPlaceHolder1$ddExpType'))
    Type.select_by_visible_text(category)
    time.sleep(30)

    # Change the Report Month dropdown
    Month = Select(driver.find_element_by_name('ctl00$ctl00$ContentPlaceHolder$ContentPlaceHolder1$ddMonth'))
    Month.select_by_visible_text('-- All Available --')
    time.sleep(30)

    # Click the Export to CSV button (downloads the CSV file)
    driver.find_element_by_name('ctl00$ctl00$ContentPlaceHolder$ContentPlaceHolder1$btnExport').click()
    time.sleep(30)

    # Now that we have downloaded the file, let's check the Downloads directory again and compare.
    downloads_after = os.listdir( download_dir )
    downloads_change = set(downloads_after) - set(downloads_before)
    # If there is only one difference, then that file is the one we downloaded.
    if len(downloads_change) == 1:
        file_name = downloads_change.pop()
        file_path = download_dir + file_name
        return file_path
    # Otherwise, something went wrong: Either the number of files changed by MORE than one, or NOTHING was downloaded.
    else:
        return False

driver.get('http://mec.mo.gov/mec/Lobbying/Lob_ExpCSV.aspx')
time.sleep(30)

for report in reports_wanted:
    for year in years_wanted:
        the_download = get_file(year, report, download_dir)
        if the_download:
            if report == 'Group':
                print 'Downloaded ' + the_download + '. Adding to GROUP.  Report:\t' + year + '\t' + report
                group_files.append(the_download)
            else:
                print 'Downloaded ' + the_download + '. Adding to INDIV.  Report:\t' + year + '\t' + report
                files.append(the_download)
        else:
            print 'PROBLEM DOWNLOADING: \t' + year + '\t' + report

Our time.sleep used to be time.sleep(2). I've tried changing it to 30, but that doesn't help either.

I'm still pretty new to debugging scrapers, and this one wasn't built by me, so please be gentle. Thanks in advance.

jayohday
  • The element name seems auto-generated. Are you sure that this name always remains the same? – arshovon Apr 04 '18 at 18:23
  • It looks like you are snagging the element, then the page is loading again and the element is going `stale`. Also, your waits should be placed a few lines up, before this line: `Year = Select(driver.find_element_by_name('ctl00$ctl00$ContentPlaceHolder$ContentPlaceHolder1$ddYear'))`. There are better ways to deal with **dynamic load events**, such as waiting for the presence of an element that loads last, after the load event that causes your element to go stale. – PixelEinstein Apr 04 '18 at 18:37
  • @PixelEinstein - you mean the time.sleep should be moved? tried that, still get the same error. And I have seen similar feedback as "it looks like you are snagging the element, then the page is loading again and it's going stale" but I don't really know what that means/how to fix. Thanks again. – jayohday Apr 04 '18 at 18:45
  • Can you add the relevant HTML of the page you are interacting with? Or the **URL**? I can then build an answer that relates directly to your use case. – PixelEinstein Apr 04 '18 at 18:58
  • http://mec.mo.gov/mec/Lobbying/Lob_ExpCSV.aspx Meanwhile, I'm trying explicit wait (WebDriverWait) but now getting a Timeout Exception. – jayohday Apr 04 '18 at 19:03
  • @PixelEinstein the URL was in the original code I posted; is that not what you're looking for? – jayohday Apr 04 '18 at 21:17
  • @jayohday, yes that works, did not see it first go through. Thanks. If someone has not helped you and you are still struggling with this, I will leave a detailed answer once I'm off work. – PixelEinstein Apr 04 '18 at 21:22
  • @jayohday, also, please update to `chromedriver 2.36` or **above** [HERE](https://sites.google.com/a/chromium.org/chromedriver/downloads), so we know you have a driver that supports your current version of chrome `build 65-`. – PixelEinstein Apr 04 '18 at 21:26
  • @PixelEinstein the update fixed it. Thank you. – jayohday Apr 04 '18 at 21:33
  • @jayohday, great! Glad it worked. – PixelEinstein Apr 04 '18 at 21:34
  • @jayohday, I will make a quick answer so people know that updating fixed your issue. – PixelEinstein Apr 04 '18 at 22:49

2 Answers


Turning the comment that fixed the issue into an answer for others.

What fixed the issue above was updating Chromedriver to at least 2.36. The asker was running Chrome build 65, which their previous Chromedriver version, 2.33, does not support: https://sites.google.com/a/chromium.org/chromedriver/downloads

By keeping Chrome and Chromedriver up to date, or on a recommended pairing, you will run into fewer problems, as described on the chromedriver download landing page.
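
Both version numbers are printed at the end of the error message in the question, so they can be read straight from the exception text. A minimal sketch, assuming the message keeps the `chrome=` / `chromedriver=` layout shown in the traceback (`extract_versions` is a hypothetical helper name, not part of Selenium):

```python
import re


def extract_versions(message):
    """Return (chrome_version, chromedriver_version) found in a Selenium
    error message, using None for any part that is missing."""
    chrome = re.search(r"\(Session info: chrome=([\d.]+)", message)
    driver = re.search(r"\(Driver info: chromedriver=([\d.]+)", message)
    return (
        chrome.group(1) if chrome else None,
        driver.group(1) if driver else None,
    )


# Message text copied from the traceback in the question.
message = (
    "stale element reference: element is not attached to the page document\n"
    "  (Session info: chrome=65.0.3325.181)\n"
    "  (Driver info: chromedriver=2.33.506120 "
    "(e3e53437346286c0bc2d2dc9aa4915ba81d9023f),"
    "platform=Windows NT 6.1.7601 SP1 x86_64)"
)
print(extract_versions(message))
```

The two values can then be checked by hand against the supported-version notes on the download page; the sketch does not try to encode that compatibility table itself.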


If you are looking for help with StaleElementReferenceException

Here is the definition from the wiki:

Thrown when a reference to an element is now “stale”.

Stale means the element no longer appears on the DOM of the page.

Possible causes of StaleElementReferenceException include, but are not limited to:

You are no longer on the same page, or the page may have refreshed since the element was located.

The element may have been removed and re-added to the screen since it was located, for example when an element is relocated. This typically happens with a JavaScript framework when values are updated and the node is rebuilt.

Element may have been inside an iframe or another context which was refreshed.
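
The first cause is the common one with dropdowns that reload the page when changed: the old reference dies with the old page, so the script has to wait for the reload and then locate the element again. Selenium ships this check as `expected_conditions.staleness_of`, normally used through `WebDriverWait`; the sketch below spells out the same polling idea in plain Python so the logic is visible. `wait_until_stale` is a hypothetical helper, and whether the page in the question reloads on every dropdown change is an assumption.

```python
import time


def wait_until_stale(element, stale_exception, timeout=30.0, poll=0.5):
    """Poll until `element` raises `stale_exception`, meaning the page that
    owned it has been replaced. Returns True if that happened in time."""
    deadline = time.time() + timeout
    while True:
        try:
            element.is_enabled()  # any call on a stale reference raises
        except stale_exception:
            return True
        if time.time() >= deadline:
            return False  # the element never went stale within the timeout
        time.sleep(poll)
```

With Selenium, `stale_exception` would be `StaleElementReferenceException` from `selenium.common.exceptions`. The idea is to pass in the old dropdown element after changing it, and to call `find_element` for the next dropdown only once this returns True.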



PixelEinstein

I only have a solution in Java, but it works.

If the exception is thrown I catch it and retry:

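boolean isElementFound = false;
int attempts = 0;
WebElement myElement = null;

// Retry the lookup until it succeeds, giving up after a few attempts
// so that a permanently stale page cannot loop forever.
while (!isElementFound && attempts < 5) {
  try {
     myElement = Driver.findElement(By.id("elementID"));
     isElementFound = true;
  } catch (StaleElementReferenceException e) {
    attempts++; // the element went stale; look it up again
  }
}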
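
Since the question uses Python, the same catch-and-retry idea might look like the sketch below. `retry_on_exception` is a hypothetical helper rather than a Selenium API, and it caps the number of attempts so the loop cannot spin forever.

```python
import time


def retry_on_exception(action, retry_exceptions, attempts=3, delay=1.0):
    """Call action() until it succeeds or the attempts run out.

    action must locate the element again on every call; reusing an old
    reference would only raise the same stale-element error again.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay)
```

With Selenium, `retry_exceptions` would be `StaleElementReferenceException`, and `action` would be a small function that both finds the dropdown and selects the option, for example one that runs `Select(driver.find_element(By.NAME, name)).select_by_visible_text(year_text)`.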
AcMuD