I am trying to use Selenium for scraping (the script used to work in Python 3.7).

Last week I had to reset my PC, and I installed the latest versions of Python and all the packages used in the script.

Since then, none of the dynamic values are rendered: the page source still contains the raw template placeholders instead of the actual values. Some sample output is shown below:

<tr>
<td class="textsr">Close</td>
<td class="textvalue">{{ScripHeaderData.Header.Close}}</td>
</tr>

<tr>
<td class="textsr">WAP</td>
<td class="textvalue">{{StkTrd.WAP}}</td>
</tr>

<tr>
<td class="textsr">Big Value</td>
<td class="textvalue">{{checknullheader(CompData.BigVal)?'-':(CompData.BigVal)}}</td>
</tr>

I use this script for my research and need to get it working again, so I would appreciate any guidance.

Here's the snippet for reference:

import time

from bs4 import BeautifulSoup
from selenium import webdriver

# q and opts are defined elsewhere in the script (not shown)
target_url = q.get(timeout=1)
time.sleep(1)
driver = webdriver.Chrome('./chromedriver', options=opts)
driver.get(target_url)
# fixed pause to give the page time to load
time.sleep(5)

html_content = driver.page_source

soup = BeautifulSoup(html_content, features="html.parser")

# walk every table row and read the text of each cell
table_rows = soup.find_all('tr')
for row in table_rows:
    table_cols = row.find_all('td')
    for col in table_cols:
        label_value = col.text
Smog

2 Answers

I went through a lot of forums and tried many suggestions (waits, driver options, changing web drivers, switching content, etc.), but my issue seems to be more specific and was not resolved by any of them.

Eventually I fell back to my old setup (which runs Python 3.9.6), and the script started working again.

Thank you, Joe Carboni, for your time and input on this.

It is a bit frustrating that I could not find the root cause of the issue or a workaround for it. I am posting what I did here in case it helps someone. Cheers.

Smog

While it may be tempting to use time.sleep to wait for the page to load, it's better to use Selenium waits with an explicit condition, most likely one tied to the elements you want to read. https://www.selenium.dev/documentation/webdriver/waits/

Here's another thread with a good answer about Waits and conditions vs. time.sleep: How to sleep Selenium WebDriver in Python for milliseconds
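
As a rough sketch of what such a wait could look like for the page in the question: the condition below polls until no td.textvalue cell still shows a {{...}} placeholder. The class name textvalue is taken from the output in the question; the helper names values_rendered and get_rendered_source, and the 20-second timeout, are my own assumptions for illustration. This only helps if the page's scripts do eventually fill in the values; it will simply time out if they never run.

```python
def values_rendered(driver):
    """Return True once td.textvalue cells exist and none still shows a {{...}} placeholder."""
    # "css selector" is the string value behind Selenium's By.CSS_SELECTOR
    cells = driver.find_elements("css selector", "td.textvalue")
    return bool(cells) and all("{{" not in cell.text for cell in cells)


def get_rendered_source(driver, url, timeout=20):
    """Load url and return the page source once the placeholders are gone."""
    # Imported here so values_rendered can be tried out without Selenium installed.
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    # Calls values_rendered(driver) repeatedly until it returns True;
    # raises selenium.common.exceptions.TimeoutException if the timeout expires first.
    WebDriverWait(driver, timeout).until(values_rendered)
    return driver.page_source
```

In the question's snippet, this would replace the driver.get(target_url), time.sleep(5) and driver.page_source lines with html_content = get_rendered_source(driver, target_url).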

Joe Carboni