Get dynamically generated content with Python Selenium

Question

This question has been asked before, but I've searched and tried and still can't get it to work. I'm a beginner when it comes to Selenium.

Have a look at: https://finance.yahoo.com/quote/FB

I'm trying to web scrape the "Recommended Rating", which in this case at the time of writing is 2. I've tried:

driver.get('https://finance.yahoo.com/quote/FB')
time.sleep(10)
rating = driver.find_element_by_css_selector('#Col2-4-QuoteModule-Proxy > div > section > div > div > div')
print(rating.text)

...which doesn't give me an error, but doesn't print any text either. I've also tried with xpath, class_name, etc. Instead I tried:

source = driver.page_source
print(source)

This doesn't work either: I just get the original source, without the dynamically generated content. When I click "View Source" in Chrome, the content isn't there either. I also tried saving the webpage in Chrome as HTML only, which didn't work.

Then I discovered that if I save the entire webpage, including images and css-files and everything, the source code is different from the one where I just save the HTML.

The HTML-file I get when I save the entire webpage using Chrome DOES contain the information that I need, and at first I was thinking about using pyautogui to just Ctrl + S every webpage, but there must be another way.

The information that I need is obviously there, in the HTML code, but how do I get it without downloading the entire web page?

Have you found any solution? – oldpride Aug 28 '22 at 04:15

score 3 · Answer 1 · answered Mar 19 '19 at 11:48

Try running JavaScript in the page to return the rendered DOM, which includes the dynamically generated content:

driver.execute_script("return document.body.innerHTML")

See similar question: Running javascript in Selenium using Python

answered Mar 19 '19 at 11:48

Lena

Unfortunately, this doesn't change anything. I still get the same HTML-code as before. – PythonGeek Mar 19 '19 at 13:16

score 1 · Answer 2 · edited Aug 11 '21 at 19:36

First, you need to wait for the element to be clickable, then make sure you scroll down to the element before getting the rating. Try

element.location_once_scrolled_into_view
element.text

EDIT:

Use the following XPath selector:

'//a[@data-test="recommendation-rating-header"]//following-sibling::div//div[@class="rating-text Arrow South Fw(b) Bgc($buy) Bdtc($buy)"]'

Then you will have:

from selenium.webdriver.common.by import By
rating = driver.find_element(By.XPATH, '//a[@data-test="recommendation-rating-header"]//following-sibling::div//div[@class="rating-text Arrow South Fw(b) Bgc($buy) Bdtc($buy)"]')

To extract the value of the slider, use

val = rating.get_attribute("aria-label")
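
get_attribute returns a string, or None when the attribute is missing, so the value still has to be converted to a number. The exact format of the aria-label is not shown in this thread; the sketch below assumes only that it contains the rating as the first number in the text. The helper name first_number is hypothetical:

```python
import re


def first_number(label):
    """Return the first integer or decimal found in label as a float, else None."""
    if not label:
        # get_attribute() returns None when the attribute is absent.
        return None
    match = re.search(r"\d+(?:\.\d+)?", label)
    return float(match.group()) if match else None
```

Usage would be first_number(rating.get_attribute("aria-label")), checking for None before using the result.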

edited Aug 11 '21 at 19:36

nic

answered Mar 19 '19 at 11:40

Mate Mrše

That CSS-selector works fine and it gives me 56, which is the "Total ESG-score", but it's not that element I'm trying to find. I'm trying to find the Recommended Rating, a scale from 1 to 5. I've tried with xpath, css_selector, class_name, but I can't get it to work. – PythonGeek Mar 19 '19 at 13:19

score 1 · Answer 3 · edited Mar 19 '19 at 13:48

The CSS selector, div.rating-text, is working just fine and is unique on the page. Returning .text will give you the value you are looking for.

edited Mar 19 '19 at 13:48

answered Mar 19 '19 at 13:38

JeffC

score 0 · Answer 4 · answered Jun 14 '19 at 12:16

The script below answers a different question but somehow I think this is what you are after.

import pandas
import requests
from bs4 import BeautifulSoup

base_url = 'http://finviz.com/screener.ashx?v=152&s=ta_topgainers&o=price&c=0,1,2,3,4,5,6,7,25,63,64,65,66,67'
html = requests.get(base_url)
soup = BeautifulSoup(html.content, "html.parser")
main_div = soup.find('div', attrs={'id': 'screener-content'})

# The screener table alternates between light and dark row classes.
light_rows = main_div.find_all('tr', class_="table-light-row-cp")
dark_rows = main_div.find_all('tr', class_="table-dark-row-cp")

data = []
for rows_set in (light_rows, dark_rows):
    for row in rows_set:
        # This assumes every cell wraps its value in a link; collect the link text.
        row_data = [cell.a.get_text() for cell in row.find_all('td')]
        data.append(row_data)

# Sort rows by the first column (the row number) to restore the original order.
data.sort(key=lambda x: int(x[0]))

pandas.DataFrame(data).to_csv("AAA.csv", header=False)
