How do I scrape content from a dynamically generated page using selenium and python?

Question

I have made many attempts, and all of them fail to record the data I need reliably and completely. I understand the basics of Python and Selenium for automating simple tasks, but in this case the content is dynamically generated and I cannot find the correct way to access and record all the data I need.

The URL I am looking to scrape content from is structured similar to the following:

https://dutchie.com/embedded-menu/revolutionary-clinics-somerville/menu

In particular, I am trying to grab all the info using something like:

browser.find_elements_by_xpath('//*[@id="products-container"]')

Is this the right approach? How do I access specific sub-elements of this element (and all elements matching the same path)?

I have read that I might need beautifulsoup4, but I am unsure of the best way to approach this.

Would the best approach be to use XPaths? If so, is there a way to iterate through all elements and record all the data within them, or do I have to specify each and every data point that I am after?

Any assistance to point me in the right direction would be extremely helpful as I am still learning and have hit a roadblock in my progress.

My end goal is a list of all product names, prices and any other data points that I deem relevant based on the specific exercise at hand. If I could find the correct way to access the data points I could then store them and compare/report on them as needed.

Thank you!

Check the approaches here: https://stackoverflow.com/questions/67148905/python-web-scraping-for-walmart/67161826#67161826 and https://stackoverflow.com/questions/67165356/feed-dataframe-with-webscraping/67166294#67166294. It's a common question. — vitaliis, Apr 30 '21 at 23:23
This is a great start. I am getting lost on how I would select certain elements in my example. If the text I was after was contained in a DIV with the class "product-information__Title-sc-65h5ke-4 eBIyJW", how would I approach this, assuming the text at the end of the class name changes? — T0ne, May 01 '21 at 01:00
It's a different question and should be asked separately. Usually locators should be unique. — vitaliis, May 01 '21 at 01:12

score 1 · Accepted Answer · answered May 01 '21 at 02:34

I think you are looking for something like

browser.find_elements_by_css_selector('[class*="product-information__Title"]')

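This should find all elements whose class attribute contains that string. The *= operator matches a substring anywhere in the attribute value, so it still works when the generated suffix (such as "-sc-65h5ke-4 eBIyJW") changes. To match only values that begin with the string, use ^= instead.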
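Because the page is generated dynamically, the elements may not exist yet when the lookup runs. Below is a minimal sketch of one way to combine the selector above with an explicit wait. It assumes a recent Selenium release (where find_elements(By.CSS_SELECTOR, ...) replaces find_elements_by_css_selector) and a working Chrome driver. The function names class_contains_selector, collect_texts and scrape_titles, and the 20-second timeout, are hypothetical choices for illustration and not part of Selenium.

```python
def class_contains_selector(fragment):
    """Build a CSS selector matching elements whose class attribute contains `fragment`."""
    return f'[class*="{fragment}"]'


def collect_texts(elements):
    """Return the stripped, non-empty visible text of each element."""
    return [el.text.strip() for el in elements if el.text and el.text.strip()]


def scrape_titles(url, fragment="product-information__Title", timeout=20):
    """Open `url`, wait for matching elements to appear, and return their text.

    Hypothetical helper: assumes Chrome and a matching driver are available.
    """
    # Imported here so the two helpers above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    selector = class_contains_selector(fragment)
    browser = webdriver.Chrome()
    try:
        browser.get(url)
        # Block until at least one matching element is present in the DOM,
        # or raise TimeoutException after `timeout` seconds.
        WebDriverWait(browser, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, selector))
        )
        return collect_texts(browser.find_elements(By.CSS_SELECTOR, selector))
    finally:
        # Always close the browser, even if the wait times out.
        browser.quit()
```

Calling scrape_titles with the menu URL from the question should return a list of product-title strings, provided the class fragment still matches the live page. Other data points such as prices could be collected the same way by passing a different class fragment, once you have found it by inspecting the page.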

C. Peck
