This is my first question here, so do let me know if I should make it clearer. I have also only just started with Python, so I hope I am phrasing the question with the correct terms.

I have built a customizable web scraper that relies on the user's knowledge of CSS selectors. Users first go to the website they want to scrape, note down the CSS selectors ("AA") of the elements they want, and enter them in an Excel file. The Python script then reads these inputs, passes them to browser.find_elements_by_css_selector("AA"), and gets the relevant text through .text.encode('utf-8').

However, I noticed that sometimes there is important information in an attribute value that should also be scraped. I've looked around, and the suggestion is always to use .get_attribute().

1) Is there a way to get attribute values using just browser.find_elements_by_css_selector("AA"), without calling browser.find_elements_by_css_selector("AA").get_attribute("BB")? Otherwise,

2) Is there some value users can enter for "BB" in browser.find_elements_by_css_selector("AA").get_attribute("BB") such that effectively only browser.find_elements_by_css_selector("AA") runs?

Jhhh
  • Not sure what you mean by "such that only browser.find_elements_by_css_selector("AA") will run?" but you can select using xpath. – AidanGawronski Jan 30 '18 at 01:26
  • "AA" and "BB" are user inputs. I was trying to ask whether there is some user input value for "BB" that reduces the code to just browser.find_elements_by_css_selector("AA"). E.g. url = http://store.steampowered.com/search/?filter=topsellers, "AA" = span.title, "BB" = someinputvalue. As someinputvalue is not found, the code would still run as browser.find_elements_by_css_selector("span.title"). – Jhhh Jan 30 '18 at 01:55
  • Sounds like you just need to use try: ... except: – AidanGawronski Jan 30 '18 at 02:34
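
One way to read the idea discussed in these comments is to treat "BB" as optional: when it is left empty, return the element text, otherwise return the attribute. The sketch below is only an illustration under that assumption. The helper name extract_values and the FakeElement class are hypothetical and not part of Selenium; FakeElement merely stands in for the elements returned by browser.find_elements_by_css_selector("AA") so that the snippet runs without a browser. Note that find_elements_by_css_selector returns a list, so get_attribute has to be called on each element rather than on the list itself.

```python
from typing import Iterable, List, Optional


def extract_values(elements: Iterable, attribute: Optional[str] = None) -> List[Optional[str]]:
    """Return each element's text, or the named attribute if one is given.

    `elements` is assumed to be what find_elements_by_css_selector("AA") returns;
    `attribute` plays the role of the user input "BB".
    """
    attribute = (attribute or "").strip()
    if not attribute:
        # No "BB" supplied: behave as if only the selector lookup had been run.
        return [el.text for el in elements]
    # get_attribute exists on individual elements, not on the list.
    return [el.get_attribute(attribute) for el in elements]


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (demo only)."""

    def __init__(self, text, attrs):
        self.text = text
        self._attrs = attrs

    def get_attribute(self, name):
        # Mirrors Selenium's behaviour of returning None for a missing attribute.
        return self._attrs.get(name)


elements = [FakeElement("Game A", {"href": "/app/1"}), FakeElement("Game B", {})]
print(extract_values(elements))          # text of every element
print(extract_values(elements, "href"))  # attribute value per element
```

With real Selenium elements the call would presumably look like extract_values(browser.find_elements_by_css_selector(aa), bb), where aa and bb are the values read from the Excel file.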

1 Answer

Yes, there is an alternative way to retrieve the text attribute values without using the get_attribute() method. I am not sure whether that can be achieved through CSS, but it is possible through XPath. A couple of examples are as follows:

undetected Selenium