I'm scraping a website with Selenium / Python 3; the website only uses IDs that are invalid in CSS selectors, like:
<input id="egg:bacon:SPAM" type="text"/>
<input id="egg:sausages:SPAM:SPAM" type="text"/>
(the offending parts are the IDs egg:bacon:SPAM and egg:sausages:SPAM:SPAM; an unescaped colon in a CSS selector starts a pseudo-class, so #egg:bacon:SPAM is parsed as the id egg followed by the unknown pseudo-classes :bacon and :SPAM)
I tried to select these elements with:
driver.find_element_by_css_selector('input#egg:bacon:SPAM')
But of course I get selenium.common.exceptions.InvalidSelectorException
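For context, here's a minimal reproduction (the URL is a placeholder, not the real site):
from selenium import webdriver
from selenium.common.exceptions import InvalidSelectorException

driver = webdriver.Firefox()
driver.get("https://example.com")  # placeholder for the actual site
try:
    # the unescaped colons are read as the start of pseudo-classes, so the selector is rejected
    driver.find_element_by_css_selector('input#egg:bacon:SPAM')
except InvalidSelectorException as exc:
    print(exc)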
I also tried using XPath to get these elements, and it works:
driver.find_element_by_xpath('//input[@id="egg:bacon:SPAM"]')
But my code is based on a home-made library built around CSS selectors. Adding XPath support would require ~200 lines of code (not counting unit tests, documentation, etc.) just to handle this broken, non-generic behavior.
Plus, scraping this website is part of a bigger project, and only this specific site (one out of ten) uses that kind of ID; putting that much effort into a single website makes me uncomfortable.
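For a sense of the constraint, here is a stripped-down, hypothetical sketch of the library's shape (names invented); every helper funnels through CSS selectors, which is why bolting on XPath touches so much code:
class Scraper:
    """Simplified, hypothetical sketch of the home-made library's interface."""

    def __init__(self, driver):
        self.driver = driver

    def fill(self, css, value):
        # every helper resolves elements through CSS selectors only
        field = self.driver.find_element_by_css_selector(css)
        field.clear()
        field.send_keys(value)

    def click(self, css):
        self.driver.find_element_by_css_selector(css).click()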
I could use something like find_element_by_css_selector('.foo > input:nth-child(2)')
but that's pretty fragile: any small change to the DOM could break the scraper (see the illustration below).
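To illustrate the fragility (markup invented, not from the real site): if the site inserts any sibling before the input, the index shifts and the selector silently stops matching the right field:
<!-- today: the input is the 2nd child, so '.foo > input:nth-child(2)' matches it -->
<div class="foo"><label>Bacon</label><input id="egg:bacon:SPAM" type="text"/></div>
<!-- after a small redesign: a hint is added, the input is now the 3rd child, and the selector no longer matches it -->
<div class="foo"><label>Bacon</label><span>hint</span><input id="egg:bacon:SPAM" type="text"/></div>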
Is there any clean way to handle these colon-riddled IDs with Selenium's find_element_by_css_selector, or am I doomed to use XPath for this website?