Consider this HTML page (the whitespace is intentional):
<html>
<body>
<div>
<div id="1"><span class="3">$200</span></div>
</div>
</div>
<div id="1"><span class="2">$250</span></div>
</div>
<div><span class="1"> $400 </span>
</div>
</body>
</html>
Now, let's say that in Python Selenium I want to find every currency amount on this page, and then read various CSS properties from the elements that contain them.
What I have tried (which seems to be the wrong way) is to use a regex with the finditer function to search the HTML source for these amounts. For each match I then call Selenium's find_elements method.
Here is the issue: finditer finds 3 matches. But when I take the matched text from finditer (in this case $200, $250 and $400) and pass it to find_elements, it returns several elements for each match. It appears to be giving me each of the parent tags as a separate result:
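import re

# Match the digits that follow a literal "$" (the "$" must be escaped inside the lookbehind)
currencypattern = re.compile(r"(?<=\$)\d{1,5}(?:,\d{3})?(?:\.\d+)?")
for currencies in currencypattern.finditer(htmlsource):
    expr = "$" + currencies.group()
    # contains(normalize-space(), ...) is true for every element whose text, descendants included, holds expr
    for i in driver.find_elements("xpath", '//*[contains(normalize-space(), "' + expr + '")]'):
        print(currencies.group(), " - ", i.tag_name)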
The above code prints the currency amount and then the tag name, repeated for each matching element until it has reached the top parent tag.
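To make the behaviour concrete without a browser, here is a minimal sketch using the standard library's xml.etree.ElementTree. It assumes a simplified, well-formed stand-in for the page (the stray closing tags are dropped), and the helper names tags_containing and tags_with_own_text are made up for illustration. It suggests why every ancestor shows up: an element's text includes the text of all its descendants.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the page (assumption: stray </div> tags removed).
PAGE = """<html><body>
<div><div id="1"><span class="3">$200</span></div></div>
<div id="1"><span class="2">$250</span></div>
<div><span class="1"> $400 </span>
</div>
</body></html>"""

root = ET.fromstring(PAGE)


def tags_containing(root, needle):
    """Tags whose full text (descendants included) contains needle."""
    return [el.tag for el in root.iter() if needle in "".join(el.itertext())]


def tags_with_own_text(root, needle):
    """Tags that hold needle in a text node that is their direct child."""
    hits = []
    for el in root.iter():
        # An element's own text nodes: its leading text plus the tails of its children.
        own = [el.text or ""] + [child.tail or "" for child in el]
        if any(needle in text for text in own):
            hits.append(el.tag)
    return hits


print(tags_containing(root, "$200"))     # ['html', 'body', 'div', 'div', 'span']
print(tags_with_own_text(root, "$200"))  # ['span']
```

The first list mirrors what the contains(normalize-space(), ...) query appears to return; the second is the single innermost element I am actually after.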
If I just use find_element instead of find_elements, it seems to always return the top-level tag, when really I want the innermost child element and the CSS properties from that.
Does anybody know how I can achieve this? I thought perhaps I could use the regex directly in the find_elements XPath, but so far I've had no joy with this.
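One direction that might work, sketched here and not verified against the real page, is to test each element's own text nodes with text() instead of its whole string value. The helper names own_text_xpath and innermost_matches are hypothetical, and the driver argument is assumed to be an existing Selenium WebDriver; value_of_css_property is the WebElement method for reading a computed CSS property.

```python
def own_text_xpath(amount):
    """Build an XPath for elements that hold `amount` in one of their own text nodes."""
    # Assumes `amount` contains no double quote, which holds for "$" plus digits.
    return '//*[text()[contains(normalize-space(), "' + amount + '")]]'


def innermost_matches(driver, amount, css_property="color"):
    """Return (tag name, CSS value) pairs for the innermost matching elements.

    `driver` is assumed to be a Selenium WebDriver that is already on the page.
    """
    elements = driver.find_elements("xpath", own_text_xpath(amount))
    return [(el.tag_name, el.value_of_css_property(css_property)) for el in elements]
```

With the loop from my attempt, this would be called as innermost_matches(driver, expr), once per regex match.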
Thanks in advance.