I am trying to extract the prices from this page as text, in USD, from this site.

I used the locator //span[@data-originalprice] with Selenium's getText(), but the result is still not numbers only. I also tried splitting on \\$, which returned nothing, and then a regex, text.split("^-?\\d*(\\.\\d+)?$"), which also returned nothing. Does anyone have an idea?

undetected Selenium
nuzooo

1 Answer

To extract and print the prices with the non-ASCII characters trimmed, you can call replaceAll("[^\\p{ASCII}]", "") on each element's text. Using Java 8's stream() and map(), you can use either of the following Locator Strategies:

  • cssSelector:

    driver.get("https://www.wooloverslondon.com/new-styles?page=1&gender=161&style=77");
    System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.cssSelector("div.associated-product__price p>span")))
        .stream().map(element -> element.getText().replaceAll("[^\\p{ASCII}]", ""))
        .collect(Collectors.toList()));
    
  • xpath:

    driver.get("https://www.wooloverslondon.com/new-styles?page=1&gender=161&style=77");
    System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.xpath("//div[@class='associated-product__price']//p/span")))
        .stream().map(element -> element.getText().replaceAll("[^\\p{ASCII}]", ""))
        .collect(Collectors.toList()));
    
  • Console Output:

    [7,035.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 5,607.00, 5,607.00, 5,607.00, 4,996.00, 7,646.00]
    
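If you want to check the string cleanup on its own, without a browser, the following is a minimal sketch of the same replaceAll step applied to plain strings. The sample inputs (a rupee sign and a non-breaking space in front of the number) are made up for illustration and are not copied from the page. The class name PriceCleaner and the helpers stripNonAscii and toNumber are hypothetical. toNumber goes one step further than the answer above: it keeps only digits and the decimal point, so an ASCII symbol such as $ and the thousands separator are dropped as well.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PriceCleaner {

    // Same cleanup as in the answer: drop every character outside the ASCII range.
    static String stripNonAscii(String raw) {
        return raw.replaceAll("[^\\p{ASCII}]", "").trim();
    }

    // Assumption: prices use '.' as the decimal separator and ',' for thousands.
    // Keeps digits and '.', which also removes ASCII symbols such as '$'.
    static BigDecimal toNumber(String raw) {
        return new BigDecimal(raw.replaceAll("[^0-9.]", ""));
    }

    public static void main(String[] args) {
        // Hypothetical sample texts: currency sign (U+20B9) plus a non-breaking space (U+00A0).
        List<String> samples = Arrays.asList("\u20B9\u00A07,035.00", "\u20B9\u00A06,015.00");

        List<String> asText = samples.stream()
                .map(PriceCleaner::stripNonAscii)
                .collect(Collectors.toList());
        System.out.println(asText);    // [7,035.00, 6,015.00]

        List<BigDecimal> asNumbers = samples.stream()
                .map(PriceCleaner::toNumber)
                .collect(Collectors.toList());
        System.out.println(asNumbers); // [7035.00, 6015.00]
    }
}
```

In the Selenium code above, the same helpers would be applied to element.getText() inside map().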


undetected Selenium