I'm scraping a website using Selenium. When I get the text of a list of elements (headers), this is what it prints:

    ['Countyarrow_upward Reportingarrow_upward Totalarrow_upward Bennet (D)arrow_upward Biden (D)arrow_upward Bloomberg (D)arrow_upward Booker (D)arrow_upward Boyd (D)arrow_upward Buttigieg (D)arrow_upward Castro (D)arrow_upward De La Fuente III (D)arrow_upward Delaney (D)arrow_upward Ellinger (D)arrow_upward Gabbard (D)arrow_upward Greenstein (D)arrow_upward Klobuchar (D)arrow_upward Patrick (D)arrow_upward Sanders (D)arrow_upward Sestak (D)arrow_upward Steyer (D)arrow_upward Warren (D)arrow_upward Williamson (D)arrow_upward Yang (D)arrow_upward']

I only want the names and the "(D)", so I tried using replace() to replace "Countyarrow_upward Reportingarrow_upward Totalarrow_upward " and "arrow_upward " with an empty string. Here's my code:

    headers = driver.find_elements_by_xpath('//*[@id="content"]/div/div[3]/div/div[2]/div/div[2]/div/div[2]/div[1]/div/table/thead/tr[1]')
    header_text = []
    for i in headers:
        header_raw_text = i.text
        header_raw_text.replace("Countyarrow_upward Reportingarrow_upward Totalarrow_upward ", "")
        header_raw_text.replace("arrow_upward ", "")
        header_text.append(header_raw_text)

    print(header_text)

When I run this code, I get the same thing above, and the replace() function doesn't work.

Help is much appreciated!

Ali Allam
  • Because replace doesn't change the source string; it creates a new one. Just reassign what replace returns to your variable. – slesh Apr 12 '20 at 21:49
  • Possible duplicate of [String replace doesn't appear to be working](https://stackoverflow.com/questions/26943256/string-replace-doesnt-appear-to-be-working) and many others. Short answer: you're ignoring the return value of `str.replace`. Strings are immutable in Python, so `replace` returns a new string with the substitution rather than altering the original string. – Brian61354270 Apr 12 '20 at 21:49
  • BTW, there are better ways to do web scraping than Selenium. Use an API, such as `requests`. – Keith Apr 12 '20 at 21:54

1 Answer


Strings are immutable, so header_raw_text.replace() does not change the string itself. You have to assign the result of replace() back to the variable:

    header_raw_text = header_raw_text.replace("arrow_upward ", "")
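
For illustration, here is a minimal sketch of the reassignment applied to a shortened copy of the header text from the question, with no Selenium involved. The helper names `strip_markers` and `split_names` are hypothetical, not part of any library. Note one assumption that differs from the question's code: the sketch replaces "arrow_upward" without the trailing space, because the last marker in the printed text has no space after it and removing the space would also glue the names together.

```python
# Shortened sample of the header text shown in the question.
raw = ("Countyarrow_upward Reportingarrow_upward Totalarrow_upward "
       "Bennet (D)arrow_upward Biden (D)arrow_upward Yang (D)arrow_upward")


def strip_markers(raw_text):
    """Return the header text with the unwanted pieces removed (hypothetical helper)."""
    # str.replace returns a new string, so the result must be assigned.
    cleaned = raw_text.replace(
        "Countyarrow_upward Reportingarrow_upward Totalarrow_upward ", "")
    cleaned = cleaned.replace("arrow_upward", "")
    return cleaned


def split_names(raw_text, skip=("County", "Reporting", "Total")):
    """Return one list entry per candidate header (hypothetical helper)."""
    # Splitting on the marker gives one piece per column header.
    parts = [part.strip() for part in raw_text.split("arrow_upward")]
    # Drop empty pieces and the non-candidate columns.
    return [part for part in parts if part and part not in skip]


# Calling replace without assigning leaves the original untouched.
raw.replace("arrow_upward", "")
print("arrow_upward" in raw)  # True

print(strip_markers(raw))  # Bennet (D) Biden (D) Yang (D)
print(split_names(raw))    # ['Bennet (D)', 'Biden (D)', 'Yang (D)']
```

If a list of individual names is what you need, splitting on the marker as in `split_names` may be more convenient than chaining replace() calls.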
zealous