I am trying to read URLs from a column in Google Sheets using gspread and open them with Selenium. I think I have successfully read the data from the Sheet into a list, converted it to a DataFrame, and then converted that to a list of JSON strings (to resolve a previous error). My code succeeds in opening the Chrome browser, and if I print "urlb" or "url", the correct URL is displayed in the form I think it should be.
Here's the code section:
...
urls_list = worksheet.col_values(6)
urls = pd.DataFrame(urls_list)
list_of_jsons = urls.to_json(orient='records', lines=True).splitlines()
driver = webdriver.Chrome()
i = 1
while i < 5:
    urlb = (list_of_jsons[i])
    url = urlb.replace('\\', '')
    driver.get(url)
...
However, I get this error:
Traceback (most recent call last):
  File "<FILE PATH>/<FILE>.py", line 30, in <module>
    driver.get(url)
  File "<FILE PATH>/venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 355, in get
    self.execute(Command.GET, {"url": url})
  File "<FILE PATH>/venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 346, in execute
    self.error_handler.check_response(response)
  File "<FILE PATH>/venv/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 245, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
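Since the message only says "invalid argument", it may help to look at the exact string that reaches driver.get. The sketch below uses only the standard library and a hard-coded sample record. The sample string is an assumption about what to_json(orient='records', lines=True) produces for a single-column DataFrame (a JSON object keyed by the default column label "0", with forward slashes escaped); it is not output copied from the actual Sheet.

```python
import json

# Assumed shape of one element of list_of_jsons
# (hypothetical sample, not real Sheet data).
record = '{"0":"https:\\/\\/www.example.com\\/path"}'

# Same clean-up as in the code section: strip the backslashes.
stripped = record.replace('\\', '')
print(repr(stripped))

# Parsing the record as JSON and taking the value under "0"
# gives a bare URL string instead.
parsed_url = json.loads(record)["0"]
print(repr(parsed_url))
```

If the assumption holds, stripped is still a JSON object wrapped in braces rather than a URL, while parsed_url is the plain address.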
If I replace driver.get(url) with a hard-coded URL, driver.get('https://www.example.com/path'), the code works fine. At first I thought I needed to add single quotes at the beginning and end of the URL. I tried adding them in the program and also in the cells in Google Sheets; neither helped. Can someone help me understand why the URL read from the Sheet cell won't load?
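On the quoting point, a small sketch of how quotes in source code relate to the string value may be useful. The variable names are hypothetical, and urlparse is used here only as a rough stand-in for a validity check, not as a description of what Selenium does internally.

```python
from urllib.parse import urlparse

# The quotes around a literal are Python syntax; they are not stored in the string.
literal = 'https://www.example.com/path'

# Hypothetical value with quote characters added to the data itself.
quoted = "'" + literal + "'"

print(literal[0])                      # first character of the plain value
print(quoted[0])                       # first character once quotes are part of the data
print(urlparse(literal).scheme)        # scheme detected in the plain value
print(repr(urlparse(quoted).scheme))   # scheme detected in the quoted value
```

With the quote characters inside the value, the string no longer starts with a scheme such as https, so adding quotes to the cell or to the variable would not be expected to turn it into a valid argument.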