
Here is the code that I am testing.

import csv
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
profile = webdriver.FirefoxProfile()
profile.accept_untrusted_certs = True
import time


#browser = webdriver.Firefox(executable_path="C:/Utility/geckodriver.exe")
wd = webdriver.Firefox(executable_path="C:/Selenium/geckodriver.exe", firefox_profile=profile)
url = "https://finviz.com/login.ashx"
wd.get(url)

# set username
time.sleep(1)
username = wd.find_element_by_name("email")
username.send_keys("me@gmail.com")
#wd.find_element_by_id("identifierNext").click()

# set password
#time.sleep(2)
password = wd.find_element_by_name("password")
password.send_keys("me_pass")

# https://stackoverflow.com/questions/21350605/python-selenium-click-on-button
wd.find_element_by_css_selector('.button.is-primary.is-large').click()


# wait max 10 seconds until "theID" visible in Logged In page
time.sleep(5)
#content = wd.page_source
#print(BeautifulSoup(content, 'html.parser'))


url_base = "https://finviz.com/quote.ashx?t="
tckr = ['SBUX','MSFT','AAPL']
url_list = [url_base + s for s in tckr]
#print(url_list)

with open('C:\\stocks.csv', 'a', newline='') as f:
    writer = csv.writer(f)

    for url in url_list:
        #print(url)
        try:
            wd.get(url)
            fpage = wd.current_url
            #print(fpage)
            data = fpage.text
            fsoup = BeautifulSoup(data, 'html.parser')
            #print(url_base)
            print(fsoup)

            # write header row
            writer.writerow(map(lambda e : e.text, fsoup.find_all('td', {'class':'snapshot-td2-cp'})))

            # write body row
            writer.writerow(map(lambda e : e.text, fsoup.find_all('td', {'class':'snapshot-td2'})))            

        except:
            print("{} - not found".format(url))

In the case above, my code goes straight to the `except` block because the `try` fails. I think the problem is in this line:

fsoup = BeautifulSoup(data, 'html.parser')

This is my error:

AttributeError: 'str' object has no attribute 'text'
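I can reproduce that exact message on any plain string, so something in my pipeline must be a `str` rather than a response/element object:

```python
# The same AttributeError, reproduced on a bare string:
s = "https://finviz.com/quote.ashx?t=SBUX"
try:
    s.text
except AttributeError as e:
    print(e)  # 'str' object has no attribute 'text'
```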

I looked at the documentation here:

https://www.selenium.dev/documentation/en/webdriver/web_element/

I guess the webdriver output has to be handed to BeautifulSoup, but for some reason they are not playing well together. I'm stuck now. Thoughts? Suggestions?

ASH
    Frankly, you could do this even without `BeautifulSoup` using `wd.find_element_by_xpath()` or `wd.find_element_by_class_name()` – furas May 27 '20 at 15:33

1 Answer


After your `wd.get(url)`, do this:

fpage=wd.page_source
fsoup = BeautifulSoup(fpage, 'html.parser')
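In the loop from the question, that means replacing `fpage = wd.current_url` / `data = fpage.text` with `wd.page_source`. A self-contained sketch of the parsing-and-writing step (using a stand-in HTML string in place of `wd.page_source`, since a real browser can't be driven here):

```python
import csv
import io

from bs4 import BeautifulSoup


def write_snapshot(page_source, writer):
    """Parse Finviz-style snapshot cells from raw HTML and write header/body rows."""
    fsoup = BeautifulSoup(page_source, "html.parser")
    writer.writerow([e.text for e in fsoup.find_all("td", {"class": "snapshot-td2-cp"})])
    writer.writerow([e.text for e in fsoup.find_all("td", {"class": "snapshot-td2"})])


# Usage with a made-up fragment standing in for wd.page_source:
html = '<td class="snapshot-td2-cp">P/E</td><td class="snapshot-td2">25.0</td>'
buf = io.StringIO()
write_snapshot(html, csv.writer(buf))
print(buf.getvalue())  # P/E on the first row, 25.0 on the second
```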
0buz
  • That works! Thanks. One more thing. How can I get the 'tckr' to identify each set of records. Now I have no identifiers for all the data in each table? – ASH May 27 '20 at 15:14
  • You could integrate it into `writer.writerow`. I haven't tried it, but something like `writer.writerow(url[-4:] +',' + map(lambda e : e.text, fsoup.find_all('td', {'class':'snapshot-td2-cp'})))` – 0buz May 27 '20 at 15:21
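The snippet in the comment above won't run as written (a `str` can't be concatenated with a `map` object), but the idea works if both sides are lists. A sketch against a made-up two-row snapshot table (the real Finviz markup will have many more cells):

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a Finviz quote page.
html = """
<table>
<tr><td class="snapshot-td2-cp">P/E</td><td class="snapshot-td2">25.0</td></tr>
<tr><td class="snapshot-td2-cp">EPS</td><td class="snapshot-td2">3.10</td></tr>
</table>
"""
fsoup = BeautifulSoup(html, "html.parser")
url = "https://finviz.com/quote.ashx?t=SBUX"

# Take the ticker from the URL query string (safer than url[-4:],
# which breaks on tickers that aren't exactly four characters),
# then prepend it to each row as its own column.
ticker = url.split("t=")[-1]
header = [ticker] + [e.text for e in fsoup.find_all("td", {"class": "snapshot-td2-cp"})]
body = [ticker] + [e.text for e in fsoup.find_all("td", {"class": "snapshot-td2"})]
print(header)  # ['SBUX', 'P/E', 'EPS']
print(body)    # ['SBUX', '25.0', '3.10']
```

Each `header`/`body` list can then be passed straight to `writer.writerow(...)`.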