So the issue is very simple. I have code that I need to run in headless mode. The program works perfectly in non-headless mode (when Selenium opens the browser window automatically), but the moment I enable headless mode, it won't even start.

import requests
from bs4 import BeautifulSoup
import csv
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

chromepath = r"C:\Users\hp\Desktop\webScrape\chromedriver\chromedriver.exe"
options = Options()
#options.add_argument("--headless")
options.add_argument('--disable-gpu')
options.add_argument('--log-level=3')
options.add_argument('--lang=en')
driver = webdriver.Chrome(executable_path=chromepath, chrome_options=options)
url = "https://eresearch.fidelity.com/eresearch/evaluate/fundamentals/ownership.jhtml?stockspage=ownership&symbols=AAPL"
driver.get(url)
print("driver got url")

if driver.current_url == "https://login.fidelity.com/ftgw/Fidelity/RtlCust/Login/Init/df.chf.ra/trial?AuthRedUrl=https://oltx.fidelity.com/ftgw/webxpress/AuthorizeMember&AuthOrigUrl=https://snapshot.fidelity.com/fidresearch/gotoBL/snapshot/landing.jhtml#/dividends?symbol=AAPL":


        username = driver.find_element_by_id("userId")
        password = driver.find_element_by_id("password")

        username.send_keys("xxxx")
        password.send_keys("xxxx")

        login = driver.find_element_by_xpath('//*[@id="Login"]/ol/li[4]/button/b').click()

        driver.get(url)
        #print(driver.current_url)


        button = element = WebDriverWait(driver, 60).until(
            EC.element_to_be_clickable((By.XPATH, '//*[@id="tab1"]/a'))
        )

        button.click()
        print("clicked")
#use webdriver wait for everything else
# table = wait.until(EC.presence_of_element_located(By.CSS_SELECTOR, 'div.datatable'))
try:
    WebDriverWait(driver, 60).until(EC.presence_of_element_located((By.XPATH, '/html/body/table/tbody/tr/td[4]/div[5]')))

except:
    pass  # Handle the exception here


# thlist = []
# tdlist = []   

# my_table_th = driver.find_elements_by_tag_name('th')
# for i in range(0,len(my_table_th)):
    # if my_table_th[i].text == "":
        # continue
    # else:
        # thlist.append(my_table_th[i].text)


# my_table_td = driver.find_elements_by_tag_name('td')
# for i in range(0,len(my_table_td)):
    # if my_table_td[i].text == "":
        # continue
    # else:
        # tdlist.append(my_table_td[i].text)        

# thlist = thlist[8:]       
# for i in range(0,len(thlist)):
    # print(i,thlist[i])

# print("-----------------------------------------------------")

# for i in range(0,len(tdlist)):
    # print(i,tdlist[i])
mylist = [] 

soup = BeautifulSoup(driver.page_source,"html.parser")
print("bs got the site")
requests.packages.urllib3.disable_warnings()

#table borderTop table-striped dividendHistory
divparent = soup.find_all('div', attrs={'class':'tabbed-box'})
#print (len(divparent))
"""
table 1 is class left side and chart-table
table 2 is class right side and institutional-table

"""
try:
    my_table = divparent[0].find_all('div', attrs={'class': 'left-side'})
    #print((my_table))
except:
    print("no table div here!")
    #return
#try:
extractTable = my_table[0].find_all('table', attrs = {'class':'chart-table'})
rows = extractTable[0].findChildren(['tr'])
for row in rows:
    for data in row.findAll('th'):
        if data.text == "":
            continue
        else:
            print(data.text)

driver.close()
print("done ^_^")

So when the headless option is commented out, it works perfectly, but when it is uncommented, the program won't even start. The only thing I see on my console is:

  driver = webdriver.Chrome(executable_path=chromepath, chrome_options=options)

DevTools listening on ws://127.0.0.1:62897/devtools/browser/df3dccb4-4b97-4b06-8ca3-545d64ca2807

It never even proceeds to the first print output, which is

print("driver got url")

Can somebody please help with this?

1 Answer

SOLUTION:

When defining the driver, I changed the keyword chrome_options to options. I also downloaded the correct chromedriver (one matching my Chrome browser version).
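
To check whether a chromedriver matches the installed browser, comparing the major version numbers is usually enough. Below is a minimal sketch of that comparison. The helper names major_version and versions_match are made up for illustration, and the version strings in the usage note are sample values, not output from your machine.

```python
import re


def major_version(version_text):
    """Return the major version number found in a version string.

    Hypothetical helper: it takes the first "digits.digits" pair in the
    text and returns the part before the dot as an integer.
    """
    match = re.search(r"(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError("no version number found in: %r" % version_text)
    return int(match.group(1))


def versions_match(browser_text, driver_text):
    """True when browser and driver report the same major version."""
    return major_version(browser_text) == major_version(driver_text)
```

For example, versions_match("Google Chrome 114.0.5735.199", "ChromeDriver 114.0.5735.90") returns True, while a browser on major version 115 paired with that same driver returns False. How you obtain the two version strings depends on your system, so that part is left out here.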

Explanation:

I copied the first part of your code and ran into a similar problem. After a few changes I got it to start running headless. Here is the altered first part of your code that runs:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import os, platform

BASE_DIR = os.path.dirname(os.path.abspath(__file__))

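# Pick the driver file name for the current OS so chromepath is always defined
# (assumes the driver sits next to this script).
driver_name = 'chromedriver.exe' if platform.system() == 'Windows' else 'chromedriver'
chromepath = os.path.join(BASE_DIR, driver_name).replace("\\", "/")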

options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--log-level=3')
options.add_argument('--lang=en')


print("define driver")
driver = webdriver.Chrome(executable_path=chromepath, options=options)

print("define url")
url1 = "https://eresearch.fidelity.com/eresearch/evaluate/fundamentals/ownership.jhtml?stockspage=ownership&symbols=AAPL"
print("define url2")
url2 = "https://youtube.com"

print("driver: start getting url2")
driver.get(url2)
print("driver got url2")

print("driver: start getting url1")
driver.get(url1)
print("driver got url1")

Note: I assumed that the chromedriver executable is in the same directory as your .py file.

Additional issue:

I also added some print statements to check which parts of the code get executed. Doing this, I found that the driver doesn't seem to get your URL (I waited 5 minutes and it was still running). With a different URL the code does run, so the cause of this (separate) problem probably lies with the website you're trying to visit.

When running the code non-headless, the webdriver has no problem visiting your URL. You might want to open a separate post/issue for this.
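
Until that is sorted out, one way to keep the script from waiting indefinitely is to set a page-load timeout before calling get. This is only a sketch: get_with_timeout is a made-up helper name, the 30-second default is an arbitrary choice, and it does not explain why the site fails to load in headless mode. It relies on the driver's set_page_load_timeout method, which Selenium WebDriver objects provide.

```python
def get_with_timeout(driver, url, seconds=30):
    """Try to load url, giving up after `seconds`. Returns True on success.

    Hypothetical helper. `driver` is expected to be a Selenium WebDriver
    (or any object with set_page_load_timeout() and get()).
    """
    # Ask the driver to raise instead of blocking forever on a slow page.
    driver.set_page_load_timeout(seconds)
    try:
        driver.get(url)
        return True
    except Exception as exc:
        # Selenium raises its own TimeoutException here; the broad catch
        # keeps this sketch free of a Selenium import.
        print("could not load %s: %s" % (url, exc))
        return False
```

With the driver from the code above, get_with_timeout(driver, url1) would return False after roughly 30 seconds instead of hanging, so the rest of the script can decide what to do next.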

Ewald