Below is my code to extract data from a website:

from selenium import webdriver 
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
import numpy as np
from csv import writer
from selenium.webdriver.chrome.options import Options
import time


options = webdriver.ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-logging'])

# Initialize the webdriver
Path= r"C:\Users\xxx\Downloads\Python\chromedriver.exe"

#Configuration
driver  = webdriver.Chrome(Path)

# Navigate to the website containing the HTML code
driver.get("https://app.mohiguide.com/search/result?type=3&parent=6")

# Find the element with the data-src attribute using class name
element = driver.find_element(By.CLASS_NAME, 'title')

# Extract the value of the data-src attribute
data_src = element.get_attribute("data-src")

# Print the value of data-src
print(data_src)

# Close the webdriver
driver.quit()

However, I got the following error:

[4192:22976:0722/130914.625:ERROR:cert_issuer_source_aia.cc(34)] Error parsing cert retrieved from AIA (as DER):
ERROR: Couldn't read tbsCertificate as SEQUENCE
ERROR: Failed parsing Certificate

It looks like a certificate error, but I used Selenium a year ago and it ran successfully.

jasondesu
  • Read this post: https://stackoverflow.com/questions/75771237/error-parsing-cert-retrieved-from-aia-as-der-error-couldnt-read-tbscertifi – Golil Jul 22 '23 at 05:26
  • I tried that, but the result is still the same. – jasondesu Jul 22 '23 at 05:32
  • Did you try updating your Python packages? Run `pip install -U pip`, `pip install -U certifi`, and `pip install -U selenium`, and also check the Selenium documentation to see how to initialize the driver in the latest version. Let me know if you want a full answer. – Barry the Platipus Jul 22 '23 at 06:53

2 Answers

This error message...

[4192:22976:0722/130914.625:ERROR:cert_issuer_source_aia.cc(34)] Error parsing cert retrieved from AIA (as DER):
ERROR: Couldn't read tbsCertificate as SEQUENCE
ERROR: Failed parsing Certificate

...implies that, most likely, there is an expired CA public certificate in the browser's certificate store.


Solution

Prerequisites

  • Ensure that you are using Python 3.7+ and Selenium v4.6.0+
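The prerequisite above can be checked from Python before starting the browser. The following is only a minimal sketch: `version_tuple` and `meets_prerequisites` are hypothetical helper names, not part of Selenium, and the version string is passed in as a plain argument so the snippet itself needs no Selenium import.

```python
import re
import sys


def version_tuple(version):
    """Turn a version string such as '4.10.0' into a tuple of ints, (4, 10, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits, so '0a1' becomes 0.
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '4.6' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_prerequisites(selenium_version, python_version=None):
    """Return True for Python 3.7+ together with Selenium 4.6.0+."""
    if python_version is None:
        python_version = tuple(sys.version_info[:2])
    return python_version >= (3, 7) and version_tuple(selenium_version) >= (4, 6, 0)
```

With Selenium installed, calling `meets_prerequisites(selenium.__version__)` after `import selenium` should report whether the current environment satisfies the prerequisite.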

You can ignore the certificate errors and exclude them from logging as follows:

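from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("start-maximized")
# Ignore certificate errors and keep the related messages out of the console
options.add_argument("--ignore-certificate-errors")
options.add_argument("--allow-running-insecure-content")
options.add_experimental_option('excludeSwitches', ['enable-logging'])
driver = webdriver.Chrome(options=options)
driver.get("https://app.mohiguide.com/search/result?type=3&parent=6")
# Wait up to 20 seconds for the first element with class "title" to become visible
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CLASS_NAME, "title"))).get_attribute("data-src"))
driver.quit()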

Console Output:

None

As the first matching element:

<h1 class="title" data-v-37414d58="">Pet villa(西營盤)</h1>

doesn't contain the data-src attribute. Hence, None is printed.
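If the goal is to read data-src from whichever elements actually carry it, one option is to select on the attribute itself instead of the class name. This is only a sketch: `data_src_values` is a hypothetical helper, and whether any element on that page has a data-src attribute is an assumption that has not been verified here.

```python
def data_src_values(driver):
    """Return the data-src value of every element that carries that attribute."""
    # "css selector" is the string behind Selenium's By.CSS_SELECTOR constant,
    # so this sketch needs no Selenium import of its own.
    elements = driver.find_elements("css selector", "[data-src]")
    return [element.get_attribute("data-src") for element in elements]
```

It could be called as `print(data_src_values(driver))` just before `driver.quit()`. An empty list would mean that no element with a data-src attribute was present when the call ran, in which case waiting for the page to finish loading first may help.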

undetected Selenium

You have a problem with certificate checking. You can add arguments to the webdriver options to ignore SSL errors:

    options.add_argument('--ignore-certificate-errors')
    options.add_argument('--allow-running-insecure-content')
Golil