I have an AWS EC2 instance with these characteristics:

  • Platform: Windows
  • Version: Windows_Server-2019-English-Full-Base-2022.07.13
  • Type: t2.micro

I want to build a scraper that runs periodically on this instance. I connect to the instance via RDP, launch the script, and disconnect. After I disconnect, the browser stays as in image 1 (the URL is entered, but the page stays blank and nothing loads), and the console starts printing exceptions like [2]. The odd thing is that the page loads and the scraper runs normally only while I am connected to the instance via RDP.

Could someone tell me what I'm doing wrong, or whether some configuration is missing on the instance?

# undetected_chromedriver == '3.1.5r4'
# selenium == 4.3.0
# python == 3.10.5
# Chrome == Version 104.0.5112.81
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
import time

import undetected_chromedriver.v2 as uc
from selenium.webdriver.common.keys import Keys


LONG_TIME = 30  # maximum number of seconds to wait for each element


def main(driver):
    # Open Google and wait until the search form is present.
    driver.get("https://www.google.com/")
    WebDriverWait(driver, LONG_TIME).until(EC.presence_of_element_located((By.TAG_NAME, "form")))
    # Type the query into the first <input> element found and submit it.
    search_box = WebDriverWait(driver, LONG_TIME).until(EC.presence_of_element_located((By.TAG_NAME, "input")))
    search_box.send_keys('stackoverflow')
    search_box.send_keys(Keys.ENTER)
    # Print the text of every <h3> inside the results container.
    search = WebDriverWait(driver, LONG_TIME).until(EC.presence_of_element_located((By.ID, "search")))
    titles = search.find_elements(By.TAG_NAME, "h3")
    for title in titles:
        print(title.get_attribute('textContent'))


if __name__ == '__main__':
    options = uc.ChromeOptions()
    driver = uc.Chrome(options=options, user_data_dir="C:\\temp\\profile")
    # Run the scrape once a minute; print any exception and keep going.
    while True:
        try:
            main(driver)
        except Exception as e:
            print(e)
        time.sleep(60)

[2]:

Message:
Stacktrace:
Backtrace:
        Ordinal0 [0x012878B3+2193587]
        Ordinal0 [0x01220681+1771137]
        Ordinal0 [0x011341A8+803240]
        Ordinal0 [0x011624A0+992416]
        Ordinal0 [0x0116273B+993083]
        Ordinal0 [0x0118F7C2+1177538]
        Ordinal0 [0x0117D7F4+1103860]
        Ordinal0 [0x0118DAE2+1170146]
        Ordinal0 [0x0117D5C6+1103302]
        Ordinal0 [0x011577E0+948192]
        Ordinal0 [0x011586E6+952038]
        GetHandleVerifier [0x01530CB2+2738370]
        GetHandleVerifier [0x015221B8+2678216]
        GetHandleVerifier [0x013117AA+512954]
        GetHandleVerifier [0x01310856+509030]
        Ordinal0 [0x0122743B+1799227]
        Ordinal0 [0x0122BB68+1817448]
        Ordinal0 [0x0122BC55+1817685]
        Ordinal0 [0x01235230+1856048]
        BaseThreadInitThunk [0x77320419+25]
        RtlGetAppContainerNamedObjectPath [0x778377FD+237]
        RtlGetAppContainerNamedObjectPath [0x778377CD+189]
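
The output in [2] has an empty "Message:" line, so printing the exception alone does not show which kind of exception was raised. A minimal sketch of a more informative retry loop follows, using only the standard library. The helper names `describe_exception` and `run_periodically`, the `max_runs` parameter, and the logger name are hypothetical and not part of the original script. It logs the exception class name and the full Python traceback for each failed run.

```python
import logging
import time
import traceback
from collections import deque

log = logging.getLogger("scraper")  # hypothetical logger name


def describe_exception(exc: BaseException) -> str:
    """Return 'ClassName: first line of message' for an exception."""
    text = str(exc).strip()
    first_line = text.splitlines()[0] if text else "<no message>"
    return f"{type(exc).__name__}: {first_line}"


def run_periodically(task, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call task() repeatedly and log each failure with its class name.

    Runs forever when max_runs is None. Returns the most recent failure
    descriptions (at most 20) once max_runs is reached.
    """
    failures = deque(maxlen=20)
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            task()
        except Exception as exc:
            summary = describe_exception(exc)
            failures.append(summary)
            # format_exc() includes the Python-side traceback of the failure.
            log.error("run %d failed: %s\n%s", runs, summary, traceback.format_exc())
        runs += 1
        sleep(interval_seconds)
    return list(failures)
```

With the script above, the loop in the `__main__` block could be replaced by something like `run_periodically(lambda: main(driver), 60)`, assuming `main` and `driver` are defined as in the question.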
  • I suppose you could use headless mode; see https://stackoverflow.com/questions/53657215/running-selenium-with-headless-chrome-webdriver – justin Aug 09 '22 at 11:38
  • Due to the requirements they gave me, I cannot use headless mode :/ – Andy Ñaca Aug 09 '22 at 16:03