My problem:
I am trying to crawl Google's "People Also Ask" box with Selenium in Python, but I run into a problem when the internet connection is slow. When I click a question to load more questions, Google shows a loader icon with this HTML:
<g-loading-icon jsname="aZ2wEe" class="nhGGkb S3PB2d" style="height: 24px; width: 24px; display: none;"><img height="24" src="//www.gstatic.com/ui/v1/activityindicator/loading_24.gif" width="24" alt="Đang tải..." role="progressbar" data-atf="0" data-frt="0"></g-loading-icon>
Note: the icon is shown after I click for more results in People Also Ask and the connection is slow. When loading completes, the g-loading-icon is hidden again.
From my tests, I think the XPath can change at any time, along with the structure of the Google results page. So I want the code to wait until loading is complete before it crawls, so that it does not fail. If it does not wait for loading to complete, the code raises this error: IndexError: string index out of range.
I don't want to use time.sleep because I don't think a fixed delay is the best way.
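To make the goal concrete, here is a minimal sketch of the behaviour I am after, not a confirmed solution. It assumes the loader can be found by its tag name alone (`g-loading-icon`) and that it stays in the DOM and is only hidden with `display: none`, as in the HTML above. The names `loader_hidden` and `wait_until` are hypothetical helpers of mine. The sketch avoids importing Selenium so it can be run on its own; it relies only on the driver having `find_elements(by, value)` and on elements having `is_displayed()`.

```python
import time

# String value of selenium's By.CSS_SELECTOR, written out so the sketch runs stand-alone.
CSS_SELECTOR = "css selector"


def loader_hidden(driver, selector="g-loading-icon"):
    """Return True when no element matching `selector` is currently displayed."""
    icons = driver.find_elements(CSS_SELECTOR, selector)
    return not any(icon.is_displayed() for icon in icons)


def wait_until(condition, driver, timeout=30.0, poll=0.5):
    """Call condition(driver) repeatedly until it is truthy or `timeout` seconds pass.

    Returns True if the condition was met, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition(driver):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)  # short pause between polls, not one fixed long delay
```

With Selenium itself, I believe the same condition could be passed to an explicit wait, for example `WebDriverWait(driver, 30).until(loader_hidden)`, because `until` calls the given function with the driver. I have not verified this against live Google results.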
Case 1: I tried to catch the loader with XPath. But the XPath changes whenever the structure of the Google results changes.
This is my code for case 1:
def click_more_gpaa(order):
    # Click button question
    # Condition for check load question
    short_timeout = 10  # give enough time for the loading element to appear
    long_timeout = 30  # give enough time for loading to finish
    loading_element_xpath = '/html/body/div[7]/div/div[9]/div[1]/div/div[2]/div[2]/div/div/div[2]/div/div/div[1]/g-loading-icon'
    loading_element_css_selector = 'nhGGkb.S3PB2d'
    try:
        # Case 1: Test with Xpath
        # wait for loading element to appear
        # - required to prevent prematurely checking if element
        #   has disappeared, before it has had a chance to appear
        is_gppa_show = WebDriverWait(driver, short_timeout).until(
            EC.presence_of_element_located((By.XPATH, loading_element_xpath))
        )
        # then wait for the element to disappear
        is_gppa_show = WebDriverWait(driver, long_timeout).until_not(
            EC.presence_of_element_located((By.XPATH, loading_element_xpath)))
    except TimeoutException:
        # if timeout exception was raised - it may be safe to
        # assume loading has finished, however this may not
        # always be the case, use with caution, otherwise handle
        # appropriately.
        pass
Case 2: I tried to catch the loader with a CSS selector, but it does not work.
This is my code for case 2:
def click_more_gpaa(order):
    # Click button question
    # Condition for check load question
    short_timeout = 10  # give enough time for the loading element to appear
    long_timeout = 30  # give enough time for loading to finish
    loading_element_xpath = '/html/body/div[7]/div/div[9]/div[1]/div/div[2]/div[2]/div/div/div[2]/div/div/div[1]/g-loading-icon'
    loading_element_css_selector = 'nhGGkb.S3PB2d'
    try:
        # Case 2: Test with CSS_SELECTOR
        # wait for loading element to appear
        # - required to prevent prematurely checking if element
        #   has disappeared, before it has had a chance to appear
        is_gppa_show = WebDriverWait(driver, short_timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, loading_element_xpath))
        )
        # then wait for the element to disappear
        is_gppa_show = WebDriverWait(driver, long_timeout).until_not(
            EC.presence_of_element_located((By.CSS_SELECTOR, loading_element_xpath)))
    except TimeoutException:
        # if timeout exception was raised - it may be safe to
        # assume loading has finished, however this may not
        # always be the case, use with caution, otherwise handle
        # appropriately.
        pass
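A note on the selector in Case 2: the string 'nhGGkb.S3PB2d' has no leading dot, so as a CSS selector it means a tag named nhGGkb with class S3PB2d, and the code passes the XPath variable to By.CSS_SELECTOR rather than the CSS one. Below is a minimal sketch of how the selector could be built from the tag name and class attribute in the HTML above. `css_from_classes` is a hypothetical helper, and it assumes the class names need no CSS escaping. The class names look auto-generated, so they may change just like the XPath.

```python
def css_from_classes(tag, class_attr):
    """Build a selector like 'tag.a.b' from a tag name and a class attribute value."""
    classes = class_attr.split()
    return tag + "".join("." + name for name in classes)


# Tag name and class attribute copied from the loader HTML in the question.
selector = css_from_classes("g-loading-icon", "nhGGkb S3PB2d")
print(selector)  # g-loading-icon.nhGGkb.S3PB2d
```

Matching on the tag name alone ('g-loading-icon') may be more robust than relying on the class names, but that is an assumption I have not confirmed.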
Is there any way to do this, or a solution for this problem?
Or is there any documentation or link I can read about waits in Selenium with Python?
I tried to research this, but there is a lot of documentation about waits in Selenium and Python, and I got confused.
Thank you so much!