First of all, I want to thank @cruisepandey, who helped me in this topic: How to crawl question and answer of Google People Also Ask with Selenium and Python?

I used his code like this:

    driver = webdriver.Chrome(driver_path)
    driver.maximize_window()
    driver.implicitly_wait(30)
    wait = WebDriverWait(driver, 30)
    
    driver.get("https://www.google.com/search?q=How%20to%20make%20bakery%3F&source=hp&ei=j0aZYYjRAvja2roPrcWcyAU&iflsig=ALs-wAMAAAAAYZlUn4NMUPjfIpQmrXSmjIDnaWjJXWIJ&ved=0ahUKEwjI1JDn0Kf0AhV4rVYBHa0iB1kQ4dUDCAc&uact=5&oq=How%20to%20make%20bakery%3F&gs_lcp=Cgdnd3Mtd2l6EAMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBNQAFgAYJMDaABwAHgAgAF-iAF-kgEDMC4xmAEAoAECoAEB&sclient=gws-wiz")
    
    all_questions = driver.find_elements(By.XPATH, "//span[text()='People also ask']/../following-sibling::div/descendant::div[@data-hveid and @class and @jsname and @data-ved]")
    print(len(all_questions))
    
    j = 1
    for question in all_questions:
        time.sleep(1)
        ele = driver.find_element(By.XPATH, f"(//span[text()='People also ask']/../following-sibling::div/descendant::div[@data-hveid and @class and @jsname and @data-ved])[{j}]")
        j = j + 2
        ele.click()
        time.sleep(1)
        answer = ele.find_element(By.XPATH, ".//../following-sibling::div").get_attribute('innerText')
        print(answer)
        print('--------------')

This code is very helpful, but I have two questions.

  1. After clicking to show an answer, I want to wait until the answer appears instead of using time.sleep(1). How do I find the right class, and what code waits for the answer to appear?
  2. The code has a problem when the internet connection is slow: after a click, the additional results are not displayed yet. I tried waiting with invisibility_of_element_located on the loading icon, but my XPath is not right. Is there a way to do that?

This is my updated version of @cruisepandey's code:

    driver = webdriver.Chrome(driver_path)
    driver.maximize_window()
    driver.implicitly_wait(30)
    wait = WebDriverWait(driver, 30)
    
    driver.get("https://www.google.com/search?q=How%20to%20make%20bakery%3F&source=hp&ei=j0aZYYjRAvja2roPrcWcyAU&iflsig=ALs-wAMAAAAAYZlUn4NMUPjfIpQmrXSmjIDnaWjJXWIJ&ved=0ahUKEwjI1JDn0Kf0AhV4rVYBHa0iB1kQ4dUDCAc&uact=5&oq=How%20to%20make%20bakery%3F&gs_lcp=Cgdnd3Mtd2l6EAMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBNQAFgAYJMDaABwAHgAgAF-iAF-kgEDMC4xmAEAoAECoAEB&sclient=gws-wiz")
    
    all_questions = driver.find_elements(By.XPATH, "//span[text()='People also ask']/../following-sibling::div/descendant::div[@data-hveid and @class and @jsname and @data-ved]")
    print(len(all_questions))
    
    j = 1
    for question in all_questions:
        time.sleep(1)
        ele = driver.find_element(By.XPATH, f"(//span[text()='People also ask']/../following-sibling::div/descendant::div[@data-hveid and @class and @jsname and @data-ved])[{j}]")
        j = j + 2
        ele.click()
    
        # Question 1: To waiting answer show.
        timeout  = 30
        answer_css_class = 'wWOJcd'
        try:
    
            is_question_show = WebDriverWait(driver, timeout).until(
                EC.presence_of_element_located((By.CLASS_NAME, answer_css_class))
            )
        except TimeoutException:
            pass
    
        time.sleep(1)
        answer = ele.find_element(By.XPATH, ".//../following-sibling::div").get_attribute('innerText')
    
        # Question 2: To waiting G loading icon when click more answer with slow internet.
        loading_element_xpath = '/html/body/div[7]/div/div[9]/div[1]/div/div[2]/div[2]/div/div/div[2]/div/div/div[1]/g-loading-icon'
        try:
    
            is_question_show = WebDriverWait(driver, timeout).until(
                EC.presence_of_element_located((By.XPATH, loading_element_xpath))
            )
        except TimeoutException:
            pass
    
        print(answer)
        print('--------------')

So is there a way to use WebDriverWait's until instead of time.sleep in Selenium?

rdas

1 Answer

I do not think you need invisibility_of_element_located to wait for the loading icon.
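If a wait on the loading icon is still wanted on a slow connection, a relative locator is usually less brittle than the absolute XPath in the question. The following is only a minimal sketch: the helper names `loading_icon_gone` and `click_and_wait` are made up for illustration, and the only thing taken from the question is the `g-loading-icon` tag name. The built-in `EC.invisibility_of_element_located((By.TAG_NAME, "g-loading-icon"))` expresses roughly the same idea for the first matching element.

```python
def loading_icon_gone(driver):
    """Wait condition: True once no <g-loading-icon> element is displayed."""
    # "tag name" is the string value of By.TAG_NAME.
    for icon in driver.find_elements("tag name", "g-loading-icon"):
        try:
            if icon.is_displayed():
                return False
        except Exception:
            # Assumption: an icon that cannot be queried any more
            # (for example, it went stale) has already left the page.
            continue
    return True


def click_and_wait(wait, element):
    """Click an element, then block until no loading icon is visible.

    `wait` is expected to be a WebDriverWait instance.
    """
    element.click()
    # Raises TimeoutException if an icon is still visible when the wait expires.
    wait.until(loading_icon_gone)
```

One caveat: if the icon has not appeared yet when the first check runs, the wait returns immediately, so this is best combined with a wait for the content you actually need.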

You can have a separate try-except for question and answer.

Code:

driver = webdriver.Chrome(driver_path)
driver.maximize_window()
wait = WebDriverWait(driver, 30)


driver.get("https://www.google.com/search?q=How%20to%20make%20bakery%3F&source=hp&ei=j0aZYYjRAvja2roPrcWcyAU&iflsig=ALs-wAMAAAAAYZlUn4NMUPjfIpQmrXSmjIDnaWjJXWIJ&ved=0ahUKEwjI1JDn0Kf0AhV4rVYBHa0iB1kQ4dUDCAc&uact=5&oq=How%20to%20make%20bakery%3F&gs_lcp=Cgdnd3Mtd2l6EAMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBNQAFgAYJMDaABwAHgAgAF-iAF-kgEDMC4xmAEAoAECoAEB&sclient=gws-wiz")

all_questions = driver.find_elements(By.XPATH, "//span[text()='People also ask']/../following-sibling::div/descendant::div[@data-hveid and @class and @jsname and @data-ved]")
print(len(all_questions))

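j = 1
for question in all_questions:
    try:
        # Explicit wait instead of time.sleep: block until the j-th question is visible.
        ele = wait.until(EC.visibility_of_element_located((By.XPATH, f"(//span[text()='People also ask']/../following-sibling::div/descendant::div[@data-hveid and @class and @jsname and @data-ved])[{j}]")))
        ele.click()
        j = j + 2
    except Exception:
        print("Could not click on Question link")
        continue  # nothing was clicked, so do not try to read an answer
    try:
        # The answer is the sibling div that follows the question's parent.
        answer = ele.find_element(By.XPATH, ".//../following-sibling::div").get_attribute('innerText')
        print(answer)
    except Exception:
        print("Could not read answer.")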

Imports:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Output:

6
Follow the below-mentioned steps to open a successful bakery business in India in 2021:
Create A Bakery Business Plan. ...
Choose A Location For Your Bakery Business. ...
Get All Licenses Required To Open A Bakery Business In India. ...
Get Manpower Required To Open A Bakery. ...
Buy Equipment Needed To Start A Bakery Business.
More items...

A Detailed Guide On How To Start A Bakery Business In India
https://www.posist.com › Home › Resources
Search for: How do I start my own bakery?
The most profitable bakeries have a gross profit margin of 9%, while the average is much lower at 4%. The growth of profitable bakeries can be as high as 20% year over year. While a large number of bakeries never reach the break-even, a handful of them can even have a net profit margin as high as 12%.06-Jul-2020

How Much Do Bakery Owners Make? | Restaurant Accounting
https://restaurantaccounting.net › how-much-do-bakery-o...
Search for: Do bakeries make money?
Home Bakery Business - How to Start
Decide on the goods to bake. ...
Plan your kitchen space. ...
Get a permit. ...
Talk to a tax agent. ...
Set appropriate prices. ...
Start baking and selling.
16-Feb-2015

Home Bakery Business - How to Start | BakeCalc
http://www.bakecalc.com › blog › how-to-start-a-home-b...
Search for: How do I start a small baking business from home?


The success of any bakery, whether a home-based or commercial operation, hinges largely on the quality of the products. ... Creating a niche for your bakery, such as stunning cakes or unusual pastries, can help set you apart and build a loyal customer base.

What Makes a Home Bakery Business Successful?
https://smallbusiness.chron.com › home-bakery-business-s...
Search for: What makes a bakery successful?

Process finished with exit code 0
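
To wait for the answer text itself rather than sleeping after the click, one option is a custom wait condition, since `WebDriverWait.until` accepts any callable that takes the driver. This is a rough sketch, not tested against the live Google page: `answer_text_present` and `read_answer` are hypothetical helper names, and the relative XPath is the same one used in the code above.

```python
# Same relative XPath the loop uses to get from a question to its answer.
ANSWER_XPATH = ".//../following-sibling::div"


def answer_text_present(question):
    """Build a wait condition that returns the answer text once it is non-empty."""
    def _condition(driver):
        try:
            # "xpath" is the string value of By.XPATH.
            answer = question.find_element("xpath", ANSWER_XPATH)
            text = (answer.get_attribute("innerText") or "").strip()
        except Exception:
            # Any lookup failure is treated as "not ready yet".
            return False
        # A falsy result makes WebDriverWait poll again.
        return text or False
    return _condition


def read_answer(wait, question):
    """Click a question and return its answer text.

    `wait` is expected to be a WebDriverWait instance; its until() raises
    TimeoutException if no text shows up in time.
    """
    question.click()
    return wait.until(answer_text_present(question))
```

Inside the loop this would replace the second try block, for example `answer = read_answer(wait, ele)` without the separate `ele.click()` call.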
cruisepandey
  • Yes! Thanks, @cruisepandey. I will try that and read more about visibility_of_element_located. Thank you so much. Is there any post or documentation about writing XPath? Your XPath expressions look very professional. – nguyenphanhoaiduc Nov 21 '21 at 12:19
  • Please use this link to learn more about XPath: https://www.w3schools.com/xml/xpath_axes.asp – cruisepandey Nov 21 '21 at 12:21