1

I want to scrape data from the following link. However, when I used Beautiful Soup I could not locate the data in the HTML. I then tried Selenium, but I could not locate the content there either. Do you know how I could use Selenium or Beautiful Soup to get the Zestimate of "This home" for January of every year from 2015 to 2020? Thanks in advance for your help. I am using Python.

https://www.zillow.com/homedetails/1954-Sandy-Point-Ln-Mount-Pleasant-SC-29466/10938706_zpid/

[Image: screenshot of the Zestimate graph]

Neel Mehta

2 Answers

0

Try the code below; it prints the current Zestimate for the home.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import time

options = Options()
user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36'
options.add_argument('user-agent={0}'.format(user_agent))

driver = webdriver.Chrome(options=options)
wait = WebDriverWait(driver, 20)
action = ActionChains(driver)

driver.get("https://www.zillow.com/homedetails/1954-Sandy-Point-Ln-Mount-Pleasant-SC-29466/10938706_zpid/")

Home_Value = wait.until(EC.presence_of_element_located((By.XPATH, "//a[text()='Home value']")))
action.move_to_element(Home_Value).click().perform()

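# find_element_by_xpath was removed in Selenium 4; wait for the element and locate it with By.XPATH instead
Zestimate = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="ds-home-values"]/div/div[1]/div/div[1]/div/div/p'))).text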

print(Zestimate)

Regarding "January of every year from 2015-2020": this script only reads the current Zestimate, not past values. You can run the same script manually each January to get the latest Zestimate, or schedule it with a cron job, but I am not sure how to set that up.
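A minimal sketch of how a scheduled run could keep a history, assuming the script is run once per period and the scraped text is passed in as a string. The function name `append_reading`, the file name, and the column names are hypothetical and not part of the answer above. This only collects values from now on; it does not recover the 2015-2020 figures from the graph.

```python
import csv
from datetime import date
from pathlib import Path


def append_reading(csv_path, zestimate_text, on_date=None):
    """Append one (date, zestimate) row to a CSV file.

    A header row is written first if the file is missing or empty.
    `zestimate_text` is stored as scraped, with surrounding whitespace removed.
    """
    on_date = on_date or date.today()
    path = Path(csv_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["date", "zestimate"])
        writer.writerow([on_date.isoformat(), zestimate_text.strip()])
```

At the end of the Selenium script this could be called as `append_reading("zestimates.csv", Zestimate)`. On Linux or macOS, a crontab entry such as `0 9 1 1 * python3 /path/to/script.py` (the path is a placeholder) would run the script at 09:00 on 1 January each year.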

P.S. After running this script about 3-4 times, I am now facing a CAPTCHA. There is a good explanation available at THIS link

Swaroop Humane
  • I am also getting a CAPTCHA. Even after reading the link, I am still not able to get past it. Do you know how to bypass the CAPTCHA in this case? @Swaroop Humane – Neel Mehta Aug 17 '20 at 16:54
0

To extract the Zestimate, i.e. Zestimate®: $4,232,581, you have to induce WebDriverWait for visibility_of_element_located(), and you can use the following locator strategy:

  • Using XPATH:

    driver.get('https://www.zillow.com/homedetails/1954-Sandy-Point-Ln-Mount-Pleasant-SC-29466/10938706_zpid/')
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[contains(., 'For sale')]//following::span[contains(@class, 'ds-dashed-underline') and contains(., 'Zestimate')]//ancestor::span[2]"))).text)
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
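The text returned by the locator above is a string such as `Zestimate®: $4,232,581`. A small sketch of turning it into a number, assuming the value is always written as a dollar sign followed by comma-grouped digits; the helper name `parse_zestimate` is hypothetical:

```python
import re


def parse_zestimate(text):
    """Return the first dollar amount in `text` as an int, or None if there is none."""
    # A dollar sign, optional whitespace, then digits with optional thousands separators.
    match = re.search(r"\$\s*(\d[\d,]*)", text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


value = parse_zestimate("Zestimate®: $4,232,581")
print(value)
```

Returning `None` instead of raising keeps the caller in control when the page shows no Zestimate or serves a CAPTCHA page instead.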
    
undetected Selenium
  • Thanks for your help. However, I want the Zestimate from the graph for the years 2015-2020, not the current Zestimate. For example, in the image I provided, the number for January 2015 should be $3.2. Do you know how to get this data? – Neel Mehta Aug 17 '20 at 16:59