I'm trying to extract a name and contact number from a div. The div sometimes contains one span, sometimes two, and sometimes three. What I need:

  • I only need name and contact number if available
  • If the name is missing but the contact number is available, the name variable should be assigned 'N/A'
  • If the contact number is missing but the name is available, the contact number variable should be assigned 'N/A'

This is what I have so far:

import re
import time

import bs4
from selenium import webdriver
from selenium.webdriver.common.by import By

# Swap in url_1 or url_2 below to see how the other cases behave.
url = "https://www.zillow.com/homedetails/19442-185th-Ave-SE-Renton-WA-98058/54831221_zpid/"
# url_1 = "https://www.zillow.com/homedetails/20713-61st-St-E-Bonney-Lake-WA-98391/99371104_zpid/"
# url_2 = "https://www.zillow.com/homes/fsbo/house_type/121319389_zpid/globalrelevanceex_sort/47.465758,-122.259207,47.404798,-122.398424_rect/12_zm/5f9305c92cX1-CRbri51bo8epha_yly1g_crid/0_mmm/"
browser = webdriver.Firefox()
browser.get(url)
time.sleep(5)  # crude wait for the page to finish rendering

soup = bs4.BeautifulSoup(browser.page_source, 'html.parser')

contacts = browser.find_elements(By.CSS_SELECTOR, "span.listing-field")
contact_name = []
contact_phone = "N/A"
contact_web = "N/A"

for contact in contacts:
    links = contact.find_elements(By.TAG_NAME, "a")
    if links:
        # A span that contains a link holds the contact's website.
        contact_web = links[0].get_attribute("href")
    elif re.search(r"\(\d+\)\s+\d+-\d+", contact.text):
        # Text shaped like "(253) 335-8690" is the phone number.
        contact_phone = contact.text
    else:
        contact_name.append(contact.text)

print(contact_phone)  # Output: (253) 335-8690
print(contact_name)   # Output: ['Sheetal Datta']

2 Answers

Welcome to Stack Overflow! You should approach this problem programmatically, namely with conditional checks. As you already noted:

if the name exists and the contact number exists,
    use them
else if the name exists only,
    use the name and assign the contact number as 'N/A'
else if the contact number exists only,
    use the contact number and assign the name as 'N/A'

As you can see, the pseudo-code above maps directly onto Python's if-elif-else statements. Depending on how the webpage is structured, you'll want to check that the spans exist before you try to read values from them, which you can do following this SO post.
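
As a rough illustration, the pseudo-code could be written as below. This is only a sketch: `resolve_contact` is a hypothetical helper name, and it assumes a missing span has already been turned into `None` (or an empty string) before the call. It also adds a final branch for the case where neither value is present, which the pseudo-code does not spell out.

```python
def resolve_contact(name=None, phone=None):
    """Return (name, phone), substituting 'N/A' for whichever is missing."""
    if name and phone:
        # Both values exist: use them as they are.
        return name, phone
    elif name:
        # Only the name exists.
        return name, "N/A"
    elif phone:
        # Only the contact number exists.
        return "N/A", phone
    else:
        # Neither exists (assumed fallback, not in the pseudo-code).
        return "N/A", "N/A"


print(resolve_contact("Sheetal Datta", "(253) 335-8690"))
print(resolve_contact(None, "(253) 335-8690"))
```

The second call covers the case from the question where only the contact number is present.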

natn2323

You can use try/except to check whether the contact name and phone number are present, and assign 'N/A' when they are not. See the code:

from bs4 import BeautifulSoup
from selenium import webdriver
import time

url = ('https://www.zillow.com/homedetails/19442-185th-Ave-SE-Renton-WA-'
'98058/54831221_zpid/')

browser = webdriver.Firefox()
browser.get(url)
time.sleep(5)
soup = BeautifulSoup(browser.page_source,'html.parser')
browser.quit()
tag = soup.find('div',attrs={
    'class':'home-details-listing-provided-by zsg-content-section'})

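try:
    contact_name = tag.find('span',attrs={
        'class':'listing-field'}).text
except AttributeError:  # the div or the span is missing (find returned None)
    contact_name = 'N/A'

try:
    contact_phone = tag.find('span',attrs={
        'class':'listing-field'}).find_next('span').text
except AttributeError:  # there is no following span
    contact_phone = 'N/A'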


print('Contact Name: {}\nContact Phone: {}'.format(
    contact_name,contact_phone))

Output:

Contact Name: Sheetal Datta
Contact Phone: (253) 335-8690
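
One caveat: looking the spans up by position assumes the name is always the first span, so on a listing that has only a phone number, the number would end up in contact_name. A possible way around that is to classify each span's text by pattern instead, reusing the phone regex from the question. The sketch below is only an illustration: `split_contact` is a hypothetical helper, and it assumes you pass it the texts of the `span.listing-field` elements with any link-only spans already filtered out.

```python
import re

# Same phone pattern as in the question, e.g. "(253) 335-8690".
PHONE_RE = re.compile(r"\(\d+\)\s+\d+-\d+")


def split_contact(texts):
    """Return (name, phone) from span texts, using 'N/A' for missing parts."""
    name, phone = "N/A", "N/A"
    for text in texts:
        text = text.strip()
        if not text:
            continue  # skip empty spans
        if PHONE_RE.search(text):
            phone = text
        elif name == "N/A":
            name = text  # the first non-phone text is taken as the name
    return name, phone


print(split_contact(["Sheetal Datta", "(253) 335-8690"]))
print(split_contact(["(253) 335-8690"]))
print(split_contact([]))
```

With BeautifulSoup, the input list could be built from the span texts found inside the div, so the order and number of spans no longer matter.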
Sohan Das