I'm building a real estate web scraper and I'm having problems when a certain index doesn't exist in the HTML.

How can I fix this? This is the code that causes the problem:

info_extra = container.find_all('div', class_="info-right text-xs-right")[0].text

I'm new to web-scraping so I'm kinda lost.

Thanks!

    Since you only need the first item, use `find` rather than `find_all`. It will just leave the value empty if nothing is found. – RJ Adriaansen Oct 05 '21 at 17:32
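
A note on the comment above: `find` returns `None` when nothing matches, so calling `.text` on the result directly would still fail (with an `AttributeError` instead of an `IndexError`). A minimal sketch of the check, assuming a made-up HTML snippet that has no matching `div`:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; it contains no "info-right" div.
html = '<div class="listing"><span class="price">100</span></div>'
container = BeautifulSoup(html, "html.parser")

match = container.find("div", class_="info-right text-xs-right")

# find() gives back None when nothing matches, so test before reading .text
info_extra = match.text if match is not None else ""
```

Here `info_extra` ends up as an empty string; pick whatever default suits your data.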

3 Answers

I'm new to web scraping too, and most of my problems come up when I ask for an element that doesn't exist on the page.

Have you tried a try/except block?

try:
    info_extra = container.find_all('div', class_="info-right text-xs-right")[0].text
except Exception as e:
    raise

https://docs.python.org/3/tutorial/errors.html

Good luck

Paul
  • Don't `raise`, or your catch is useless. Leave a message so you know there was an error, like `print('oops the result is empty')`. That way your script won't break, whereas `raise` will break it. – diggusbickus Oct 05 '21 at 17:47
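
Putting the answer and the comment together, one possible version catches only `IndexError` (the error an empty result list produces) and falls back to a default instead of re-raising. This is a sketch under assumptions: the HTML snippet and the message text are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical listing markup without the "info-right" div.
html = '<div class="listing"><h2>Flat</h2></div>'
container = BeautifulSoup(html, "html.parser")

try:
    info_extra = container.find_all("div", class_="info-right text-xs-right")[0].text
except IndexError:
    # find_all returned an empty list, so [0] failed; report and carry on
    print("no 'info-right' div in this listing")
    info_extra = None
```

Catching the specific exception keeps unrelated bugs (a typo in a variable name, say) from being silently swallowed.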

One general way is to check the length before you attempt to access the index.

divs = container.find_all('div', class_="info-right text-xs-right")
if len(divs) > 0:
   info_extra = divs[0].text
else:
   info_extra = None

You can simplify this further, because an empty list is falsy:

divs = container.find_all('div', class_="info-right text-xs-right")
if divs:
   info_extra = divs[0].text
else:
   info_extra = None

You can simplify even further by using the walrus operator `:=` (Python 3.8+):

if (divs := container.find_all('div', class_="info-right text-xs-right")):
   info_extra = divs[0].text
else:
   info_extra = None

Or all in one line:

info_extra = divs[0].text if (divs := container.find_all('div', class_="info-right text-xs-right")) else None
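
If a scraper needs this check for many fields, the same pattern could be wrapped in a small helper. The function name `first_text`, its parameters, and the sample HTML below are all hypothetical, not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup


def first_text(container, name, class_, default=None):
    """Return the text of the first matching tag, or default if none match."""
    matches = container.find_all(name, class_=class_)
    # An empty result list is falsy, so the index is only used when it is safe.
    return matches[0].text if matches else default


# Made-up markup: the second card lacks the "info-right" div.
html = """
<div class="card"><div class="info-right text-xs-right">3 rooms</div></div>
<div class="card"></div>
"""
soup = BeautifulSoup(html, "html.parser")
cards = soup.find_all("div", class_="card")

extras = [first_text(card, "div", "info-right text-xs-right") for card in cards]
```

With this input, `extras` holds the text for the first card and `None` for the second.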

Forensic_07

First of all, you should always check data before doing anything with it.
If your selector should match just one element on the page:

info_extra_element = container.select_one('div.info-right.text-xs-right')

if info_extra_element:
    info_extra = info_extra_element.text
else:
    # Unexpected case: the selector matched nothing.
    # Report it and fall back to a default so your program doesn't crash.
    print("selector couldn't be found on the page")
    info_extra = ''

If a list of elements matches your selector:

info_extra_elements = container.select('div.info-right.text-xs-right')
info_extra_texts = []

for element in info_extra_elements:
    info_extra_texts.append(element.text)

PS.
Based on this answer, it's good practice to use a CSS selector when you want to filter by class.
The `find` method can be used when you just want to filter by element tag.
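
One reason the CSS selector is more robust for multi-class elements: a `class_` string containing a space is compared against the exact value of the `class` attribute, so a different class order is not matched, while a selector like `div.a.b` matches regardless of order. A small sketch with made-up HTML:

```python
from bs4 import BeautifulSoup

# Two hypothetical divs with the same classes in a different order.
html = (
    '<div class="text-xs-right info-right">A</div>'
    '<div class="info-right text-xs-right">B</div>'
)
soup = BeautifulSoup(html, "html.parser")

# Exact-string match on the class attribute: only the second div qualifies.
by_string = [d.text for d in soup.find_all("div", class_="info-right text-xs-right")]

# CSS selector: both divs carry both classes, so both qualify.
by_selector = [d.text for d in soup.select("div.info-right.text-xs-right")]
```

Here `by_string` is `['B']` and `by_selector` is `['A', 'B']`.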