0

I'm still new to Python and I wrote some code to help me go through some online listings.

I had to add some error handling because the program would crash whenever an attribute of the listing wasn't found.

If I try to use pass or continue I just get stuck in an infinite loop, as expected.

I feel like I've written myself into a corner and I just can't seem to find a solution. The solutions I found I could not figure out and most were for other languages.

How can I make this work so that the loop doesn't skip over all of the other attributes once an error is found?

EDIT: I think my post was unclear on that point, my apologies. What happens is this: If the element of interest in a listing is not found, the other elements are skipped over. So if the listing has no owner name specified (the first element or attribute), the whole listing gets ignored. It continues on to the next listing. Any idea how I could fix that?

Here's the part of the code:

#iterate through the results according to user input earlier
i = 0
while (i < number_of_results):

    Result_page = driver.current_url
#define elements of the listing of interest
    stran = requests.get(driver.current_url)
    soup = BeautifulSoup(stran.content, 'html.parser')
    ime = soup.find("dd", itemprop="name")
    ulica = soup.find("dd", itemprop="streetAddress")
    postna_stevilka = soup.find("span", itemprop="postalCode")
    kraj = soup.find("span", itemprop="addressLocality")
    tel = soup.find("dd", itemprop="telephone")
    spletna_stran = soup.find("dd", itemprop="url") 
    mobil = soup.find("a", itemprop="telephone")
    
    try:
        print(ime.text)
        c1 = sheet.cell(row=i+1, column=1)
        c1.value = ime.text
        print(ulica.text)
        c1 = sheet.cell(row=i+1, column=2)
        c1.value = ulica.text
        print(postna_stevilka.text)
        c1 = sheet.cell(row=i+1, column=3)
        c1.value = postna_stevilka.text
        print(kraj.text)
        c1 = sheet.cell(row=i+1, column=4)
        c1.value = kraj.text
        print(tel.text)
        c1 = sheet.cell(row=i+1, column=5)
        c1.value = tel.text
#print(mobil.text) does not work, cut out to prevent error
        print(spletna_stran.text)
        c1 = sheet.cell(row=i+1, column=6)
        c1.value = spletna_stran.text
        
        
#catch the error when an entry isn't there      
    except AttributeError:
        print("No such entry.")
    
       
        

    next_entry = driver.find_element_by_xpath("/html/body/main/chousingdetail/div/div[2]/div[1]/nav/div/div[2]/a[2]/i")
    next_entry.click()
    i +=1
Coercer_
  • 1
    I think **continue** should come after **except**, and **finally** should be followed by the code that has to be executed regardless of the exception: https://stackoverflow.com/questions/10544928/python-using-continue-in-a-try-finally-statement-in-a-loop#:~:text=If%20a%20continue%20statement%20is,on%20to%20the%20next%20iteration. –  Nov 04 '20 at 14:00
  • 1
    The `finally` block runs no matter what, and your `finally` block contains `continue`, which prevents the code below it from running. I would suggest moving everything below `continue` into the `finally` block. That should work. – Amit Kumar Nov 04 '20 at 14:13
  • 1
    Or remove `continue` from the `finally` block. – Amit Kumar Nov 04 '20 at 14:17
  • @AmitKumar My original code did not have either 'continue' or the 'finally' clauses, those were my attempts to get the loop to return to the attributes and extract the other information. I think my post was unclear on that point, my apologies. What happens is this: If the element of interest in a listing is not found, the other elements are skipped over. So if the listing has no owner name specified (the first element or attribute), the whole listing gets ignored. It continues on to the next listing. Any idea how I could fix that? – Coercer_ Nov 05 '20 at 07:19
  • @CYREX Thank you for the link to that thread! I think I was unclear on my problem, I've edited my post accordingly. My code originally did not have those clauses. – Coercer_ Nov 05 '20 at 07:22

2 Answers

1

If I understand correctly what you're trying to do, you should not be using try...except like that.

As soon as a line in the try block raises an exception, execution jumps to the except block. It will not "try" the rest of the lines. So, if you want all the elements to be checked regardless of any one of them failing, you'd need to put each of them in its own try...except block. For example,

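try:
    print(ime.text)
    c1 = sheet.cell(row=i+1, column=1)
    c1.value = ime.text
except AttributeError:
    # ime is None because soup.find() found no match; skip this field
    pass

try:
    print(ulica.text)
    c1 = sheet.cell(row=i+1, column=2)
    c1.value = ulica.text
except AttributeError:
    # same for the street address
    pass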

and so on. This way, a missing value will be handled, and the script will just move to the next element.

However, here's how I'd prefer to do it: because bs4.BeautifulSoup.find() returns None if it doesn't find anything, you could use:

ime = soup.find("dd", itemprop="name")
if ime is not None:
    print(ime.text)
    c1 = sheet.cell(row=i+1, column=1)
    c1.value = ime.text

and so on. I'd even wrap those lines up in a function since they're almost the same for each element. (In fact, there are a few improvements I could suggest to your code, but maybe that's for another discussion; I'll stick to the question for now!)
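
A minimal sketch of what such a function could look like. The name `write_if_found` is hypothetical, and `FakeSheet` is only a stand-in so the example runs without a workbook or a web page. It assumes the real `sheet` is an openpyxl worksheet and that each element is whatever `soup.find()` returned, either a tag with a `.text` attribute or `None`. The sample values are made up.

```python
from types import SimpleNamespace


def write_if_found(sheet, row, column, element):
    """Write element.text into the given cell; skip elements that were not found."""
    if element is None:  # soup.find() returns None when nothing matches
        return False
    sheet.cell(row=row, column=column).value = element.text
    return True


class FakeSheet:
    """Stand-in for an openpyxl worksheet (assumption, for demonstration only)."""

    def __init__(self):
        self.cells = {}

    def cell(self, row, column):
        # Create the cell on first access, like a worksheet would
        return self.cells.setdefault((row, column), SimpleNamespace(value=None))


sheet = FakeSheet()
# Made-up results of soup.find(); the second element was not found
found = [SimpleNamespace(text="Example Owner"), None, SimpleNamespace(text="1000")]

for column, element in enumerate(found, start=1):
    write_if_found(sheet, row=1, column=column, element=element)
```

With a helper like this, the body of the loop in the question would shrink to one call per attribute, and a missing attribute only leaves its own cell empty instead of skipping the rest of the listing.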

Ratler
  • Thank you. Ugh, the solution is so obvious now. I guess I was too focused on keeping as much of my code the way it is as possible. Your preferred way to do it looks very slick. For some reason I completely forgot about `if`. I'll try this out right now. :) – Coercer_ Nov 05 '20 at 08:05
  • I've just used your method and it works great! I've marked this as the answer as it solves my problem. I'm sorry to ask this of you, but I would also be very grateful for any suggestions to my code as I'm very clumsy with Python... – Coercer_ Nov 05 '20 at 08:20
  • 1
    @Coercer_, this is not the best forum for this, but briefly: 1) wrap up repeated steps in a function to reduce the code and make it easier to maintain; e.g., `get_item_value(col)`. 2) Where does `number_of_results` come from? Consider ending the loop when "next" can't be found; makes code more robust. 3) Looks like you want a table. A natural structure for that is a list `[row_1 ... row_n]`, where `row_i = { ime_i.attrs["itemprop"]: ime_i.text, ... etc.}`. That list can readily be exported into a CSV/spreadsheet with csv.DictWriter. – Ratler Nov 05 '20 at 09:19
  • Thank you once again. The `number_of_results` comes from code before this part of the code. The user is asked for a postal code and number of listings to go through. I didn't post all of it as I thought it wise to focus on the parts giving me trouble. Then the code runs a search on the site in question. Yeah, I'm building a table with the relevant info from the listings. I'm using openpyxl to write the info into cells and then save a workbook. – Coercer_ Nov 05 '20 at 11:14
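
The third suggestion in the comments above (one dict per listing, exported with `csv.DictWriter`) could be sketched roughly as follows. This is only an illustration under assumptions: the function name `rows_to_csv` is hypothetical, the column names are taken from the `itemprop` values in the question, the sample rows are made up, and the output goes to an in-memory buffer rather than a file so the example is self-contained.

```python
import csv
import io

# Column order, taken from the itemprop values used in the question
FIELDS = ["name", "streetAddress", "postalCode", "addressLocality", "telephone", "url"]


def rows_to_csv(rows, fieldnames=FIELDS):
    """Serialize a list of dicts to CSV text; missing keys become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up listings: each dict only contains the attributes that were found
rows = [
    {"name": "Example Owner", "postalCode": "1000"},
    {"name": "Another Owner", "telephone": "000 000 000"},
]

csv_text = rows_to_csv(rows)
print(csv_text)
```

Because `restval=""` fills in any key a row lacks, a listing with missing attributes still produces a complete line, which sidesteps the original problem of one missing attribute affecting the others. To write a real file instead, the same writer could be given a file opened with `open(path, "w", newline="")`.
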
-1

Copy and paste the last three lines into the finally block.

Shadowcoder
  • Thank you for your answer. I tried it and it does work. However, it does not solve my problem with the attributes getting skipped over as soon as an attribute is not found. I've edited my post to clarify. – Coercer_ Nov 05 '20 at 07:23