
I have a list of links that I'm trying to scrape the HTML text from. It's a long list (titled annoying) and some of the links appear to be faulty. I'd like my code to ignore any link that produces an error and continue on down the list. I'm new to this, so any help is appreciated.

I attempted to use this answer, catch specific HTTP error in python, but I'm stuck on how to make my code move on to the next item in the list.

Here is my current code:

maybe1=[]

from bs4 import BeautifulSoup
import urllib.request
import urllib

try:
    for i in annoying:
        resp=urllib.request.urlopen(i)
        soup=BeautifulSoup(resp, 'lxml').encode('utf-8')

        maybe1.append(soup)

except urllib.error.HTTPError as err:
    skip=True

Thanks much!

kaci155

1 Answer

Just put the try/except inside the loop, so a failure only skips the current iteration:

from bs4 import BeautifulSoup
import urllib.request
import urllib.error

annoying_links = ['link1', 'link2']
maybe1 = []
for link in annoying_links:
    try:
        resp = urllib.request.urlopen(link)
        soup = BeautifulSoup(resp, 'lxml').encode('utf-8')
        maybe1.append(soup)
    except urllib.error.HTTPError:
        # a bad link is reported and skipped; the loop keeps going
        print('Skipped:', link)
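One thing to watch for: HTTPError only covers responses with a bad status code (404, 500, ...). A link with an unreachable or misspelled hostname raises URLError instead, which is the parent class of HTTPError, so catching URLError handles both cases in one clause. A minimal offline sketch of the pattern, using a hypothetical `fetch` stand-in for `urllib.request.urlopen` so it runs without a network:

```python
import urllib.error

# Hypothetical stand-in for urllib.request.urlopen: raises HTTPError
# for one link and URLError for another, so the sketch runs offline.
def fetch(link):
    if link == 'bad-status':
        raise urllib.error.HTTPError(link, 404, 'Not Found', hdrs=None, fp=None)
    if link == 'bad-host':
        raise urllib.error.URLError('name resolution failed')
    return '<html>ok</html>'

links = ['good', 'bad-status', 'bad-host']
collected = []
for link in links:
    try:
        collected.append(fetch(link))
    except urllib.error.URLError as err:
        # URLError is the parent of HTTPError, so both failures land here
        print('Skipped:', link, '-', err)

print(collected)  # only the good link's content survives
```

If you swap `urllib.error.HTTPError` for `urllib.error.URLError` in the loop above, faulty hostnames in your list get skipped too instead of crashing the run.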
grapes