150

I am using BeautifulSoup to parse a number of HTML pages.

In a for loop, I extract a particular piece of data from each page and append it to a list.

The problem is that some of the pages have a different format and don't contain the data I want.

So I was trying to use exception handling to add the value null to the list instead. (I need to do this because the order of the data matters.)

For instance, I have code like this:

soup = BeautifulSoup(links)
dlist = soup.findAll('dd', 'title')
# find the content between <dd class='title'> and </dd>
gotdata = dlist[1]
# what I want is the second of those matches
newlist.append(gotdata)
# then I add it to newlist

Some of the links don't have any <dd class='title'>, so what I want to do is add the string null to the list instead.

The error appears:

list index out of range.

What I have tried is adding some lines like this:

if not dlist[1]:  
   newlist.append('null')
   continue

But it doesn't work. It still shows the error:

list index out of range.

What should I do about this? Should I use exception handling, or is there an easier way?

Any suggestions? Any help would be really great!

Adi
H.Choi

6 Answers

347

Handling the exception is the way to go:

try:
    gotdata = dlist[1]
except IndexError:
    gotdata = 'null'

Of course, you could also check the len() of dlist, but handling the exception is more intuitive.
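A minimal sketch of how this might fit into the asker's loop. The `pages` list is a hypothetical stand-in for the per-page results of `soup.findAll('dd', 'title')`; plain lists are used so that the example runs without BeautifulSoup.

```python
# Hypothetical stand-in for the findAll results of three different pages.
pages = [
    ['first', 'second'],  # page with the expected format
    [],                   # page with no <dd class='title'> at all
    ['only one'],         # page with a single match
]

newlist = []
for dlist in pages:
    try:
        gotdata = dlist[1]
    except IndexError:
        # No second element: append a placeholder so positions stay aligned.
        gotdata = 'null'
    newlist.append(gotdata)

print(newlist)  # ['second', 'null', 'null']
```

Because the append happens after the try/except, every page contributes exactly one entry, which keeps the order of the results intact.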

ThiefMaster
47

You have two options; either handle the exception or test the length:

if len(dlist) > 1:
    newlist.append(dlist[1])
    continue

or

try:
    newlist.append(dlist[1])
except IndexError:
    pass
continue

Use the first if the second item is often missing, and the second if it is only occasionally missing (a try block costs very little unless the exception is actually raised).
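The length test works where the asker's `if not dlist[1]` did not, because `dlist[1]` is evaluated before `not` is applied, so the IndexError is raised first. A small sketch of the difference, using a hypothetical empty `dlist`:

```python
dlist = []  # hypothetical result for a page with no matches

# Indexing is evaluated first, so this raises before `not` is ever applied.
try:
    if not dlist[1]:
        pass
    raised = False
except IndexError:
    raised = True

# Testing the length never indexes past the end of the list.
has_second = len(dlist) > 1

print(raised, has_second)  # True False
```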

Martijn Pieters
33

A ternary will suffice. Change:

gotdata = dlist[1]

to

gotdata = dlist[1] if len(dlist) > 1 else 'null'

This is a shorter way of expressing:

if len(dlist) > 1:
    gotdata = dlist[1]
else: 
    gotdata = 'null'
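
If the same check is needed in several places, the conditional expression could be wrapped in a small helper. This is only a sketch: `item_or_default` is a hypothetical name, not part of BeautifulSoup or the standard library, and it assumes a non-negative index.

```python
def item_or_default(seq, index, default='null'):
    """Return seq[index] if it exists, otherwise default.

    Hypothetical helper; assumes index is non-negative.
    """
    return seq[index] if index < len(seq) else default


print(item_or_default(['first', 'second'], 1))  # second
print(item_or_default([], 1))                   # null
```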
Ryan Haining
5

For anyone interested in a shorter way:

gotdata = len(dlist)>1 and dlist[1] or 'null'

But for best performance, I suggest using False instead of 'null'; then a one-line test will suffice:

gotdata = len(dlist)>1 and dlist[1]
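
One caveat with the `and`/`or` form is that it also falls back to the default when the second item exists but is falsy. A short sketch of the difference, assuming a hypothetical `dlist` whose second item is an empty string:

```python
dlist = ['first', '']  # hypothetical: the second item exists but is empty

# and/or idiom: the empty string is falsy, so the fallback wins by mistake.
via_and_or = len(dlist) > 1 and dlist[1] or 'null'

# Conditional expression: only the length decides which branch is taken.
via_ternary = dlist[1] if len(dlist) > 1 else 'null'

print(repr(via_and_or), repr(via_ternary))  # 'null' ''
```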
Lorraine
Benamar
    This style was common before Python introduced conditional expressions (`v = a if condition else b`), but it has been out of style for many years. – Ryan Haining Oct 22 '21 at 16:37
3

Building on ThiefMaster's answer: sometimes the value is '\n' or null and handling it raises an error, in which case ValueError needs to be caught as well:

Handling the exception is the way to go

try:
    gotdata = dlist[1]
except (IndexError, ValueError):
    gotdata = 'null'
alecbz
0
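# dlist is the list from the question (the original used the name `list`,
# which shadows the built-in type)
for i in range(1, len(dlist)):
    try:
        print(dlist[i])
    except ValueError:
        print("Value error.")
    except IndexError:
        print("Index error.")
    except Exception:
        # catch anything else without swallowing KeyboardInterrupt
        print("Error.")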
taskinoor
Gouled Med