I am using BeautifulSoup to parse some HTML pages.
I extract a certain piece of data from each page (in a for loop) and append it to a list.
The problem is that some of the pages have a different format and don't contain the data I want.
So I was trying to use exception handling to add the value null to the list instead (I need to do this because the order of the data is important).
For instance, I have code like this:
soup = BeautifulSoup(links)
# find the content between <dd class='title'> and </dd>
dlist = soup.findAll('dd', 'title')
# what I want is the 2nd of those
gotdata = dlist[1]
# add it to newlist
newlist.append(gotdata)
Some of the links don't have any <dd class='title'>, so what I want to do is add the string null to the list instead.
The error I get is:
list index out of range.
What I have tried is to add some lines like this:
if not dlist[1]:
    newlist.append('null')
    continue
But it doesn't work. It still shows the same error:
list index out of range.
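To narrow it down, here is a minimal sketch of what I think is happening. A plain empty list stands in for what soup.findAll would return on a page with no matching tags (that substitution is my assumption, made only so the snippet runs without BeautifulSoup). It suggests that dlist[1] is evaluated before `not` is applied, so the check itself raises the error:

```python
# A plain empty list stands in for soup.findAll('dd', 'title') on a page
# that has no matching tags (assumption for illustration).
dlist = []

try:
    # dlist[1] is evaluated first, so the IndexError is raised
    # before `not` ever gets a value to test.
    if not dlist[1]:
        result = 'null'
except IndexError as exc:
    result = 'raised: ' + str(exc)

print(result)
```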
What should I do about this? Should I use exception handling, or is there an easier way?
Any suggestions would be a great help!
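For reference, these are the two approaches I am weighing, sketched with plain lists standing in for the findAll results. The helper names second_by_length and second_by_except are hypothetical, and the sample strings are made up for illustration:

```python
def second_by_length(dlist, default='null'):
    """Return the 2nd item if the list is long enough, else the default."""
    return dlist[1] if len(dlist) > 1 else default


def second_by_except(dlist, default='null'):
    """Return the 2nd item, falling back to the default on IndexError."""
    try:
        return dlist[1]
    except IndexError:
        return default


# Plain lists stand in for the findAll results of three pages (made-up data).
pages = [['a', 'b'], [], ['only one']]

# Every page contributes exactly one entry, so the order is preserved.
newlist = [second_by_length(d) for d in pages]
print(newlist)
```

Both helpers should give the same result on these inputs; the length check avoids raising at all, while the try/except version handles the failure after the fact.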