I recently started working on a Python program that lets the user conjugate any verb easily. To do this, I am using the urllib module to open the corresponding conjugation web page. For example, the page for the verb "beber" is http://www.spanishdict.com/conjugate/beber.
To open the page, I use the following Python code:
source = urllib.urlopen("http://www.spanishdict.com/conjugate/beber").read()
This source does contain the information I want to parse. But when I make a BeautifulSoup object out of it like this:
soup = BeautifulSoup(source)
I appear to lose all the information I want to parse. The information lost when making the BeautifulSoup object usually looks something like this:
<tr>
<td class="verb-pronoun-row">
yo </td>
<td class="">
bebo </td>
<td class="">
bebí </td>
<td class="">
bebía </td>
<td class="">
bebería </td>
<td class="">
beberé </td>
</tr>
What am I doing wrong? I am no expert in Python or web parsing in general, so it may be a simple problem.
Here is my complete code (the line of "+" characters separates the raw source from the BeautifulSoup output):
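import urllib
from bs4 import BeautifulSoup
# Fetch the raw HTML of the conjugation page (Python 2 urllib)
source = urllib.urlopen("http://www.spanishdict.com/conjugate/beber").read()
# Build the soup without naming a parser
soup = BeautifulSoup(source)
# Print the raw source, a separator line, then the parsed document
print source
print "+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++"
print str(soup)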
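For reference, here is a minimal sketch of the extraction being attempted, run against the sample row quoted above rather than the live page. It does not use BeautifulSoup at all: it swaps in the standard library's html.parser.HTMLParser, and it is written in Python 3 syntax, unlike the Python 2 code above. The names ROW, CellCollector, and parse_row are hypothetical helpers made up for this illustration, and the real page may be structured differently from this one row.

```python
from html.parser import HTMLParser

# Sample row copied from the question; the real page contains many such rows.
ROW = """<tr>
<td class="verb-pronoun-row">
yo </td>
<td class="">
bebo </td>
<td class="">
bebí </td>
<td class="">
bebía </td>
<td class="">
bebería </td>
<td class="">
beberé </td>
</tr>"""


class CellCollector(HTMLParser):
    """Hypothetical helper: collects the stripped text of every <td> cell."""

    def __init__(self):
        super().__init__()
        self.cells = []      # finished cell texts, in document order
        self._in_td = False  # True while we are inside a <td> element
        self._buf = []       # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self._buf = []

    def handle_data(self, data):
        if self._in_td:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_td:
            self.cells.append("".join(self._buf).strip())
            self._in_td = False


def parse_row(html):
    """Return the text of each <td> cell found in the given HTML fragment."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.cells


cells = parse_row(ROW)
pronoun, forms = cells[0], cells[1:]
print(pronoun, forms)
```

The first cell is treated as the pronoun and the remaining cells as the conjugated forms, which is the shape of data the program would need from each row if the parsed document kept the table.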