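from bs4 import BeautifulSoup

# Parse the saved HTML page; the with-block closes the file afterwards
with open(filename) as f:
    soup = BeautifulSoup(f, "lxml")
dd = {}
rows = soup.find_all('tr')
for r in rows:
    # All cells of the current table row
    td = r.find_all('td')
    print(td)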
Using the code above, printing td gives me the following:
[<td class="num cell-icon-string" data-sort-value="1"><i class="pki pkiAll n1" data-sprite="pkiAll n1"></i> 001</td>, <td class="cell-icon-string"><a class="ent-name" href="/pokedex/bulbasaur" title="View pokedex for #001 Bulbasaur">Bulbasaur</a></td>, <td class="cell-icon"><a class="type-icon type-grass" href="/type/grass">Grass</a><br/><a class="type-icon type-poison" href="/type/poison">Poison</a></td>, <td class="num-total">318</td>, <td class="num">45</td>, <td class="num">49</td>, <td class="num">49</td>, <td class="num">65</td>, <td class="num">65</td>, <td class="num">45</td>]
[<td class="num cell-icon-string" data-sort-value="2"><i class="pki pkiAll n2" data-sprite="pkiAll n2"></i> 002</td>, <td class="cell-icon-string"><a class="ent-name" href="/pokedex/ivysaur" title="View pokedex for #002 Ivysaur">Ivysaur</a></td>, <td class="cell-icon"><a class="type-icon type-grass" href="/type/grass">Grass</a><br/><a class="type-icon type-poison" href="/type/poison">Poison</a></td>, <td class="num-total">405</td>, <td class="num">60</td>, <td class="num">62</td>, <td class="num">63</td>, <td class="num">80</td>, <td class="num">80</td>, <td class="num">60</td>]
From each td list, I would specifically like to get the type name and the Pokémon name, which are in these links:
<a class="type-icon type-grass" href="/type/grass">Grass</a>
and
<a class="ent-name" href="/pokedex/bulbasaur" title="View pokedex for #001 Bulbasaur">Bulbasaur</a>
But I am having a hard time accessing those specific elements.
How can I do this with BeautifulSoup?
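For reference, here is a minimal sketch of the kind of result I am after. It assumes the ent-name and type-icon classes seen in the output above reliably identify the name and type links. The helper name extract_names_and_types and the inline html sample (two rows trimmed from the output above) are illustrative only, and html.parser is used instead of lxml just to keep the sketch free of extra dependencies:

```python
from bs4 import BeautifulSoup

# Two rows trimmed from the printed output above, wrapped in a table
# so the sketch can run on its own (illustrative sample, not the full page).
html = """
<table>
<tr>
<td class="cell-icon-string"><a class="ent-name" href="/pokedex/bulbasaur" title="View pokedex for #001 Bulbasaur">Bulbasaur</a></td>
<td class="cell-icon"><a class="type-icon type-grass" href="/type/grass">Grass</a><br/><a class="type-icon type-poison" href="/type/poison">Poison</a></td>
</tr>
<tr>
<td class="cell-icon-string"><a class="ent-name" href="/pokedex/ivysaur" title="View pokedex for #002 Ivysaur">Ivysaur</a></td>
<td class="cell-icon"><a class="type-icon type-grass" href="/type/grass">Grass</a><br/><a class="type-icon type-poison" href="/type/poison">Poison</a></td>
</tr>
</table>
"""


def extract_names_and_types(markup):
    """Map each name to its list of type names (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    result = {}
    for row in soup.find_all("tr"):
        # class_ matches any element whose class list contains the value
        name_link = row.find("a", class_="ent-name")
        if name_link is None:
            continue  # skip rows without a name link, e.g. a header row
        type_links = row.find_all("a", class_="type-icon")
        result[name_link.get_text(strip=True)] = [
            a.get_text(strip=True) for a in type_links
        ]
    return result


dd = extract_names_and_types(html)
print(dd)
```

Searching within each row rather than indexing into the td list means the lookup should not depend on column order, though that is only as reliable as the class names themselves.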