I'm trying to scrape data from a website. The HTML structure looks like this:
<tbody>
<tr id="city_1">
<td class="first"><a href="http://www.link_1.com/" class="text" target="_blank">Name_1</a></td>
<td style="text-align: right;"><span class="text">247 380</span></td>
<td class="hidden-xs"><span class="text">NRW</span></td>
<td class="hidden-xs last"><span class="text">52062</span></td>
</tr>
<tr id="city_1">
<td class="first"><a href="http://www.link_2.com/" class="text" target="_blank">Name_2</a></td>
<td style="text-align: right;"><span class="text">247 380</span></td>
<td class="hidden-xs"><span class="text">NRW</span></td>
<td class="hidden-xs last"><span class="text">52062</span></td>
</tr>
</tbody>
I wrote a nested loop in Python with the Beautiful Soup package to reach the hyperlinks, which hold the information I need (the link and the name).
Below is my code:
import pandas as pd
import requests
from bs4 import BeautifulSoup
#get all the city links of the page
page = requests.get("link")
#print(page)
soup = BeautifulSoup(page.content, "html.parser")
#print(soup)
for x in soup.tbody:
    for y in x:
        for z in y:
            print(z.find('a'))  # here is the problem
I don't know how to get the href and the name with Beautiful Soup for every hyperlink in the list.
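
For reference, here is a minimal sketch of one way this might be done. It assumes the page matches the HTML fragment above, and it parses an inline copy of that fragment (the `html` string and the `links` list are names made up for this example) instead of calling `requests.get`. The nested loops above seem to go one level too deep and also iterate over whitespace text nodes between tags, so the sketch asks the `tbody` for all `<a>` tags directly with `find_all`:

```python
from bs4 import BeautifulSoup

# Inline copy of the HTML from the question (assumption: the real page
# has the same structure). Replace with page.content for the live site.
html = """
<tbody>
<tr id="city_1">
<td class="first"><a href="http://www.link_1.com/" class="text" target="_blank">Name_1</a></td>
<td style="text-align: right;"><span class="text">247 380</span></td>
</tr>
<tr id="city_1">
<td class="first"><a href="http://www.link_2.com/" class="text" target="_blank">Name_2</a></td>
<td style="text-align: right;"><span class="text">247 380</span></td>
</tr>
</tbody>
"""

soup = BeautifulSoup(html, "html.parser")

# Collect (name, href) pairs for every <a> inside the tbody that has an href.
links = []
for a in soup.tbody.find_all("a", href=True):
    links.append((a.get_text(strip=True), a["href"]))

for name, href in links:
    print(name, href)
```

If the result should end up in pandas, `pd.DataFrame(links, columns=["name", "href"])` would probably be the next step.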