I'm trying to scrape data from a website. The HTML structure looks like this:

<tbody>
    <tr id="city_1">
        <td class="first"><a href="http://www.link_1.com/" class="text" target="_blank">Name_1</a></td>
        <td style="text-align: right;"><span class="text">247 380</span></td>
        <td class="hidden-xs"><span class="text">NRW</span></td>
        <td class="hidden-xs last"><span class="text">52062</span></td>
    </tr>
    <tr id="city_1">
        <td class="first"><a href="http://www.link_2.com/" class="text" target="_blank">Name_2</a></td>
        <td style="text-align: right;"><span class="text">247 380</span></td>
        <td class="hidden-xs"><span class="text">NRW</span></td>
        <td class="hidden-xs last"><span class="text">52062</span></td>
    </tr>
</tbody>

I wrote a nested loop in Python with the Beautiful Soup package to reach the hyperlinks, which hold the information I need (the link and the name).

Here is my code:

import pandas as pd
import requests
from bs4 import BeautifulSoup
#get all the city links of the page
page = requests.get("link")
#print(page)
soup = BeautifulSoup(page.content, "html.parser")
#print(soup)

for x in soup.tbody:
    for y in x:
        for z in y:
            print(z.find('a'))  # this is where the problem is

I don't know how to get the href and the name for every hyperlink in the list with Beautiful Soup.

UgoL
  • Does this answer your question? [retrieve links from web page using python and BeautifulSoup](https://stackoverflow.com/questions/1080411/retrieve-links-from-web-page-using-python-and-beautifulsoup) – Zaraki Kenpachi Feb 27 '20 at 11:56

1 Answer

Try this:

for x in soup.tbody.find_all('td', class_='first'):
    print(x.find('a').get('href'), x.text)

Output:

http://www.aachen.de/ Aachen
http://www.aalen.de/ Aalen
http://www.amberg.de/ Amberg

etc.
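
As for why the nested loop in the question prints nothing useful: iterating over a tag yields its direct children, including the whitespace text nodes between tags, and iterating over those strings most likely hands you single characters, so `z.find('a')` ends up being a plain string search rather than a tag lookup. Searching for the cells directly avoids that. Here is a minimal self-contained sketch of the same idea using a CSS selector instead of `find_all`. It parses a trimmed copy of the markup from the question rather than the live page, and the helper name `extract_links` is just something I made up for illustration:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (trimmed; wrapped in <table> for validity).
html = """
<table><tbody>
    <tr id="city_1">
        <td class="first"><a href="http://www.link_1.com/" class="text" target="_blank">Name_1</a></td>
        <td style="text-align: right;"><span class="text">247 380</span></td>
    </tr>
    <tr id="city_1">
        <td class="first"><a href="http://www.link_2.com/" class="text" target="_blank">Name_2</a></td>
        <td style="text-align: right;"><span class="text">247 380</span></td>
    </tr>
</tbody></table>
"""


def extract_links(markup):
    """Return (href, name) pairs for each link inside a <td class="first"> cell.

    Hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(markup, "html.parser")
    pairs = []
    # Select <a> tags that have an href and sit directly inside td.first,
    # so cells without a link are skipped instead of raising an error.
    for a in soup.select("td.first > a[href]"):
        pairs.append((a["href"], a.get_text(strip=True)))
    return pairs


links = extract_links(html)
for href, name in links:
    print(href, name)
```

With the real page you would pass `page.content` to the helper instead of the sample string. Since the question already imports pandas, the list of pairs can go straight into `pd.DataFrame(links, columns=["href", "name"])` if you want a table.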

Jack Fleeting