How to get the text and URL from a link using beautifulsoup

Question

I have the following code which prints out a list of the links for each team in a table:

import requests
from bs4 import BeautifulSoup

# Get all teams in Big Sky standings table
URL = 'https://www.espn.com/college-football/standings/_/group/20/view/fcs-i-aa'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
standings = soup.find_all('table', 'Table Table--align-right Table--fixed Table--fixed-left')

for team in standings:
    team_data = team.find_all('span', 'hide-mobile')
    print(team_data)

The code prints the entire list, and if I pick a single index, such as 'print(team_data[0])', it prints that specific link from the page.

How can I then get both the URL (the href value) and the link text from that element?

For example, my code prints out the following for the first index in the list.

<span class="hide-mobile"><a class="AnchorLink" data-clubhouse-uid="s:20~l:23~t:2692" href="/college-football/team/_/id/2692/weber-state-wildcats" tabindex="0">Weber State Wildcats</a></span>

How can I pull

/college-football/team/_/id/2692/weber-state-wildcats

and

Weber State Wildcats

from the link?

Thank you for your time and if there is anything I can add for clarification, please don't hesitate to ask.

score 3 · Accepted Answer · answered Feb 03 '20 at 19:07

Provided that you have HTML like this:

<span class="hide-mobile"><a class="AnchorLink" data-clubhouse-uid="s:20~l:23~t:2692" href="/college-football/team/_/id/2692/weber-state-wildcats" tabindex="0">Weber State Wildcats</a></span>

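To get the href, /college-football/team/_/id/2692/weber-state-wildcats, index into the list returned by find_all first, since team_data in the question is a list of spans rather than a single tag:

>>> team_data[0].find_all('a')[0]['href']
'/college-football/team/_/id/2692/weber-state-wildcats'

To get the link text, Weber State Wildcats:

>>> team_data[0].find_all('a')[0].text
'Weber State Wildcats'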
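
The same idea extends to every team at once. The following is a minimal sketch rather than a definitive implementation: it parses an inline HTML sample instead of fetching the live page, the second row ("Example Team" and its href) is made up for illustration, and the variable name `teams` is an assumption.

```python
from bs4 import BeautifulSoup

# Inline sample modeled on the markup in the question.
# The second row is hypothetical and only there to show the loop.
html = """
<table>
  <tr><td><span class="hide-mobile"><a class="AnchorLink" href="/college-football/team/_/id/2692/weber-state-wildcats">Weber State Wildcats</a></span></td></tr>
  <tr><td><span class="hide-mobile"><a class="AnchorLink" href="/college-football/team/_/id/0/example-team">Example Team</a></span></td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')

teams = []
for span in soup.find_all('span', 'hide-mobile'):
    link = span.find('a')          # first <a> inside the span, or None
    if link is None:
        continue                   # skip spans that contain no link
    # Store (link text, href) for each team.
    teams.append((link.get_text(strip=True), link.get('href')))

for name, href in teams:
    print(name, href)
```

Using `find('a')` with a `None` check avoids an IndexError on spans that have no link, which `find_all('a')[0]` would raise.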

score 0 · Answer 2 · answered Feb 03 '20 at 18:54

In terms of the href/url, you can do something like this.

As for the link text, you could do something like this.

Both amount to filtering down to the target element, and then extracting the desired attribute.
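
As an illustration of that filter-then-extract pattern (a sketch under stated assumptions, not necessarily the code this answer originally pointed to), a CSS selector can narrow down to the anchor in one step. The choice of `select_one` and the optional `urljoin` step for building an absolute URL are assumptions added here; the base URL is taken from the question.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# The single span printed in the question.
html = (
    '<span class="hide-mobile"><a class="AnchorLink" '
    'data-clubhouse-uid="s:20~l:23~t:2692" '
    'href="/college-football/team/_/id/2692/weber-state-wildcats" '
    'tabindex="0">Weber State Wildcats</a></span>'
)
soup = BeautifulSoup(html, 'html.parser')

# Filter down to the target element: the <a> inside span.hide-mobile.
link = soup.select_one('span.hide-mobile a')

# Extract the desired attribute and the text.
href = link['href']
text = link.get_text()

# Optional: turn the site-relative href into an absolute URL.
full_url = urljoin('https://www.espn.com', href)

print(href)
print(text)
print(full_url)
```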

Greg
