How to use BeautifulSoup to find an href link with a class

Question

<div data-pet-card="pet-card" class="pet-card">
    <a data-pet-card="pet-card-link" href="https://Link-I-Want.com" class="pet-card__link">

I am used to scraping HTML with BS4, but I am not very familiar with HTML itself, and I haven't come across an `a` tag whose href sits alongside a class and a data-pet-card="pet-card-link" attribute. I tried:

for a in soup.find_all('a', href=True):
    print("Found the URL:", a['href'])

but it prints nothing, and gives no errors.

Anything is helpful, thank you.

Possible duplicate of [How to find elements by class](https://stackoverflow.com/questions/5041008/how-to-find-elements-by-class) — dmuensterer, Sep 14 '18 at 12:56
You don't need to care about the data attribute, just the class. — Daniel Roseman, Sep 14 '18 at 12:58
@Dominik No, not trying to find a class. Trying to get the href link, but it is surrounded by a class on the same line. Like I said I am familiar with BS4 and I would know how to find a simple class. Thank you — DevinGP, Sep 14 '18 at 12:59
No, you're trying to find the `a` tag with the class "pet-card__link". — Daniel Roseman, Sep 14 '18 at 13:00

score 6 · Answer 1 · answered Sep 14 '18 at 13:09

The attribute you put in the find_all call is the thing you have, not the thing you want to find. Here you have the class, so use that:

for a in soup.find_all('a', class_="pet-card__link"):
    print("Found the URL:", a['href'])

(Because class is a reserved word in Python, you need to use class_ here.)
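
For a self-contained check, here is a minimal sketch that runs the same lookup against the snippet from the question. The closing tags and the variable names (`html`, `links`, `links_css`, `links_data`) are assumptions added for illustration, and `html.parser` is chosen only because it ships with Python.

```python
from bs4 import BeautifulSoup

# Markup from the question; the closing tags are added so the snippet is complete.
html = """
<div data-pet-card="pet-card" class="pet-card">
    <a data-pet-card="pet-card-link" href="https://Link-I-Want.com"
    class="pet-card__link"></a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Filter on the class you know, then read the href you want.
links = [a["href"] for a in soup.find_all("a", class_="pet-card__link")]

# The same lookup written as a CSS selector.
links_css = [a["href"] for a in soup.select("a.pet-card__link[href]")]

# Matching on the data attribute also works, although the class alone is enough.
links_data = [
    a["href"] for a in soup.find_all("a", attrs={"data-pet-card": "pet-card-link"})
]

print(links)
```

All three lookups should return the same single URL for this markup.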

Daniel Roseman

Since this still printed nothing, there must be a problem with my request. Here is what I do and what it prints when I say print(response.text): `https://pastebin.com/FbJVnQUV` – DevinGP Sep 14 '18 at 13:22
That response doesn't contain any "pet-" divs at all. – Daniel Roseman Sep 14 '18 at 13:25
I know, that is what I figured the problem was. But when I open that link and inspect part of the page, the element shows up right there, so I am not sure what is going wrong. – DevinGP Sep 14 '18 at 13:33
https://imgur.com/a/IhfxTuz You can see the specific line highlighted and the link itself in this screenshot – DevinGP Sep 14 '18 at 13:35
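
One way to confirm the diagnosis in these comments is to test whether the target tag exists in the HTML that was actually downloaded, before doing anything else. The sketch below is only illustrative: the helper name `has_pet_cards` and the two sample strings are hypothetical stand-ins for `response.text`. If the helper returns False for the downloaded text while the browser inspector shows the element, the cards are likely being added by JavaScript after the page loads, which a plain HTTP request does not execute.

```python
from bs4 import BeautifulSoup


def has_pet_cards(html: str) -> bool:
    """Return True if the HTML contains at least one <a class="pet-card__link">."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("a", class_="pet-card__link") is not None


# Hypothetical stand-ins for response.text.
static_html = (
    '<div class="pet-card">'
    '<a class="pet-card__link" href="https://Link-I-Want.com"></a>'
    "</div>"
)
shell_html = '<div id="app"></div><script src="bundle.js"></script>'

print(has_pet_cards(static_html))  # True
print(has_pet_cards(shell_html))   # False
```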

score 0 · Answer 2 · answered Sep 14 '18 at 13:01

for a in soup.find_all('a', href=True):
    print("Found the URL:", a.get_attribute_list('href')[0])

Please try this solution.
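
As a quick check of what this changes, the sketch below compares three ways of reading the attribute from a tag that has already been found. The one-line `html` string and the variable names are assumptions for illustration. Since the `find_all` call here is the same as in the question, only the way the value is read differs, not which tags are matched.

```python
from bs4 import BeautifulSoup

# Hypothetical one-tag document based on the question's markup.
html = '<a href="https://Link-I-Want.com" class="pet-card__link"></a>'
a = BeautifulSoup(html, "html.parser").find("a", href=True)

by_index = a["href"]                       # raises KeyError if href is missing
by_get = a.get("href")                     # returns None if href is missing
by_list = a.get_attribute_list("href")[0]  # always wraps the value in a list

print(by_index == by_get == by_list)
```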

yogkm

Unfortunately, it still prints nothing. – DevinGP Sep 14 '18 at 13:04
Which version of BeautifulSoup are you using? – yogkm Sep 14 '18 at 13:13
I am using BeautifulSoup4 – DevinGP Sep 14 '18 at 13:18
@DevinGP I mean which version of BeautifulSoup4. – yogkm Sep 14 '18 at 20:07
