5
<div data-pet-card="pet-card" class="pet-card">

    <a data-pet-card="pet-card-link" href="https://Link-I-Want.com" 
    class="pet-card__link">

I am used to scraping HTML with BS4, but I am not very familiar with HTML itself, and I haven't come across an anchor tag whose href sits alongside a class and a data-pet-card="pet-card-link" attribute. I tried:

for a in soup.find_all('a', href=True):
    print("Found the URL:", a['href'])

but it prints nothing, and gives no errors.

Anything is helpful, thank you.

DevinGP

2 Answers

6

The attribute you pass to find_all is the one you already know, not the one you want to extract. Here you know the class, so filter on that:

for a in soup.find_all('a', class_="pet-card__link"):
    print("Found the URL:", a['href']) 

(Because class is a reserved word in Python, you need to use class_ here.)
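As a minimal sketch of the lookup above, the following runs it against the markup from the question. The closing tags and link text in SAMPLE_HTML are added here only to make the snippet well-formed, and find_pet_links is a hypothetical helper name. It also shows filtering on the data-pet-card attribute through the attrs dictionary, since a hyphenated attribute name cannot be written as a keyword argument.

```python
from bs4 import BeautifulSoup

# Markup from the question; closing tags and link text are assumed here
# so that the sample is well-formed.
SAMPLE_HTML = """
<div data-pet-card="pet-card" class="pet-card">
    <a data-pet-card="pet-card-link" href="https://Link-I-Want.com"
    class="pet-card__link">link</a>
</div>
"""


def find_pet_links(html):
    """Return the hrefs found by class and by data attribute (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Filter on the class we already know, then read the href we want.
    by_class = [a["href"] for a in soup.find_all("a", class_="pet-card__link")]
    # Hyphenated attribute names go through the attrs dictionary.
    by_data = [
        a["href"]
        for a in soup.find_all("a", attrs={"data-pet-card": "pet-card-link"})
    ]
    return by_class, by_data


by_class, by_data = find_pet_links(SAMPLE_HTML)
for url in by_class:
    print("Found the URL:", url)
```

If both lists come back empty on a real page, the anchor is probably not in the HTML that was parsed, rather than the filter being wrong.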

Daniel Roseman
  • Since this still printed nothing, there must be a problem with my request. Here is what I do and what it prints when I say print(response.text): `https://pastebin.com/FbJVnQUV` – DevinGP Sep 14 '18 at 13:22
  • That response doesn't contain any "pet-" divs at all. – Daniel Roseman Sep 14 '18 at 13:25
  • I know, that is what I figured the problem was, but when I go to that link and inspect part of the page, it shows up right there, so I am not sure what is going wrong. – DevinGP Sep 14 '18 at 13:33
  • https://imgur.com/a/IhfxTuz You can see the specific line highlighted and the link itself in this screenshot – DevinGP Sep 14 '18 at 13:35
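A quick way to check the point raised in these comments is to count how often the marker text appears in the raw response before parsing it at all. The sketch below uses a hypothetical helper, count_marker, and two stand-in strings rather than the real responses. A count of zero would suggest the markup is not in what the server sent; one common cause is content added by JavaScript after the page loads, though the thread does not confirm that for this site.

```python
def count_marker(html_text, marker="pet-card"):
    """Return how many times `marker` occurs in the raw response body."""
    return html_text.count(marker)


# Stand-in strings for illustration only; these are NOT the real responses.
raw_response = "<html><body><div id='app'></div></body></html>"
rendered_dom = '<div class="pet-card"><a class="pet-card__link" href="#">x</a></div>'

print("in raw response:", count_marker(raw_response))
print("in rendered DOM:", count_marker(rendered_dom))
```

With a real request, the same check would be count_marker(response.text), run before handing the text to BeautifulSoup.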
0
for a in soup.find_all('a', href=True):
    print("Found the URL:", a.get_attribute_list('href')[0])

Please try this solution.

yogkm