
This is my Python code.

import requests
from bs4 import BeautifulSoup

r = requests.get("myurl")
data = r.text
soup = BeautifulSoup(data, "lxml")
texttmp = ""
for link in soup.find_all('a'):
    image = link.get("href")
    if ".jpg" in image:
        print(image)

When I run this code, I get the error below. How can I fix it?

TypeError                                 Traceback (most recent call last)
<ipython-input-35-618698d3a2d7> in <module>()
     11 for link in soup.find_all('a'):
     12     image = link.get("href")
---> 13     if ".jpg" in image:
     14         print(image)
     15 

TypeError: argument of type 'NoneType' is not iterable
Smith Dwayne

2 Answers


The error is telling you that no href string could be found, so link.get("href") returned None. You need to check for None before testing whether ".jpg" is in the href value:

 if image and ".jpg" in image:
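
To see why that guard works without fetching a page, here is a minimal sketch in plain Python. The helper name is_jpg_link and the sample list hrefs are made up for illustration; they stand in for the values link.get("href") might return.

```python
def is_jpg_link(href):
    """Return True when href is a non-empty string containing ".jpg"."""
    # bool(href) is False for None (and for ""), so the `in` test is
    # skipped and never sees a None operand.
    return bool(href) and ".jpg" in href


# Hypothetical values, including the None that caused the TypeError.
hrefs = ["photos/cat.jpg", None, "about.html", ""]

jpg_links = [h for h in hrefs if is_jpg_link(h)]
print(jpg_links)
```

Because `and` short-circuits, the right-hand side only runs when the left-hand side is truthy, which is why the order of the two conditions matters.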

However, that's not the only thing going on. You're also trying to get an href from every link node found. You should check that the a tag actually has an href attribute (some don't; see Bootstrap for examples):

for link in soup.find_all('a'):
    if link.has_attr('href'):
        # rest of code
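
Filled in, the loop might look like the sketch below. The HTML string is invented sample markup standing in for the fetched page, and it uses the built-in "html.parser" instead of "lxml" only so that nothing beyond BeautifulSoup needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; two of the anchors have no href attribute.
html = """
<a href="images/one.jpg">one</a>
<a name="top">named anchor, no href</a>
<a href="page.html">page</a>
<a class="btn">button-style anchor, no href</a>
<a href="images/two.jpg">two</a>
"""

soup = BeautifulSoup(html, "html.parser")

jpg_links = []
for link in soup.find_all("a"):
    # Only read href from anchors that actually carry the attribute.
    if link.has_attr("href"):
        image = link["href"]
        if ".jpg" in image:
            jpg_links.append(image)

print(jpg_links)
```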

See this SO post and others like it (I should have googled first, too.)

wheaties

In addition to linking to other resources, HTML anchor tags <a ...> can also act as named markers for a location in a document. These so-called name tags, <a name=whatever>, let the marked location be the target of a link that uses a fragment in the URL, such as http://example.com/#whatever.

This is probably what you have run into, since name tags have no href attribute pointing to a resource.

You'll need to check whether link.get("href") returns None and skip that tag if it does.
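
As a rough sketch of that skip, assuming a made-up snippet of HTML that contains one name tag: the first loop skips anchors whose href is None, and the second uses BeautifulSoup's href=True filter, which only matches tags that have the attribute in the first place.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a name tag, a fragment link to it, and an image link.
html = (
    '<a name="whatever">Section</a>'
    '<a href="#whatever">jump to section</a>'
    '<a href="pic.jpg">picture</a>'
)
soup = BeautifulSoup(html, "html.parser")

# Option 1: skip tags whose href lookup returns None.
skipped_none = []
for link in soup.find_all("a"):
    image = link.get("href")
    if image is None:
        continue  # name tag, nothing to inspect
    if ".jpg" in image:
        skipped_none.append(image)

# Option 2: ask find_all for anchors that have an href attribute at all.
filtered = [a["href"] for a in soup.find_all("a", href=True) if ".jpg" in a["href"]]

print(skipped_none, filtered)
```

Both options give the same result here; the second just moves the check into the search itself.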

Good luck.

Michael Speer