Python If String Contains in href

Question

This is my Python code:

import requests
from bs4 import BeautifulSoup

r = requests.get("myurl")
data = r.text
soup = BeautifulSoup(data, "lxml")
texttmp = ""
for link in soup.find_all('a'):
    image = link.get("href")
    if ".jpg" in image:
        print(image)

When I run this code, I get the error below. How can I fix it?

TypeError                                 Traceback (most recent call last)
<ipython-input-35-618698d3a2d7> in <module>()
     11 for link in soup.find_all('a'):
     12     image = link.get("href")
---> 13     if ".jpg" in image:
     14         print(image)
     15 

TypeError: argument of type 'NoneType' is not iterable

Apparently `link.get('href') is None`. We'd need more information to tell you exactly why. — jonrsharpe, Aug 31 '18 at 14:34

score 4 · Answer 1 · answered Aug 31 '18 at 14:35

The error is telling you that link.get("href") returned None for at least one tag, meaning that tag has no href attribute. Hence, you need to check for None before testing whether ".jpg" is in the value:

 if image and ".jpg" in image:
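As a minimal sketch of that guard in context: the markup below is a made-up stand-in for the real page, and the built-in "html.parser" is used instead of lxml only so the snippet needs nothing beyond bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the second <a> has no href.
html = """
<a href="photos/cat.jpg">cat</a>
<a name="top">anchor without href</a>
<a href="about.html">about</a>
"""

soup = BeautifulSoup(html, "html.parser")

jpg_links = []
for link in soup.find_all("a"):
    image = link.get("href")  # None when the tag has no href attribute
    # "image and ..." short-circuits on None, so the "in" test never sees it
    if image and ".jpg" in image:
        jpg_links.append(image)

print(jpg_links)
```

The truthiness check also skips empty href values (href=""), which is usually what you want here.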

However, that's not the only thing going on. You're also calling get("href") on every a node that was found, whether or not it has one. You should check that the a tag actually has an href attribute (some don't, see Bootstrap for examples!):

 for link in soup.find_all('a'):
     if link.has_attr('href'):
         image = link['href']
         if ".jpg" in image:
             print(image)
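A related option, sketched here under the same assumptions (invented markup, "html.parser" rather than lxml): find_all accepts href=True, which returns only the a tags that carry an href attribute, so no per-tag check is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the first <a> is a Bootstrap-style button with no href.
html = """
<a class="btn" role="button">button-style anchor</a>
<a href="img/logo.jpg">logo</a>
<a href="index.html">home</a>
<a href="img/banner.jpg">banner</a>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True matches only tags that have an href attribute, whatever its value
links_with_href = soup.find_all("a", href=True)

jpg_links = [link["href"] for link in links_with_href if ".jpg" in link["href"]]
print(jpg_links)
```

Because every tag in links_with_href is known to have the attribute, link["href"] cannot raise a KeyError here.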

See this SO post and others like it (I should have googled first, too.)

score 2 · Answer 2 · answered Aug 31 '18 at 14:40

In addition to representing links to other resources, HTML anchor tags <a ...> can also act as named markers for a location in a document. These so-called name tags, <a name=whatever>, let the marked location be the target of a link that uses a fragment in the URL, such as http://example.com/#whatever

This is probably what you have run into, since name tags have no href attribute to indicate a resource that they point to.

You'll need to check whether link.get("href") returns None and skip that tag if it does.
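To illustrate, here is a small sketch with an invented document containing one name tag. It uses "html.parser" rather than lxml as a convenience, and the file names are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical document: a name tag, a fragment link to it, and an image link.
html = (
    '<a name="whatever"></a>'
    '<a href="#whatever">jump to marker</a>'
    '<a href="pic.jpg">picture</a>'
)

soup = BeautifulSoup(html, "html.parser")

# What get("href") returns for each tag; the name tag yields None.
raw_hrefs = [link.get("href") for link in soup.find_all("a")]
print(raw_hrefs)

kept = []
for link in soup.find_all("a"):
    href = link.get("href")
    if href is None:
        continue  # name tag: marks a location, points to nothing
    kept.append(href)

print(kept)
```

The first list shows where the None in the traceback comes from; the loop then skips that tag and keeps the rest.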

Good luck.
