I would like to know how I can get the href by having BeautifulSoup 4 search for a particular string associated with the href. For example, in the outer HTML code below, the href corresponds to the text All sizes. How can I have Python see where All sizes is located then grab the corresponding href?

<div class="O1id0e">Find other sizes of this image:<br><span><a href="/search?sa=G&amp;hl=en&amp;tbs=simg:CAQShgIJUqEkPJw64xwa-gELELCMpwgaOQo3CAQSExqjM9UVvRSjObMZlBmHL-Y_1ii4aGhmZ237LHf7-juMv8evKr9s7Fxc5QIrxHCDAIAUwBAwLEI6u_1ggaCgoICAESBM9Xj5QMCxCd7cEJGpsBChsKCHJlbGlnaW9u2qWI9gMLCgkvai80YjYza3QKGwoIbGFuZ3VhZ2XapYj2AwsKCS9qLzJzaF95NAobCghmaXJlYXJtc9qliPYDCwoJL2ovMTFuaG43ChoKB3dlYXBvbnPapYj2AwsKCS9qLzg2aG15bQomChNvdGhlciBzbWFsbCB3ZWFwb25z2qWI9gMLCgkvai9mcWQ1NDkM&amp;q=tela+preta&amp;tbm=isch&amp;ved=2ahUKEwio56L8_L_2AhXgB50JHTmWBhIQ2A4oAXoECAEQNA">All sizes</a></span><span>&nbsp;-&nbsp;<a href="/search?sa=G&amp;hl=en&amp;tbs=simg:CAQShgIJUqEkPJw64xwa-gELELCMpwgaOQo3CAQSExqjM9UVvRSjObMZlBmHL-Y_1ii4aGhmZ237LHf7-juMv8evKr9s7Fxc5QIrxHCDAIAUwBAwLEI6u_1ggaCgoICAESBM9Xj5QMCxCd7cEJGpsBChsKCHJlbGlnaW9u2qWI9gMLCgkvai80YjYza3QKGwoIbGFuZ3VhZ2XapYj2AwsKCS9qLzJzaF95NAobCghmaXJlYXJtc9qliPYDCwoJL2ovMTFuaG43ChoKB3dlYXBvbnPapYj2AwsKCS9qLzg2aG15bQomChNvdGhlciBzbWFsbCB3ZWFwb25z2qWI9gMLCgkvai9mcWQ1NDkM,isz:s&amp;q=tela+preta&amp;tbm=isch&amp;ved=2ahUKEwio56L8_L_2AhXgB50JHTmWBhIQ2A4oAnoECAEQNQ">Small</a></span><span>&nbsp;-&nbsp;<a href="/search?sa=G&amp;hl=en&amp;tbs=simg:CAQShgIJUqEkPJw64xwa-gELELCMpwgaOQo3CAQSExqjM9UVvRSjObMZlBmHL-Y_1ii4aGhmZ237LHf7-juMv8evKr9s7Fxc5QIrxHCDAIAUwBAwLEI6u_1ggaCgoICAESBM9Xj5QMCxCd7cEJGpsBChsKCHJlbGlnaW9u2qWI9gMLCgkvai80YjYza3QKGwoIbGFuZ3VhZ2XapYj2AwsKCS9qLzJzaF95NAobCghmaXJlYXJtc9qliPYDCwoJL2ovMTFuaG43ChoKB3dlYXBvbnPapYj2AwsKCS9qLzg2aG15bQomChNvdGhlciBzbWFsbCB3ZWFwb25z2qWI9gMLCgkvai9mcWQ1NDkM,isz:m&amp;q=tela+preta&amp;tbm=isch&amp;ved=2ahUKEwio56L8_L_2AhXgB50JHTmWBhIQ2A4oA3oECAEQNg">Medium</a></span><span>&nbsp;-&nbsp;<a 
href="/search?sa=G&amp;hl=en&amp;tbs=simg:CAQShgIJUqEkPJw64xwa-gELELCMpwgaOQo3CAQSExqjM9UVvRSjObMZlBmHL-Y_1ii4aGhmZ237LHf7-juMv8evKr9s7Fxc5QIrxHCDAIAUwBAwLEI6u_1ggaCgoICAESBM9Xj5QMCxCd7cEJGpsBChsKCHJlbGlnaW9u2qWI9gMLCgkvai80YjYza3QKGwoIbGFuZ3VhZ2XapYj2AwsKCS9qLzJzaF95NAobCghmaXJlYXJtc9qliPYDCwoJL2ovMTFuaG43ChoKB3dlYXBvbnPapYj2AwsKCS9qLzg2aG15bQomChNvdGhlciBzbWFsbCB3ZWFwb25z2qWI9gMLCgkvai9mcWQ1NDkM,isz:l&amp;q=tela+preta&amp;tbm=isch&amp;ved=2ahUKEwio56L8_L_2AhXgB50JHTmWBhIQ2A4oBHoECAEQNw">Large</a></span></div>

1 Answer

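You can filter on the tag's text with the string argument (called text before Beautiful Soup 4.4.0), then read the href from the matching tag:

link = soup.find('a', string='All sizes')  # first <a> tag whose text is exactly 'All sizes'
link['href']  # the corresponding href; note that find() returns None if nothing matched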
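For reference, here is a minimal end-to-end sketch of that lookup. It assumes Beautiful Soup 4.4.0 or later and the built-in html.parser backend; the markup is a shortened stand-in for the HTML in the question, and the href values are placeholders rather than the real Google URLs.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the question's markup; the href values are placeholders.
html = """
<div class="O1id0e">Find other sizes of this image:<br>
<span><a href="/search?q=tela+preta&amp;tbm=isch">All sizes</a></span>
<span>&nbsp;-&nbsp;<a href="/search?q=tela+preta&amp;tbm=isch&amp;tbs=isz:s">Small</a></span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Match the <a> tag on its exact link text.
link = soup.find("a", string="All sizes")

# find() returns None when nothing matches, so guard before reading the attribute.
href = link["href"] if link is not None else None
print(href)
```

The parser decodes HTML entities in attribute values, so the `&amp;` in the markup comes back as a plain `&` in the href.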

If your matching rules get more complex, you can pass a custom function to find or find_all instead of a fixed string.

def interesting_tags(tag):
    """
    Custom filter functions take a tag as their only argument and
    return True if the tag should be included in the find result.
    """
    return tag.name == 'a' and tag.text == 'All sizes'

soup.find_all(interesting_tags)  # list of every matching <a> tag
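As a rough illustration of where a custom function helps, the sketch below tolerates surrounding whitespace in the link text and skips anchors that have no href at all. The function name is_all_sizes_link and the sample markup are made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one padded match, one non-match, one <a> without an href.
html = """
<div>
<a href="/all">  All sizes </a>
<a href="/small">Small</a>
<a name="anchor-only">All sizes</a>
</div>
"""

def is_all_sizes_link(tag):
    """Return True for <a> tags that carry an href and whose stripped text is 'All sizes'."""
    return (
        tag.name == "a"
        and tag.has_attr("href")
        and tag.get_text(strip=True) == "All sizes"
    )

soup = BeautifulSoup(html, "html.parser")

# find_all() returns every tag the function accepts; collect their href values.
hrefs = [tag["href"] for tag in soup.find_all(is_all_sizes_link)]
print(hrefs)
```

Checking has_attr("href") inside the filter means the list comprehension cannot raise a KeyError on anchors without a link target.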
sytech