How do I scrape the link from a string in one line?

Question

I'm working on a web scraper that has many different variables, so keeping each variable to a single line is important to me. I have the current variable down to this:

<a href="http://website.com/example/123" target="_blank">Example</a>

Is there a simple way to get the URL (http://website.com/example/123 in this case) scraped out in one line of code?

I'm currently using urllib, re, and BeautifulSoup so any of those libraries are fine. I tried adding

.find('a', attrs={'href': re.compile("^http://")})

to the end of my line, but it made the output return nothing.

score 2 · Accepted Answer · edited May 23 '17 at 12:28

I believe all you have to do is yourVarName['href']:

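from bs4 import BeautifulSoup

html = '''<a href="http://website.com/example/123" target="_blank">Example</a>'''

# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(html, 'html.parser')

# href=True matches only <a> tags that actually carry an href attribute.
for a in soup.find_all('a', href=True):
    print("Found the URL:", a['href'])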

Found the URL: http://website.com/example/123

https://stackoverflow.com/a/5815888/3920284
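
Since the question asks for a single line, here is a minimal sketch of how that could look, assuming the input is the HTML string from the question. The names `link`, `tag`, and `nested` are just illustrative. It also shows a likely reason the `.find('a', ...)` attempt returned nothing, assuming the variable already held the `<a>` tag: `find()` searches a tag's descendants, not the tag itself.

```python
from bs4 import BeautifulSoup

html = '<a href="http://website.com/example/123" target="_blank">Example</a>'

# One line: parse the string, take the first <a> tag, read its href attribute.
link = BeautifulSoup(html, "html.parser").find("a")["href"]
print(link)  # http://website.com/example/123

# Hypothetical stand-in for a variable that already holds the <a> tag.
tag = BeautifulSoup(html, "html.parser").a

# find() looks only at descendants, and this <a> contains no nested <a>,
# so the search comes back empty.
nested = tag.find("a", href=True)
print(nested)  # None

# Indexing the tag directly gives the attribute value.
print(tag["href"])  # http://website.com/example/123
```

If the tag might not have an `href` at all, `tag.get("href")` returns `None` instead of raising a `KeyError`.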

answered Mar 03 '15 at 00:09

jeremy

I couldn't have asked for a better answer, thank you! – Vale Mar 03 '15 at 00:13
or, alternatively `soup.select('a[href]')`. – alecxe Mar 03 '15 at 01:32
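
For reference, a rough sketch of the CSS-selector variant mentioned in the comment above, assuming the same HTML string as in the question. The variable names `links` and `first` are only for illustration.

```python
from bs4 import BeautifulSoup

html = '<a href="http://website.com/example/123" target="_blank">Example</a>'
soup = BeautifulSoup(html, "html.parser")

# select() returns a list of every <a> tag that has an href attribute.
links = [a["href"] for a in soup.select("a[href]")]
print(links)  # ['http://website.com/example/123']

# select_one() returns only the first match, or None when nothing matches.
first = soup.select_one("a[href]")["href"]
print(first)  # http://website.com/example/123
```

`select_one` keeps it to one line when only the first link matters, but it will raise a `TypeError` on the `["href"]` lookup if the string contains no matching tag, so it is worth a guard when the input is not known in advance.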
