Python BeautifulSoup: getting the hyperlink

Question

<li><a href="https://en.wikipedia.org/wiki/yyyyy.html" title="yyyyy">yyyyyy</a></li>

I am trying to scrape this data and was able to get the text using BeautifulSoup.

The code I am using is:

for ul in soup.findAll('div'):
    print(ul.text)
    for li in ul.findAll('li'):
        print(li.text)
        f.write("li   "+str(li.text))

How can I get the href? The output I am looking for is: yyyyy;https://en.wikipedia.org/wiki/yyyyy.html

Possible duplicate of [retrieve links from web page using python and BeautifulSoup](https://stackoverflow.com/questions/1080411/retrieve-links-from-web-page-using-python-and-beautifulsoup) — Jack Moody, Apr 10 '19 at 20:57

score 0 · Accepted Answer · answered Apr 10 '19 at 20:58

You may want to try Tag.find():

f.write("li   "+li.find('a')['href'])
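A slightly fuller sketch of the same idea, assuming the goal is one `text;href` line per list item as described in the question. The helper name `extract_links` and the sample markup wrapped around the question's `<li>` are illustrative only, not part of the original post. It skips any `<li>` without a link, because `li.find('a')` returns `None` in that case and indexing it with `['href']` would raise a `TypeError`.

```python
from bs4 import BeautifulSoup

# Sample markup: the <li> from the question plus one item with no link.
html = (
    '<div><ul>'
    '<li><a href="https://en.wikipedia.org/wiki/yyyyy.html" title="yyyyy">yyyyyy</a></li>'
    '<li>no link here</li>'
    '</ul></div>'
)


def extract_links(markup):
    """Return 'text;href' strings for every <li> that contains a link."""
    soup = BeautifulSoup(markup, "html.parser")
    rows = []
    for li in soup.find_all("li"):
        # href=True only matches <a> tags that actually carry an href attribute.
        a = li.find("a", href=True)
        if a is None:
            continue  # this <li> has no usable link, so skip it
        rows.append(f"{a.get_text(strip=True)};{a['href']}")
    return rows


rows = extract_links(html)
for row in rows:
    print(row)
```

On a real page the `href` values may be relative (for example, starting with `/wiki/`), so they might need to be joined with the page URL via `urllib.parse.urljoin` before being written out.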

Thank you. I tried this on https://en.wikipedia.org/wiki/Lists_of_tourist_attractions but I am not getting the href. Am I missing anything? – ML Learner2 Apr 10 '19 at 21:14
Check your `HTML` contents. Why are you searching for `<div>` tags? There are no `<div>` tags in the `HTML` code. – chickity china chinese chicken Apr 10 '19 at 22:10
Otherwise, I'm not sure why you don't get the `'href'`, try demo here, click `'Run'` on top of page: http://repl.it/@downshift/FlippantLeadingIntranet -- Afterwards, the `'href.txt'` file will contain `li https://en.wikipedia.org/wiki/Lists_of_tourist_attractions` – chickity china chinese chicken Apr 10 '19 at 22:14
I tried the same logic on the wiki link. It's not working. – ML Learner2 Apr 10 '19 at 22:59
what do you mean it is not working? did you run the demo and see the output? – chickity china chinese chicken Apr 10 '19 at 23:00
It's not printing all the hrefs and text. – ML Learner2 Apr 10 '19 at 23:02
do you want to save the href and text to the file? or just print it? – chickity china chinese chicken Apr 10 '19 at 23:02
I am OK either way. To start with I want to print; the issue I am having is with the Wikipedia links. – ML Learner2 Apr 10 '19 at 23:17
Where are you stuck? What is the issue you are having with the Wikipedia link? Your code is incomplete, and you should show us where the issue is. – chickity china chinese chicken Apr 10 '19 at 23:27
