I am trying to parse an HTML page using BeautifulSoup. The page links to text files, whose names end with the .txt extension. I want to parse the HTML and fetch every string that ends with .txt.
All such strings are the href attribute of an <a> tag. Here are some examples:
<a href = "foo.txt">
<a href = "bar.txt">
How do I get foo.txt and bar.txt?
I did this:
>>> links = soup.findAll('a')
But I can't figure out how to extract the complete string from each link. Any suggestions?
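To make the goal concrete, here is a minimal sketch of the result I am after. It assumes BeautifulSoup 4 (the bs4 package, where find_all is the current name for findAll) and the built-in "html.parser" backend. The sample HTML, the extra index.html link, and the name txt_links are made up for illustration and are not from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: two .txt links from the question plus one unrelated link.
html = """
<a href = "foo.txt">foo</a>
<a href = "bar.txt">bar</a>
<a href = "index.html">home</a>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True skips <a> tags that have no href attribute at all.
# Each tag behaves like a dict for its attributes, so a["href"] is the full string.
txt_links = [
    a["href"]
    for a in soup.find_all("a", href=True)
    if a["href"].endswith(".txt")
]

print(txt_links)
```

If the links can carry query strings or upper-case extensions, the endswith check would probably need to be loosened, for example by lower-casing the value first.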