I have a webpage that I am scraping using Beautiful Soup. I've got the HTML, but now I need the CSS. I have tried using soup.findAll('link', {'rel': 'stylesheet'}), but I can't figure out how to get the filename from the first element of the returned list, <link href="styles.css" rel="stylesheet"/>.

I have also tried using regex, which I'm not very good at, and I can't get it to work.

So, is there a BeautifulSoup function that I can use or do I have to go the route I'm already taking?

Jordan Baron

1 Answer

If you're using BeautifulSoup4:

# href=True keeps only the <link> tags that actually have an href attribute
for link in soup.find_all('link', href=True):
    print("Found the URL:", link['href'])

If you're using version 3:

for link in soup.findAll('link', href=True):
    print "Found the URL:", link['href']
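
Note that the loops above match every link tag with an href, which can include icons and other non-CSS resources. As a minimal sketch for BeautifulSoup4, the search can be narrowed to stylesheets by also filtering on rel. The sample HTML here is made up for illustration, apart from the stylesheet tag taken from the question, and the variable name stylesheet_hrefs is just a placeholder.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; only the stylesheet tag comes from the question.
html = """
<html><head>
<link href="favicon.ico" rel="icon"/>
<link href="styles.css" rel="stylesheet"/>
</head><body></body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# rel="stylesheet" keeps tags whose rel values include "stylesheet";
# href=True skips any <link> tag that has no href attribute.
stylesheet_hrefs = [
    link["href"]
    for link in soup.find_all("link", rel="stylesheet", href=True)
]

print(stylesheet_hrefs)  # ['styles.css']
```

Indexing a tag like a dictionary (link["href"]) is what returns the attribute value, so no regex is needed.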
Melchia
    Another way of finding the link, just tried it so thought I would share. BeautifulSoup4: pass the tag's rel attribute as a filter - `link_tags = soup.findAll("link", rel="stylesheet")` and then `for each_tag in link_tags: csslink = each_tag["href"]` - source https://stackoverflow.com/questions/2612548/extracting-an-attribute-value-with-beautifulsoup – rishijd Feb 22 '18 at 22:43
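
An href such as styles.css is usually relative to the page it came from, so it may need resolving before the CSS file itself can be downloaded. A rough sketch using the standard library's urljoin follows. The page URL and the second and third hrefs are hypothetical values chosen only to show the three common cases.

```python
from urllib.parse import urljoin

# Hypothetical URL of the page that was scraped.
page_url = "https://example.com/blog/post.html"

# Example href values: relative, root-relative, and already absolute.
hrefs = ["styles.css", "/static/site.css", "https://cdn.example.com/lib.css"]

# urljoin resolves each href against the page URL, leaving absolute URLs as they are.
css_urls = [urljoin(page_url, href) for href in hrefs]

for url in css_urls:
    print(url)
# https://example.com/blog/styles.css
# https://example.com/static/site.css
# https://cdn.example.com/lib.css
```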