
I've been trying to extract all the hyperlinks from an internal SharePoint website using Beautiful Soup in Python, but whenever I run the program I get zero results. When I check the website's view-source, it doesn't show any hyperlinks either. However, I can see all the links using the Inspect option in the browser. Is there any way I can extract those links using Python?

Code:

    import getpass
    import requests
    from requests_ntlm import HttpNtlmAuth
    from bs4 import BeautifulSoup

    def main():
        r = requests.get('https://abc[.]com/query?',
                         auth=HttpNtlmAuth(spuser, getpass.getpass()))
        print(r.status_code)
        soup = BeautifulSoup(r.content, "html.parser")
        for link in soup.find_all('div', {'class': "list_episode"}):
            print(link)

The above code produces no results.
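A quick way to confirm the symptom described above (links visible in Inspect but absent from view-source) is to count the `<a>` tags in the raw HTML. A minimal sketch, with hypothetical markup standing in for the SharePoint response:

```python
from bs4 import BeautifulSoup

# Hypothetical page shell: what a JavaScript-heavy site often returns
# as raw HTML -- an empty container plus a script bundle, no <a> tags.
raw_html = """
<html><body>
  <div id="app"></div>
  <script src="bundle.js"></script>
</body></html>
"""

soup = BeautifulSoup(raw_html, "html.parser")
print(len(soup.find_all("a")))  # 0: the links only exist after the JS runs
```

If this prints 0 while the browser's Inspect view shows anchors, the links are being injected client-side and requests/BeautifulSoup alone will never see them.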

tomerpacific
    You'll need to give us a sample of the code on the page. – yuuuu Dec 27 '21 at 18:09
  • Possible duplicate: https://stackoverflow.com/questions/1080411/retrieve-links-from-web-page-using-python-and-beautifulsoup – Ricky Dec 27 '21 at 18:10

2 Answers


When I checked the view source of the website, it also doesn't show any hyperlinks.

The site may be using JavaScript to fill in the links dynamically.

If so, you will likely need a browser to run the JavaScript before you parse the links.

Selenium is a tool you can drive from Python to access those links. See: https://selenium-python.readthedocs.io
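The approach above can be sketched as follows (an untested outline; `extract_links` is a hypothetical helper name, and the Selenium imports are kept inside the function so the sketch loads even without Selenium installed):

```python
def extract_links(url):
    """Render the page in a real browser, then collect every href.

    Sketch only: assumes Chrome and a matching chromedriver are on PATH.
    """
    # Local imports so this module still loads without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # By the time find_elements runs, the browser has executed the
        # page's JavaScript, so dynamically inserted links are present.
        anchors = driver.find_elements(By.TAG_NAME, "a")
        return [a.get_attribute("href") for a in anchors]
    finally:
        driver.quit()
```

For an NTLM-protected SharePoint site you may also need to deal with the browser's authentication prompt (for example by reusing a logged-in browser profile), which Selenium alone does not handle.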

Raymond Hettinger
from bs4 import BeautifulSoup as Soup
import requests

url = "https://stackoverflow.com"
page = requests.get(url)

# Pass the response body (page.text), not the Response object itself
soup = Soup(page.text, "lxml")

links = [link.get('href') for link in soup.find_all('a')]

If this does not work, then submit another question with your source code and the exact error.
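Note that hrefs collected this way are often relative paths; the standard library's urllib.parse.urljoin can resolve them against the page URL (the sample URLs below are illustrative):

```python
from urllib.parse import urljoin

base = "https://stackoverflow.com/questions"
hrefs = ["/users/login", "https://example.com/x", "tagged/python"]

# urljoin leaves absolute URLs alone and resolves relative ones
# against the base URL.
absolute = [urljoin(base, h) for h in hrefs]
print(absolute)
```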

Kían