I am trying to extract some information from a PDF embedded in a web page using Python and requests. The exact text I want to reach is « Sciences de la vie et de l’environnement ».
Here is the code I wrote:
import time
import requests
from bs4 import BeautifulSoup

# website to scrape
url = "https://fs.uit.ac.ma/avis-de-soutenance-dune-these-de-doctorat-mme-achachi-hind/"

with requests.session() as s:
    # fetch the page
    html_content = s.get(url, verify=False)
    # parse the HTML content
    soup = BeautifulSoup(html_content.content, "html.parser")
    # the PDF viewer is embedded in an iframe; take its source URL
    url2 = soup.iframe["src"]
    # fetch the iframe document and print its HTML
    html_doc = s.get(url2, verify=False).text
    print(html_doc)
Here is part of what print(html_doc) outputs.
Comparing it with what the browser's inspector shows, the following element comes back empty in the response, so I can't see its contents:
<div id="viewer" class="pdfViewer"></div>
In the browser, the text I want appears inside this element.
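
For context, the empty `viewer` div appears to be a PDF.js viewer container. Such a container is filled in by JavaScript in the browser, so a plain requests call would only ever see it empty. One possible direction is to skip the viewer and read the PDF itself. The sketch below rests on an assumption not confirmed by the page: that the iframe `src` carries the PDF address in a `file` query parameter, as PDF.js viewers commonly do. The helper name `pdf_url_from_viewer` is hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urljoin, urlsplit


def pdf_url_from_viewer(viewer_url: str) -> Optional[str]:
    """Return the PDF address found in the viewer URL, or None.

    Assumption: the viewer receives the PDF location in a `file`
    query parameter. If the site uses another parameter name,
    this returns None.
    """
    # parse_qs also decodes percent-encoded values such as %3A%2F%2F
    query = parse_qs(urlsplit(viewer_url).query)
    values = query.get("file")
    if not values:
        return None
    # The value may be relative to the viewer page, so resolve it
    return urljoin(viewer_url, values[0])
```

If this returns a URL for `url2`, the PDF bytes could then be downloaded with `s.get(pdf_url, verify=False).content` and passed to a PDF text-extraction library (for example `pypdf`, via `PdfReader(io.BytesIO(data)).pages[0].extract_text()`) to search for the sentence. If it returns None, the viewer probably gets its file some other way, and the iframe HTML or its scripts would need a closer look.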