
I want to get the HTML and the objects rendered on the site https://study.innspector.dbvis.de/. I'm using Python and first tried Selenium with BeautifulSoup:

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://study.innspector.dbvis.de/')
content = driver.page_source  # serialized DOM at the moment of this call
soup = BeautifulSoup(content, "html5lib")
print(soup)

I also tried Playwright:

from playwright.sync_api import sync_playwright

with sync_playwright() as playwright:
    browser = playwright.chromium.launch(headless=False)
    page = browser.new_page()
    page.set_viewport_size({"width": 2220, "height": 1280})

    page.goto("https://study.innspector.dbvis.de/")
    # Read the body's HTML right after navigation
    skeleton = page.evaluate('document.body.innerHTML')

All I'm getting is <div class="bg-white h-full" id="app"></div> <script src="/index.38aaab33.js" type="module"></script>. I want to get the elements rendered by the client-side JavaScript. Thank you.

Pranav Harer
  • Nope, that site's body is exactly what you get – Jaromanda X Aug 09 '23 at 09:28
  • Go to the site, and use the "view source" option your browser offers (view source, _not_ "inspect elements" to view the DOM) - _that_ is what you got returned from the server, for the URL you requested. Everything else you see, has been added via client-side JavaScript. – CBroe Aug 09 '23 at 09:29
  • So how can I get all those elements? Is there a way to get them? That's what I want. – Pranav Harer Aug 09 '23 at 10:29
  • Check the duplicate above, that this question was closed with. – CBroe Aug 09 '23 at 10:49
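
For what it's worth, the usual approach to the situation described in the comments is to wait until the client-side script has mounted something inside the root element before reading the DOM. Below is a minimal sketch of that idea with Playwright, not a verified solution for this particular site. The function names `fetch_rendered_html` and `app_root_is_empty`, the 15-second timeout, and the assumption that the app renders into the `#app` element shown in the question are all illustrative choices of mine.

```python
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _ChildTagCounter(HTMLParser):
    """Counts the tags nested inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # 0 means we are outside the target element
        self.children = 0   # number of tags seen inside the target

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            self.children += 1
            if tag not in VOID_TAGS:
                self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


def app_root_is_empty(html, root_id="app"):
    """Return True if the root element has no nested tags (or is missing)."""
    counter = _ChildTagCounter(root_id)
    counter.feed(html)
    return counter.children == 0


def fetch_rendered_html(url, root_selector="#app", timeout_ms=15000):
    """Load url and return the page HTML once the root element has a child.

    Assumes the app renders into root_selector; raises Playwright's
    TimeoutError if nothing is attached within timeout_ms.
    """
    # Third-party dependency, imported here so the helper above works without it.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url)
            # Wait until at least one element exists inside the root.
            page.wait_for_selector(f"{root_selector} > *",
                                   state="attached", timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()


if __name__ == "__main__":
    html = fetch_rendered_html("https://study.innspector.dbvis.de/")
    print("still an empty shell:", app_root_is_empty(html))
```

If the wait times out, the app may need more than a page load to render (for example a login or some user interaction), in which case waiting longer will not help. `app_root_is_empty` is only a quick sanity check that the returned HTML is more than the bare skeleton from the question.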
