I apologise in advance for the (probably) very basic question. I spent a lot of time searching forums but my knowledge is too poor to make sense of the results.

I just need to get the HTML after the page has finished loading, as almost all of the content is inside <div id="root"></div>, but at the moment I just get that one line and nothing inside it.

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver

browser = webdriver.Chrome()  # replace with .Firefox(), or with the browser of your choice
url = "https://beta.footballindex.co.uk/top-200"
browser.get(url)  # navigate to the page

innerHTML = browser.execute_script("return document.body.innerHTML")  # returns the inner HTML as a string
print(innerHTML)

Returns:

<div id="root"></div>
<script src="https://static.footballindex.co.uk/bundle_1537553245755.js"></script>

This matches what I see with 'view page source'. But if I inspect the element in my browser, I can expand <div id="root"></div> to see all the content inside, and then manually copy all the HTML.

How do I get this automatically?

Many thanks in advance.

  • You have to wait for the page to be ready. https://stackoverflow.com/a/30385843/2156813 – pawelbylina Oct 15 '18 at 09:33
  • In your case, html is being created via JS at the time of loading. You can validate it by viewing the source code of your url – Mohammad Zain Abbas Oct 15 '18 at 10:32
  • @MohammadZainAbbas thank you for your answer. How do I get Python / Selenium to get the HTML once it has loaded? At the moment I am only getting it before loading, which is just equivalent to 'view page source' and doesn't contain what I need. Thanks. – Matt Fretwell Oct 17 '18 at 03:16
