0

How can I retrieve the HTML page source using selenium java?

Thomas Martin

2 Answers

3

getPageSource()

getPageSource() returns the source of the last loaded page. If the page has been modified after loading (for example, by JavaScript), there is no guarantee that the returned text reflects the modified page. The returned source is a representation of the underlying DOM, so do not expect it to be formatted or escaped in the same way as the response sent from the web server.


Page Source

To get the Page Source you can use the following solution:

driver.get("https://www.google.com/");
System.out.println(driver.getPageSource());
driver.quit();
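
If you want to keep the source instead of printing it, one option is to write the returned string to a file. The following is only a minimal sketch, assuming Java 11 or later: PageSourceSaver and save are hypothetical names (not part of Selenium), and the html argument stands in for the value returned by driver.getPageSource().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; it is not part of the Selenium API.
final class PageSourceSaver {
    private PageSourceSaver() {}

    // Writes the given HTML to the target file as UTF-8, creating the file
    // or overwriting it if it already exists. Returns the target path.
    static Path save(String html, Path target) {
        try {
            return Files.writeString(target, html, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }
}
```

A call would look something like PageSourceSaver.save(driver.getPageSource(), Path.of("page.html")), made before driver.quit().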

Element HTML

To get the HTML of a single WebElement, for example the search box on the Google home page, induce WebDriverWait for visibilityOfElementLocated() and then call getAttribute("outerHTML") on the returned element:

driver.get("https://www.google.com/");
WebElement inputField = new WebDriverWait(driver, Duration.ofSeconds(5)).until(ExpectedConditions.visibilityOfElementLocated(By.name("q")));
System.out.println(inputField.getAttribute("outerHTML"));

Console Output:

<input class="gLFyf" jsaction="paste:puy29d;" maxlength="2048" name="q" type="text" autocapitalize="off" autocomplete="off" autocorrect="off" autofocus="" role="combobox" spellcheck="false" title="Search" value="" aria-label="Search" data-ved="0ahUKEwjXj4ic9_H9AhVXAd4KHXJjCk0Q39UDCAQ">
undetected Selenium
0
WebDriver driver = new ChromeDriver();
driver.get("https://www.google.com/");
String str = driver.getPageSource();
System.out.println(str);

For Python, see https://www.tutorialspoint.com/get-html-source-of-webelement-in-selenium-webdriver-using-python