
I can't figure out how to get access to the full source of the HTML page including iframes. It should be similar to what we see in DevTools > Elements, but via Electron.

By source I mean either a text representation of the DOM (including the content of all iframes on the page), or a list of all elements together with a way to access their text representations.

Any help is highly appreciated! Thanks.

Mark Dolbyrev
  • Can you clarify the question? What do you mean by "but via Electron"? Do you want some programmatic way to get the DOM from the main process, or something else? – pushkin Jul 01 '22 at 15:17
  • @pushkin I want to open an HTML page in the Electron app. The page contains some iframes, and I want a way to read the full DOM source (e.g. like Elements in Chrome DevTools) from code; it doesn't matter whether that code runs in the main or the renderer process. – Mark Dolbyrev Jul 02 '22 at 16:56
  • Do you just want a string of all the HTML, or do you want to be able to manipulate the DOM by retrieving actual DOM elements and doing operations on them? I don't think the latter is possible, but the former is straightforward – pushkin Jul 05 '22 at 13:23
  • I'll post an answer just to explain what I had in mind, but some additional questions: are we talking about a local HTML file? (and I forgot my other question but will post if I remember) – pushkin Jul 05 '22 at 13:37
  • @pushkin I am ok with just having a string of the HTML DOM (including content of all iframes on the page). – Mark Dolbyrev Jul 06 '22 at 07:16

1 Answer


If you're just looking to get a string of all the HTML, you can do so via the `webContents.executeJavaScript` API:

const {app, BrowserWindow, dialog} = require('electron')

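async function createWindow () {
  const mainWindow = new BrowserWindow()

  // Wait for the page to finish loading before reading its DOM
  await mainWindow.loadFile('index.html')

  // Runs in the renderer; the resulting string is sent back to the main process
  const result = await mainWindow.webContents.executeJavaScript('document.documentElement.outerHTML')
  dialog.showMessageBox(mainWindow, {
    message: result
  })
}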

app.whenReady().then(() => {
  createWindow()

  app.on('activate', function () {
    if (BrowserWindow.getAllWindows().length === 0) createWindow()
  })
})

app.on('window-all-closed', function () {
  if (process.platform !== 'darwin') app.quit()
})

For an HTML page like:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <link href="./styles.css" rel="stylesheet">
    <title>Hello World!</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    We are using Node.js <span id="node-version"></span>,
    Chromium <span id="chrome-version"></span>,
    and Electron <span id="electron-version"></span>.
    <iframe src="https://google.com/chrome"></iframe>
  </body>
</html>

You'll get a dialog like this:

[image: dialog showing the page's HTML]

It is not, however, possible to obtain live DOM elements in the main process and manipulate them there; only serializable values such as strings can be returned from the renderer.
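
Because the script can return any serializable value, one option is to have it also walk the iframes that the page is allowed to read. The following is a minimal sketch, not a tested Electron recipe: the helper name `serializeWithSameOriginFrames` is hypothetical, and it assumes that `contentDocument` is `null` (or throws) for cross-origin frames, so those are simply skipped.

```javascript
// Hypothetical helper: returns an array of HTML strings, starting with the
// given document and followed by every iframe document it can read.
function serializeWithSameOriginFrames(doc) {
  const parts = [doc.documentElement.outerHTML];
  for (const frame of doc.getElementsByTagName('iframe')) {
    let inner = null;
    try {
      // null for cross-origin frames; guarded in case access throws
      inner = frame.contentDocument;
    } catch (err) {
      inner = null;
    }
    if (inner) {
      // Recurse so nested same-origin iframes are included too
      parts.push(...serializeWithSameOriginFrames(inner));
    }
  }
  return parts;
}
```

To run it in the page, the function source could be sent as a string, for example `mainWindow.webContents.executeJavaScript('(' + serializeWithSameOriginFrames + ')(document)')`. Cross-origin iframes such as the one in the sample page would still be missing from the result.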

pushkin
  • The problem with your solution is that it doesn't include the content of the iframe, which I mentioned in the original question. – Mark Dolbyrev Jul 06 '22 at 07:15
  • I don't think you can get the content if it's a cross origin iframe. See [here](https://stackoverflow.com/questions/6170925/get-dom-content-of-cross-domain-iframe#:~:text=You%20can't.,browser%20will%20allow%20you%20that.) – pushkin Jul 06 '22 at 13:50
  • You can try grabbing all iframes with `getElementsByTagName("iframe")`, looping over them, and doing something like `frame.contentWindow.document.documentElement.innerHTML`, but for a site like google.com/chrome that returns `"<head></head><body></body>"` – pushkin Jul 06 '22 at 13:52
  • Yeah, I know that regular JS can't do it because of different security policies, but I thought there is a way to do it via electron API. – Mark Dolbyrev Jul 08 '22 at 10:00
  • Hm, I can try later, but maybe you can grab the [webFrameMain](https://www.electronjs.org/docs/latest/api/web-frame-main), go through the frames in the webContents, and run the executeJavaScript function on each of them, logging `innerHTML`. That could work. Do you care if the HTML markup is out of order? For example, is it OK if the HTML from above is prepended to the strings of the iframe contents? – pushkin Jul 09 '22 at 17:57
  • I'm not sure how to do this off the top of my head, but I wanted to weigh in and say that this probably _is_ possible, as Electron can be set to ignore security policies. I doubt that just disabling `webSecurity` would work, but you could try that. – Slbox Jul 09 '22 at 21:18
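
The `webFrameMain` idea from the comments could be sketched roughly as follows. This is an untested sketch rather than a confirmed solution: the helper name `collectFrameHtml` is hypothetical, and it relies on each frame object exposing `framesInSubtree`, `url`, and `executeJavaScript`, as Electron's `WebFrameMain` does. Whether every subframe has finished loading by the time it runs is an assumption you would need to check for your page.

```javascript
// Hypothetical helper: asks every frame in the subtree (the root frame
// included) for its own serialized HTML and returns one entry per frame.
async function collectFrameHtml(rootFrame) {
  const results = [];
  for (const frame of rootFrame.framesInSubtree) {
    try {
      // The script runs inside that frame, so `document` is the frame's own document
      const html = await frame.executeJavaScript('document.documentElement.outerHTML');
      results.push({ url: frame.url, html });
    } catch (err) {
      // A frame may have been destroyed or may refuse the script; record and continue
      results.push({ url: frame.url, html: null, error: String(err) });
    }
  }
  return results;
}
```

In the main process this might be called as `collectFrameHtml(mainWindow.webContents.mainFrame)` after `loadFile` resolves. The result is a flat list in frame-tree order, so the iframe markup is not spliced back into the parent HTML.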