I have a test project using a library that supports crawling (openbuilding spiderling). The problem occurs when I crawl the URL "https://examlple.com". This page contains an iframe from "https://iframe.com".

I want to get the p element(s) inside the iframe, but right now I can only get them by visiting iframe.com directly. Is there any way to get the p elements without visiting iframe.com, for example by waiting for the iframe to load? Thank you!

Ngo Tuan

1 Answer

No, you cannot spider an iframe's contents from the parent page. The closest you can get is to note the URL of the iframe and then spider that URL independently.
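
As a rough illustration of that two-step approach, here is a minimal Python sketch using only the standard library. It is not the spiderling library's API; the helper names `find_iframe_urls` and `extract_paragraphs` are made up for this example, and it assumes the iframe's `src` attribute is present in the parent page's static HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeSrcParser(HTMLParser):
    """Collects the absolute URL of every <iframe src="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.iframe_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative src values against the parent page URL.
                self.iframe_urls.append(urljoin(self.base_url, src))


class ParagraphParser(HTMLParser):
    """Collects the text content of every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.paragraphs[-1] += data


def find_iframe_urls(html, base_url):
    """Step 1: note the iframe URLs found in the parent page's HTML."""
    parser = IframeSrcParser(base_url)
    parser.feed(html)
    return parser.iframe_urls


def extract_paragraphs(html):
    """Step 2: run this on the HTML fetched separately from each iframe URL."""
    parser = ParagraphParser()
    parser.feed(html)
    return parser.paragraphs
```

In practice you would fetch the parent page, pass its HTML to `find_iframe_urls`, then fetch each returned URL as its own request (for example with `urllib.request.urlopen`) and pass that response to `extract_paragraphs`. The crawler still visits the iframe's URL; it just does so as a separate step rather than through the parent page.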

Think of an iframe as a sandboxed, protective container that only lets you view its contents visually and nothing more: no spidering and no talking to it (unless you own the framed page and are working with JavaScript's Window.postMessage() or similar).

Samuel MacLachlan