I am trying to crawl this website https://www.wego.ae/en/flights/searches/cDXB-cSFO-2020-03-09:cSFO-cDXB-2020-03-22/economy/1a:0c:0i?sort=price&order=asc and I can't find any of the actual page content in the HTML shown in the inspector.

Are they using some sort of frame to hide the code? Has anyone had experience with this?

Valentino
  • I was able to right click and open dev tools and view the HTML on the page, such as buttons, flight times, etc. I just expanded all of the elements under the `` tag. What's the issue you are seeing here? – CEH Nov 05 '19 at 21:28
  • You can right click on elements of course, but if you go to the inspector and copy the whole body HTML you will not see any of the content. It is like some sort of frame, as I said before, but I'm not sure what kind (not an iframe). And basically, when crawling the website, you get the HTML but none of the content is there. – Valentino Nov 05 '19 at 21:36
  • Why are you trying to copy the whole body HTML code? There are many iframes on this page, but I've never seen them cause issues like this. Elements loaded dynamically by JS are a more common source of problems. What happens when you run Selenium code against this website and try to locate elements? Is an error message displayed? – CEH Nov 05 '19 at 21:41
  • When using a simple `get` in Selenium, you see the same thing as when you do a "view page source" in the browser, so none of the dynamic content is there. – Valentino Nov 05 '19 at 23:02

1 Answer

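The content of the page lives inside a shadow DOM, which the dev tools display as a `#shadow-root` node under its host element. That is why it does not show up in the plain page source. You can read more about `ShadowRoot` in the Mozilla Developer docs.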

If you need help traversing elements within a shadow DOM, you can reference this answer.
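
As a rough sketch of the JavaScript-executor approach with Selenium in Python: the idea is to walk from `document` through each shadow host's `shadowRoot` and then query the target element. This assumes the shadow roots are open (a closed root returns `null` for `shadowRoot`) and that the content has already rendered. The helper names `build_shadow_query` and `find_in_shadow` and the selectors in the usage comment are made up for illustration, not taken from the site.

```python
import json


def build_shadow_query(host_selectors, target_selector):
    """Build a JS snippet that descends through nested shadow hosts.

    host_selectors: CSS selectors for each shadow host, outermost first.
    target_selector: CSS selector for the element inside the last shadow root.
    """
    # json.dumps quotes and escapes each selector as a JS string literal.
    steps = "".join(
        f".querySelector({json.dumps(sel)}).shadowRoot" for sel in host_selectors
    )
    return f"return document{steps}.querySelector({json.dumps(target_selector)});"


def find_in_shadow(driver, host_selectors, target_selector):
    """Run the generated snippet through the driver's JavaScript executor."""
    return driver.execute_script(
        build_shadow_query(host_selectors, target_selector)
    )


# Usage (hypothetical selectors, adjust to what the inspector shows):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get("https://www.wego.ae/en/flights/searches/...")
#   element = find_in_shadow(driver, ["#outer-host", "#inner-host"], ".price")
```

If any host in the chain is missing or not yet rendered, the script raises a JavaScript error, so in practice you would probably wait for the page to finish loading before calling it.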

Lorn
  • Good info! Do you know if this works with selenium and firefox? I see the example you give is for chrome driver – Valentino Nov 05 '19 at 23:24
  • This article has a great overview: https://www.seleniumeasy.com/selenium-tutorials/accessing-shadow-dom-elements-with-webdriver. In Chrome, you can get the `#shadow-root` node using the JavaScript executor, then access elements within it. In Firefox, it depends on the version: Shadow DOM is not enabled by default before v63. – Lorn Nov 06 '19 at 04:02