I'm trying to scrape a website in PHP using DOMDocument.

However, when I view the source of the site I'm trying to scrape, it doesn't contain the actual text. Instead, it has what look like placeholders in double curly brackets:

<td>{{variable.code}}</td>

So when you view the website in a browser, {{variable.code}} is displayed as text such as ABC1234, but the page source contains only the placeholder shown above.

I'm not sure what's going on, whether this is a PHP thing or a JavaScript thing. Googling about curly brackets didn't reduce my confusion too much. Either way, I'm looking for some advice as to how I can extract the correct values.

halfer
  • 3
    You need a web scraper that can run JavaScript. I suspect it is running Angular or something like that. Consider running something like PhantomJS to do your scraping - it will be slower than DOMDocument, but it will run the JS you need to view the actual content. – halfer May 23 '18 at 15:37
  • 1
    Those are placeholders used by a page built with a JavaScript framework such as Vue or AngularJS. – Get Off My Lawn May 23 '18 at 15:37
  • 1
    I think the double curly brace is an angular (javascript) thing https://stackoverflow.com/questions/17878560/difference-between-double-and-single-curly-brace-in-angular-js – IrkenInvader May 23 '18 at 15:37
  • If you want the data you will need to run the javascript on the page. You can use http://phantomjs.org/ to do that. – Get Off My Lawn May 23 '18 at 15:40
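
To illustrate what the comments above describe, here is a minimal sketch of how a client-side framework might fill in such placeholders after the page loads. This is not Angular's or Vue's actual implementation; `interpolate` and `scope` are hypothetical names used only for illustration, and the value ABC1234 is taken from the question. It shows why the raw HTML fetched by DOMDocument still contains `{{variable.code}}`: the substitution only happens once JavaScript runs in a browser.

```javascript
// Hypothetical helper: replaces {{path.to.value}} placeholders in a template
// string with values looked up in `scope`. Unknown paths are left untouched.
function interpolate(template, scope) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    // Walk the dotted path (e.g. "variable.code") through the scope object.
    const value = path.split('.').reduce(
      (obj, key) => (obj == null ? undefined : obj[key]),
      scope
    );
    return value == null ? match : String(value);
  });
}

// What DOMDocument sees in the raw page source:
const rawHtml = '<td>{{variable.code}}</td>';

// Data the framework would typically load at runtime (assumed shape):
const scope = { variable: { code: 'ABC1234' } };

// What the browser shows after the JavaScript has run:
console.log(interpolate(rawHtml, scope));
```

Because the data usually arrives through a separate request made by the page's JavaScript, a plain HTML parser never sees the final values. That is why the comments suggest a scraper that executes JavaScript.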

0 Answers