
I'm looking for a JavaScript template or starting point that does the same thing as the PHP script below, but asynchronously (?), so that the output always matches the content of the target website, with no delay.

    <?php
    $html = file_get_contents('https://www.restockcrc.com/all/');

    $re = '/<h4 class="card-title">.+?<a href="(\S+)">(.*?)<\/a>/ms';
    preg_match_all($re, $html, $matches, PREG_SET_ORDER, 0);

    foreach ($matches as $item) {
        echo sprintf('<a href="%s">%s</a><br>', $item[1], $item[2]); // Print results
    }

I've been playing with Puppeteer but the documentation is huge and it's hard to find what I need.

I have Node.js installed, along with Jade/Express.

  • It's not very clear what you're asking for help with. Do you want real-time updating for some value in a web page based on some change on the server? Or are you just asking how to use Jade/Express in node.js to render a template? And it's completely unclear what Puppeteer has to do with this at all, since that is typically used for a server to get information from other websites. – jfriend00 Jul 31 '19 at 22:11
  • aka how do I screen scrape with node. – epascarello Jul 31 '19 at 22:15
  • Clarifications: I want real-time updating for some values (card-title) of restockcrc.com/all. I was under the impression that "get information from other web-site" was the whole point(?), unless I'm not using the correct terms. Try the PHP code in a basic PHP file. This output needs to be generated, in real time, from the restock URL provided above. –  Jul 31 '19 at 22:17
  • 1. [parsing html with regex is of the devil](https://stackoverflow.com/a/1732454/5734311) 2. you don't need puppeteer to load a webpage from a domain 3. use the `http` module to grab the document, use jsdom to parse it –  Jul 31 '19 at 22:22
  • Would I use `var request = require("request"); request("http://www.google.com",function(error,response,body)` and then filter with jsdom? –  Jul 31 '19 at 22:38
  • Basically yes, or `node-fetch`, or `https`, whatever you prefer. If you get stuck, put your attempt in the question and describe how it fails. –  Jul 31 '19 at 22:53
  • Thank you for clarifying. Could you point me to a template showing how to filter the output of request so that it only shows the h4 elements with class "card-title"? As soon as I have a lead I can start. –  Jul 31 '19 at 23:03
  • `const links = new JSDOM(html).window.document.querySelectorAll('.card-title a');` –  Jul 31 '19 at 23:12
  • That code looks familiar :P Can you post your node code? – atymic Jul 31 '19 at 23:15
  • Let's cut this short: https://pastebin.com/FGnpTM38 –  Jul 31 '19 at 23:17
  • Hahaha, Atymic, wow. –  Jul 31 '19 at 23:19
  • @Chris This works fine. Is there a way to load it again as soon as it has finished loading (i.e. loop it)? –  Jul 31 '19 at 23:30
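
The approach discussed in the comments (fetch the page, select `.card-title a` with jsdom, then repeat) might look roughly like the sketch below. It is not the code from the pastebin link, which is not reproduced here. It assumes Node 18 or later (for the built-in `fetch`) and that `jsdom` has been installed with npm; `scrapeLinks` and `pollForChanges` are hypothetical helper names, and the 5-second default interval is an arbitrary choice.

```javascript
// Sketch only: assumes Node 18+ (global fetch) and `npm install jsdom`.

// Fetch a page and return { href, text } for every link inside a .card-title element.
async function scrapeLinks(url) {
  // Required lazily so pollForChanges below can be used without jsdom.
  const { JSDOM } = require('jsdom');
  const response = await fetch(url);
  if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
  const html = await response.text();
  const anchors = new JSDOM(html).window.document.querySelectorAll('.card-title a');
  return Array.from(anchors, (a) => ({
    href: a.getAttribute('href'),
    text: a.textContent.trim(),
  }));
}

// Hypothetical helper: call getItems() again intervalMs after each run finishes,
// and invoke onChange only when the result differs from the previous one.
// Returns a function that stops the loop.
function pollForChanges(getItems, onChange, intervalMs = 5000) {
  let previous = null;
  let stopped = false;
  let timer = null;

  async function tick() {
    if (stopped) return;
    try {
      const items = await getItems();
      const snapshot = JSON.stringify(items);
      if (!stopped && snapshot !== previous) {
        previous = snapshot;
        onChange(items);
      }
    } catch (err) {
      // A failed request should not end the loop; log it and try again later.
      console.error('poll failed:', err.message);
    }
    if (!stopped) timer = setTimeout(tick, intervalMs);
  }

  tick();
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}
```

Calling `pollForChanges(() => scrapeLinks('https://www.restockcrc.com/all/'), console.log, 10000)` would print the link list whenever it changes. Polling cannot be truly delay-free: the output lags the site by up to one interval, and very short intervals put extra load on the target server.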
