I'm collecting titles and images from a website into a human-readable format.

I use fs.writeFile, and the options are:

  1. save as HTML (where it's opened locally), or
  2. have it sent by email with nodemailer.

Either way I need the info in table format in HTML. Top row = Title, Price, Image (displayed, not link). Columns = list of items.

I added a portion to convert the JSON to an HTML table, but it's broken and the script no longer runs. The error is "document is not defined" (in the table creation code).

Separately, if there's any way to auto-send the list by email daily without maintaining a server, kindly let me know too.

const puppeteer = require('puppeteer');
const fs = require('fs');

/* this gets the json data, all working ok */
async function newCam() {
   const browser = await puppeteer.launch({ headless: false });
   let page = await browser.newPage();
   await page.goto('https://sg.carousell.com/search/products/?query=camera', { waitUntil: 'networkidle2' });
   let results = []; 
   let elements = await page.$$('div.U-U');
   for (let element of elements) {
      let listTitle  = await element.$eval('div.U-m', node => node.innerText.trim());
      let listImg    = await element.$eval('.U-p img', img => img.src);
      let listPrice  = await element.$eval('div.U-k :nth-child(1)', node => node.innerText.trim());
      results.push({ 
         'Title': listTitle,
         'Img':   listImg,
         'Px':    listPrice 
      });
   }
   await browser.close();
   return results;


   /* format json into table and feed into fs below */
      // get header keys
      var col = [];
      for (var i = 0; i < results.length; i++) {
         for (var key in results[i]) {
               if (col.indexOf(key) === -1) { col.push(key); }
         }
      }

      // create table 
      var table = document.createElement("table");
      var tr = table.insertRow(-1);                   // insert header row.
      for (var k = 0; k < col.length; k++) {
         var th = document.createElement("th");      // fill header
         th.innerHTML = col[k];
         tr.appendChild(th);
      }
      // add json data as rows
      for (var a = 0; a < results.length; a++) {
         tr = table.insertRow(-1);
         for (var f = 0; f < col.length; f++) {
               var tabCell = tr.insertCell(-1);
               tabCell.innerHTML = results[a][col[f]];
         }
      }

   /* save to html on local drive with fs */ 
   fs.writeFile('/data.html', table, (err) => {
      if (err) throw err;
   });
}
newCam();
Thinkerer

1 Answer

Why your code is not working

You are trying to use the DOM inside the Node.js environment. Node.js executes JavaScript outside the browser, so there are no DOM globals (like window or document) that you can access. That is why you get the error "document is not defined".

For more information regarding that topic you might want to check out the question "Why doesn't Node.js have a native DOM?"
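To make the difference concrete, here is a minimal sketch of a check you could run with plain Node.js. The helper name hasDom is made up for this illustration; it only tests whether the browser globals exist.

```javascript
// Returns true only when browser-style DOM globals are available.
// (hasDom is a hypothetical helper name used for illustration.)
function hasDom() {
    return typeof document !== 'undefined' && typeof window !== 'undefined';
}

if (hasDom()) {
    console.log('DOM available: document.createElement can be used.');
} else {
    console.log('No DOM here: build the HTML as a string instead.');
}
```

Run with plain Node.js, this should take the second branch, which is the situation your table code is in.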

Table creation

If you want to create the markup of an HTML table, you can either use string concatenation and simply build the table on your own, or use something like jsdom to simulate a DOM on the server side.

As your case seems to be rather simple, I would go with the first option.

Here is some rather simple code to create the HTML markup for the table. You can put it into your code in place of your "create table" code, and it will produce a table with one column for each value inside col.

function escapeHtml(str) { // for security reasons escape HTML special characters (you could even improve this)
    return String(str)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

const htmlTable = '<table>'
    // header row: join('') is needed, otherwise the array is rendered with commas between the cells
    + `\n <tr>${col.map(c => '<th>' + escapeHtml(c) + '</th>').join('')}</tr>`
    + results // generate rows, use map function to map values to trs/tds
        .map(row => ('\n <tr>' +
            col.map(c => `\n  <td>${escapeHtml(row[c])}</td>`).join('')
        + '\n </tr>')).join('')
    + '\n</table>';

fs.writeFile('/data.html', htmlTable, (err) => {
    // ...
});

Of course, this code is a rather simple example to get you started.
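Since you want the image displayed rather than shown as a link, here is a slightly extended sketch along the same lines. It assumes the result objects use the keys from your question (Title, Img, Px); the function names buildTable and renderCell, the sample data, and the width and border values are made up for illustration.

```javascript
// Escape the characters that are special in HTML text and attribute values.
function escapeHtml(value) {
    return String(value == null ? '' : value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Render one cell; the 'Img' column (assumed key name) becomes an <img> tag
// so the picture is displayed instead of its URL.
function renderCell(key, value) {
    if (key === 'Img') {
        return `<td><img src="${escapeHtml(value)}" alt="" width="120"></td>`;
    }
    return `<td>${escapeHtml(value)}</td>`;
}

function buildTable(results) {
    // Collect the column names from the keys of all result objects.
    const col = [];
    for (const row of results) {
        for (const key of Object.keys(row)) {
            if (!col.includes(key)) col.push(key);
        }
    }
    const header = col.map(c => `<th>${escapeHtml(c)}</th>`).join('');
    const rows = results
        .map(row => '<tr>' + col.map(c => renderCell(c, row[c])).join('') + '</tr>')
        .join('\n');
    return `<table border="1">\n<tr>${header}</tr>\n${rows}\n</table>`;
}

// Made-up sample data in the same shape as the scraper's results.
const sample = [{ Title: 'Old <camera>', Img: 'https://example.com/a.jpg', Px: 'S$50' }];
console.log(buildTable(sample));
```

Keeping the column detection inside buildTable means col is always defined where the table is built, and the returned string can be passed straight to fs.writeFile.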

Sending the document via mail

Instead of saving the HTML locally, you can also directly send it via mail by using nodemailer. Here is a code sample to get you started, but you might want to check out the nodemailer website for more information.

await transporter.sendMail({
    /* ... */
    html: 'Full HTML document.... ' + htmlTable + ' ...'
});
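To show how the pieces could fit together, here is a rough sketch that wraps the table in a full HTML document and hands it to a transporter. The function names buildMailOptions and sendReport, the addresses, and the subject line are placeholders I made up. The demo uses a stand-in object instead of a real transporter so that it runs without an SMTP account.

```javascript
// Wrap the table in a minimal HTML document for the mail body.
// Sender address and subject are placeholders.
function buildMailOptions(htmlTable, to) {
    return {
        from: 'scraper@example.com',
        to,
        subject: 'Daily camera listings',
        html: `<!DOCTYPE html><html><body>${htmlTable}</body></html>`
    };
}

// `transporter` is expected to be an object with a sendMail method,
// such as the one returned by nodemailer.createTransport(...).
async function sendReport(transporter, htmlTable, to) {
    return transporter.sendMail(buildMailOptions(htmlTable, to));
}

// Stand-in transporter for this demo only; it sends nothing.
const fakeTransporter = {
    sendMail: async (options) => ({ accepted: [options.to] })
};

sendReport(fakeTransporter, '<table></table>', 'me@example.com')
    .then(info => console.log('Accepted:', info.accepted));
```

With the real library you would create the transporter via nodemailer.createTransport using your own SMTP host and credentials and pass it in place of fakeTransporter.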
Thomas Dondorf
  • Thanks Thomas, appreciate the answer. I tried it and got this error instead: `Error: WebSocket is not open: readyState 3 (CLOSED)` followed by `(node:839) UnhandledPromiseRejectionWarning: ReferenceError: col is not defined` in reference to the `col.map`. I get your point about there being no DOM in Node. col should work though. Oh, and what's the term for the HTML table code? I would like to search and read up on it. Thanks! – Thinkerer May 21 '19 at 15:16
  • @Thinkerer Created a gist with the [full code thrown together](https://gist.github.com/thomasdondorf/bbc458143c88a937ef5caf5047203487). What do you mean by "term for the html table code"? It's just HTML code merged together ;) – Thomas Dondorf May 21 '19 at 16:30
  • Thanks @Thomas, it works. I meant where can I read up more on customizing the table? Like displaying the image (from a URL, with a hyperlink to the URL), outlining the table, or widening the second column? It's different from the usual HTML. – Thinkerer May 22 '19 at 15:23
  • @Thinkerer This is just normal HTML ;) You could write `` to apply CSS for example. I just concatenated all HTML strings together. – Thomas Dondorf May 22 '19 at 16:09