I am building a site. Part of this site is a Python script, triggered by the user in the UI, which runs on the server and produces HTML files. I have been reading the HTML with the Node server and then sending the files in their entirety to the client to render as a document. These HTML files follow a standard format; the only thing that changes is the data in the tables displayed.

This is obviously inefficient, as I'm sending a ton of HTML that doesn't need to be sent. I really just need to send an object with some data in it. I've created a view that is rendered, and then that page sends an Ajax request for the rest of the information.

Question: I'd like to parse the HTML files server side, extract the information from the HTML itself, and then use that to build an object to send off. Can I create a document object within the node environment? Is there anything that will read and understand the HTML so that I can parse through it quickly, similar to jQuery? Or can I simply load up jQuery within node, and if so, how?

For various reasons I cannot (or simply don't want to) change the output format of the script, I would like to build this python >> html >> node >> client pipeline.

jsarbour

2 Answers

Is there anything that will read and understand the HTML so that I can parse through it quickly, similar to jQuery? Or can I simply load up jQuery within node, and if so, how?

Yes. You can try using cheerio or jsdom. The documentation for both should be straightforward.

But as you mentioned, if you just need to send an object with some data in it, then why parse HTML? Why not just send the request to the server and get the object back as the result?

JohnSnow
  • I'll be more clear; the client sends a key to the server. The server uses this key to execute a script, generating 5-20 key/value pairs. These key/value pairs need to be sent back. Currently they are being placed into an HTML document in a table, and I'd like to extract them. – jsarbour Jul 17 '19 at 18:33
  • I don't understand why it's being placed in an HTML table if you don't need to do that. The client sends the key and you generate the key/value pairs on the server, so why not just send the key/value pairs to the front end and render them the way you need? – JohnSnow Jul 17 '19 at 18:43
  • "For various reasons I cannot (or simply don't want to) change the output format of the script, I would like to build this python >> html >> node >> client pipeline." – jsarbour Jul 17 '19 at 18:46
  • Then yea, cheerio should help you out – JohnSnow Jul 18 '19 at 00:08

If you want jQuery, cheerio is your friend. From the docs:

Cheerio implements a subset of core jQuery. Cheerio removes all the DOM inconsistencies and browser cruft from the jQuery library, revealing its truly gorgeous API.

You can use cheerio much as you would jQuery:

const cheerio = require('cheerio')
const $ = cheerio.load('<h2 class="title">Hello world</h2>')

$('h2.title').text('Hello there!')
$('h2').addClass('welcome')

$.html()
//=> <h2 class="title welcome">Hello there!</h2>
Aritra Chakraborty