I am very new to JavaScript, and my job has tasked me with making an HTML parser that can go through the lines of an HTML file, find, say, an ID tag, match it against an Excel sheet (or CSV), and then swap the ID for a value from the spreadsheet / CSV file.

NOTE: I am not asking you to do it for me. I have looked through loads of parsers, but I don't know which one I need. A point in the right direction would be great, thanks.

Here is an example of an HTML document with IDs. (The IDs are wrapped as #IDHERE# because, without a parser yet, I don't know how they should be marked.)

<html>
  <head>
    //Header Data Here
  </head>
  <body>
    <h1>#ID_MainTitle#</h1>
    <p>#ID_Para1#</p>
  </body>
</html>

Here is a table (Lua) (could be excel etc, but just for an example):

{
  ["ID_MainTitle"] = "Hello World",
  ["ID_Para1"] = "This is a test!",
}

This would need to be the end result:

<html>
  <head>
    //Header Data Here
  </head>
  <body>
    <h1>Hello World</h1>
    <p>This is a test!</p>
  </body>
</html>

I know that's not much to go on. I did have a look, but none of the parsers I found looked remotely like what I need.

halfer
Dahknee
  • I wouldn't build an actual parser. Maybe you can just create a DIV element, put the innerHTML of the document in it, and traverse the DOM to find the elements. Writing an actual parser isn't very easy. – GolezTrol Dec 04 '14 at 12:11
  • [What have you tried](http://mattgemmell.com/what-have-you-tried/) so far? I agree with @GolezTrol, you don't need to write a parser, you should just process the DOM tree. – lexicore Dec 04 '14 at 12:14
  • I am not sure what you mean. The HTML document will be a master page; the IDs will be swapped for values, and this will spit out many different versions of the master using values from many tables. I am just trying to start off small. – Dahknee Dec 04 '14 at 12:15
  • This is called templating, and there's a zillion libraries for it; check out mustache, for example. – simonzack Dec 04 '14 at 12:15
  • @lexicore I have looked at many types, but whatever I try, I just get confused. To be honest, I am not a JavaScript programmer, so I am trying to teach myself. I have the basics, but this is a little beyond me and that's why I came here. I (honestly) don't know where to start. – Dahknee Dec 04 '14 at 12:16
  • @torazaburo no I looked, not what I need. – Dahknee Dec 04 '14 at 12:57
  • @torazaburo Very rude. It's a project to keep me busy. – Dahknee Dec 04 '14 at 13:05
  • By the way, someone already wrote a parser. It's called DOMParser. See https://developer.mozilla.org/en-US/docs/Web/API/DOMParser. –  Dec 04 '14 at 13:05

3 Answers

2

As one commenter said, you've re-invented the notion of HTML templates. People have already written dozens if not hundreds of templating engines.

To do this, you do not need to "parse HTML". You essentially view the HTML as a big string with magic placeholders (in your case, things between #), and the template engine is essentially a kind of macro processor which does string replacement. You can do anything you want with the resulting, interpolated HTML: save it to a file, or send it down to the browser, or if you are already in the browser stick it in the DOM.
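
To make the "macro processor" idea concrete, here is a minimal sketch that handles the question's own #ID# placeholders with a plain regular expression instead of a template library. The function name fillTemplate and its parameters are hypothetical, and the sketch assumes placeholder names consist only of letters, digits and underscores. It does no HTML escaping of the inserted values.

```javascript
// Minimal placeholder substitution: every #KEY# in the template is replaced
// by values[KEY]. fillTemplate and its parameter names are hypothetical.
function fillTemplate(templateSource, values) {
  return templateSource.replace(/#(\w+)#/g, function (match, key) {
    // Leave unknown placeholders untouched so missing data is easy to spot.
    return Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : match;
  });
}

var sampleTemplate = '<h1>#ID_MainTitle#</h1>\n<p>#ID_Para1#</p>';
var sampleValues = { ID_MainTitle: 'Hello World', ID_Para1: 'This is a test!' };
var filled = fillTemplate(sampleTemplate, sampleValues);
console.log(filled);
```

A real template engine builds on the same idea and adds things like escaping, loops and conditionals, which is why using an existing one is usually the better choice.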

Write your template like so (let's say this is in a string called templateSource):

<html>
  <head>
    //Header Data Here
  </head>
  <body>
    <h1>{{ID_MainTitle}}</h1>
    <p>{{ID_Para1}}</p>
  </body>
</html>

Define your values for interpolation:

var data = {
  ID_MainTitle: "Hello World",
  ID_Para1: "This is a test!"
};

Then compile the template and run it:

var template = Handlebars.compile(templateSource);
var result = template(data);

Your resulting, interpolated HTML will be in result.

There are minor differences depending on whether you want to do this on a server like Node or in the browser.

These days, JS programmers do less "programming", and more calling of APIs. A key skill is to find the thing that has already been written and figure out how to use it and glue it together with other components.

0

Would your script run in the browser or on the server? If it is running in the browser, do you need to parse the same page or an external HTML document? If it is the same page, I wouldn't write a parser but rather process the DOM, i.e. enclose the text to be replaced in DIV or SPAN tags with specific IDs, then look them up in the DOM and replace their contents.

If your script should process external HTML, you can still use the approach above. There is a solution for that at Parse a HTML String with JS.

If it is on the server, then use Node and the html-parser module.

Community
Zoltan Magyar
  • I am going to use Node – Dahknee Dec 04 '14 at 12:20
  • Then take a look at the node module here: https://www.npmjs.org/package/html-parser – Zoltan Magyar Dec 04 '14 at 12:21
  • I looked at this, but this is where I got confused: how would I get it to read an HTML file and an Excel sheet, match the IDs with, say, the first column of the Excel sheet, and then swap the IDs for the values in the second column (using a CSV file instead if needed)? I am just hopeless – Dahknee Dec 04 '14 at 12:24
  • To open a file: `fs = require('fs'); fs.readFile('/yourpath/your.html', 'utf8', function (err,data) { if (err) { return console.log(err); } console.log('Content: ' + data); }); ` There are modules to deal with excel, e.g. https://www.npmjs.org/package/excel-parser but I would rather use CSV as it is more simple to process. Don't forget, google is your friend :) – Zoltan Magyar Dec 04 '14 at 12:30
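
As a rough illustration of the CSV route suggested in the comment above, the sketch below turns two-column CSV text (ID in the first column, replacement value in the second) into a lookup object. The helper name parseCsvToLookup is hypothetical, and the sketch assumes a very simple CSV with no quoted fields and no header row; a real CSV module would be more robust.

```javascript
// Turn two-column CSV text ("ID,Content" per line) into a lookup object.
// parseCsvToLookup is a hypothetical helper; it assumes no quoted fields.
function parseCsvToLookup(csvText) {
  var lookup = {};
  csvText.split(/\r?\n/).forEach(function (line) {
    if (line.trim() === '') return;        // skip blank lines
    var comma = line.indexOf(',');
    if (comma === -1) return;              // skip lines without a second column
    var id = line.slice(0, comma).trim();
    // Everything after the first comma is treated as the value.
    lookup[id] = line.slice(comma + 1).trim();
  });
  return lookup;
}

var sampleCsv = 'ID_MainTitle,Hello World\nID_Para1,This is a test!\n';
var csvLookup = parseCsvToLookup(sampleCsv);
console.log(csvLookup.ID_MainTitle);
```

In Node, the CSV text could come from the same fs.readFile call shown in the comment, and the resulting object can then be used as the table of replacement values for the HTML.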
0

You could do something like:

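(function () {
    // Loop over every entry, including the last one.
    for (var i = 0; i < YourData.length; i++) {
        // The DOM property is innerHTML (it is case-sensitive).
        document.getElementById(YourData[i].ID).innerHTML = YourData[i].Content;
    }
})();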

Obviously for the above to work your data needs to reside in an array where each element has the necessary values (ID and Content in the above example).

This will work for simple HTML tags, but you will run into problems with (for example) nested DIVs. Anyway, it may give you an idea of how you could do this in JavaScript.

Kristian