Is there a way to use DOM methods such as getElementById("id") or getElementsByTagName("tag") on a file that is selected by the user?

My program has the user select a set of HTML files that are formatted in a specific way that is somewhat difficult to search through when treated like one big string, but would be very easy if I could use DOM methods.

I am currently using the HTML5 File API to get a FileList, and a FileReader object to read each file into a string.

Is using DOM methods possible, or alternatively, is there a better way to parse selected HTML files? A library like JSoup would be helpful; is there something similar in JavaScript?

Ian
Langston
  • So you want to parse HTML on the server? What server side language are you using? or do you want to parse html read on the client side? (i.e. not uploaded) – Alex K. Aug 09 '13 at 17:02
  • @AlexK. `is there something similar in JavaScript?` – Ian Aug 09 '13 at 17:02
  • By "uploaded" do you mean "Selected in a file element", because that is what it sounds like you are describing, and it is happening on the client before any uploading takes place. – Quentin Aug 09 '13 at 17:02
  • Yes, thank you, it is being selected in a File element, not uploaded. My mistake. – Langston Aug 09 '13 at 17:04
  • As in; [Converting HTML string into DOM elements?](http://stackoverflow.com/questions/3103962/converting-html-string-into-dom-elements) – Alex K. Aug 09 '13 at 17:05
  • It would be great if you could use https://developer.mozilla.org/en-US/docs/Web/API/DOMParser but it doesn't seem to have good support for HTML – Ian Aug 09 '13 at 17:06
  • That's the way to do it then as far as I can see, what's the objection? – Alex K. Aug 09 '13 at 17:09
  • Could someone provide me an example of how converting the HTML document into a mass of DOM elements could help me to use methods like getElementById("id")? Some example code would be a great help. – Langston Aug 09 '13 at 17:12
  • No idea if this would work, but could you use JavaScript to create an iFrame, then inject the html into the iFrame using .innerHTML and use the DOM functions to manipulate? – Joshua Dwire Aug 09 '13 at 17:16
  • @Ian Although it doesn't provide universal support, the DOMParser does exactly what I need it to. I don't have enough reputation to upvote your comment but if you submit it as an answer I will certainly accept it as correct. – Langston Aug 09 '13 at 17:31
  • @BrianB. Sure thing, just added :) – Ian Aug 09 '13 at 17:39

2 Answers

0

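If you have the full text of the file, you can load it into a div (for example, by assigning it to the div's innerHTML), which causes the browser to build DOM nodes for the markup. You can then search that subtree with DOM methods such as getElementsByTagName or querySelector. Note that getElementById is defined on documents, not on elements, so on the div you would use querySelector("#id") instead.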
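
Here is a minimal sketch of the div approach, in case example code helps. The function name `htmlToContainer` and its optional `doc` parameter are made up for illustration; they are not part of any API. Be aware that when markup is parsed inside a div, the `<html>`, `<head>` and `<body>` tags themselves are dropped and only their contents end up in the container.

```javascript
// Hypothetical helper: turn an HTML string into DOM nodes by assigning it
// to the innerHTML of a div that is never attached to the page.
// `doc` defaults to the page's own document; it is a parameter only so the
// sketch can be exercised outside a browser.
function htmlToContainer(html, doc) {
    doc = doc || document;
    var container = doc.createElement("div");
    container.innerHTML = html; // the browser parses the markup here
    return container;
}

// Usage in a browser:
// var container = htmlToContainer("<div id='div1'>asdf</div><p>one</p>");
// container.querySelector("#div1").textContent;  // "asdf"
// container.getElementsByTagName("p").length;    // 1
```

Since the container is an element rather than a document, lookups by id go through `querySelector`.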

0

According to MDN, browser support for DOMParser with text/html as the type isn't good at the moment. However, it seems to parse HTML fine with application/xml instead, provided the markup is well-formed XML (every tag closed, attribute values quoted). Here's an example:

// Parse the HTML string with the XML parser
var parser = new DOMParser(),
    str = "<!DOCTYPE html><html><head><title>Test</title></head><body><div id='div1'>asdf</div></body></html>",
    doc = parser.parseFromString(str, "application/xml");

// The result is a Document, so document-level DOM methods are available
console.log(doc.getElementById("div1").textContent);

DEMO: http://jsfiddle.net/8AzBp/1/
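
To tie this back to the FileReader setup described in the question, something along these lines might work. It is only a sketch: `parseSelectedFile` is a made-up helper name, the `picker` id is an assumed `<input type="file" multiple>` on the page, and the usage passes "text/html" on the assumption that the browser supports that type (pass "application/xml" instead for well-formed markup, as in the example above).

```javascript
// Hypothetical helper (not part of any library): read one user-selected File
// as text, parse it with DOMParser, and pass the resulting Document on.
function parseSelectedFile(file, mimeType, onParsed, onError) {
    var reader = new FileReader();
    reader.onload = function () {
        // After readAsText completes, reader.result holds the file contents
        var doc = new DOMParser().parseFromString(reader.result, mimeType);
        onParsed(doc, file);
    };
    reader.onerror = function () {
        onError(reader.error, file);
    };
    reader.readAsText(file);
}

// Usage in a browser. The id "picker" is an assumption for this example.
if (typeof document !== "undefined") {
    var picker = document.getElementById("picker");
    if (picker) {
        picker.addEventListener("change", function () {
            for (var i = 0; i < picker.files.length; i++) {
                parseSelectedFile(picker.files[i], "text/html", function (doc, file) {
                    // Any document-level DOM method can be used on doc here
                    console.log(file.name, doc.getElementsByTagName("div").length);
                }, function (err, file) {
                    console.error("Could not read " + file.name, err);
                });
            }
        });
    }
}
```

Reading is asynchronous, so each file's Document only becomes available inside its own callback.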

Ian