Code that will alert DOM Position

Question

Possible Duplicate:
JQuery to check for duplicate ids in a DOM

Suppose I have the following markup:

<div id="one">
    <div id="two"></div>
    <div id="three"></div>
</div>

<div id="four">
    <div id="two"></div>
</div>

<div id="one">
  <p id="five">
    <span id="three"></span>
  </p>
</div>

(part of a large HTML document with many different DOM elements).

Objective:

Is it possible to write jQuery or plain JavaScript code that will alert me to duplicate ids within the document, along with their position? Here, position means something like the following:

> duplicate id: 'div#two' > within `div#four`, `div#one` 
> duplicate id: 'div#one' > parent of `p#five`
> duplicate id: 'span#three' > within `p#five`

and so on, following that pattern.

Note:

I found a similar question, but it is not exactly the same as mine. This is not a duplicate of any question asked before, so please don't close it.

What for? Why JavaScript? *Adding* JavaScript to this page to find the duplicates is not really useful IMO. You can also use the W3C validator to find duplicate IDs: http://validator.w3.org/check — Felix Kling, Jun 30 '11 at 10:43
Line number probably not - I doubt you can get that from the browser (unless innerHTML preserves whitespace from the original and you parse it out of the document - probably not worth it even if it is possible) - but position in the DOM should be achievable. — Rup, Jun 30 '11 at 10:43
@Felix Kling I know it is possible using a text editor, but I would be very pleased if I could do it with jQuery or JavaScript. So I would like the experts' help. — thecodeparadox, Jun 30 '11 at 10:47
@bazmegakapa It's not a duplicate of that. That will alert only the ids, but I also want the line number or position. So please don't vote to close. — thecodeparadox, Jun 30 '11 at 10:49
@abdullah Using a validator will give you all the line numbers, besides finding other errors as well. — kapa, Jun 30 '11 at 10:53
@bazmegakapa I know, but I want to find out whether it is possible using jQuery or JavaScript, just out of eagerness. So I am asking for your expert help as a learner. — thecodeparadox, Jun 30 '11 at 11:02
@Felix I'm another vote for not using JavaScript. If OP wants it as a debugging tool, he should be looking into add-ins and extensions (like the solid [Validity](https://chrome.google.com/webstore/detail/bbicmjjbohdfglopkidebfccilipgeif) for Chrome). Also, the hard part of what OP is asking is line numbers. It's doable with regex, but not really reasonable. — brymck, Jun 30 '11 at 11:05
@Bryan Thanks for your comment to Felix. You mention `regex` in your comment, and that gives me a clue. Thanks again. — thecodeparadox, Jun 30 '11 at 11:15
@bazmegakapa — more of a zombie idea, it smells bad, tries to bite you, and yet people keep bringing it back. — Quentin, Jun 30 '11 at 13:13
@bazmegakapa I changed my question. Is it possible now? Please... — thecodeparadox, Jun 30 '11 at 13:44
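
As Rup's comment suggests, position in the DOM (as opposed to line number) is reachable from script. The following is only a rough sketch of that idea, not a validator replacement. The names `describe` and `findDuplicateIds` are made up for illustration, and the function assumes it is handed element-like objects that expose `id`, `tagName` and `parentElement`:

```javascript
// Describe an element as "tag#id", e.g. "div#two" (hypothetical helper).
function describe(el) {
  return el.tagName.toLowerCase() + (el.id ? "#" + el.id : "");
}

// Group elements by id and report every id that occurs more than once,
// naming the parent of each occurrence. `elements` is any iterable of
// element-like objects exposing `id`, `tagName` and `parentElement`
// (an assumption of this sketch, not a jQuery API).
function findDuplicateIds(elements) {
  const byId = new Map();
  for (const el of elements) {
    if (!byId.has(el.id)) {
      byId.set(el.id, []);
    }
    byId.get(el.id).push(el);
  }

  const report = [];
  byId.forEach((els, id) => {
    if (els.length < 2) {
      return; // this id is unique, nothing to report
    }
    const places = els.map((el) => {
      const parent = el.parentElement
        ? describe(el.parentElement)
        : "(no parent)";
      return describe(el) + " within " + parent;
    });
    report.push("duplicate id '" + id + "': " + places.join(", "));
  });
  return report;
}
```

In a browser it could be called as `findDuplicateIds(document.querySelectorAll("[id]")).forEach(function (line) { console.log(line); });`. It only sees the DOM the browser built, so it says nothing about line numbers in the source file.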

jsonnull · Answer 1 · 2012-02-09T01:35:31.273

4

If all you want to do is root out duplicate ids, you should validate your HTML.

http://validator.w3.org/

This will alert you to duplicate ids and make sure your code is well formed.

edited Feb 09 '12 at 01:35

answered Jun 30 '11 at 10:44


Hi friend, I know, but it's not a matter of validation, because I already know it's not valid due to the id duplication. I want to do this using jQuery or JavaScript, and I'd welcome all your suggestions on what I asked. – thecodeparadox Jun 30 '11 at 10:53
Actually there is a Validator API, never played with it: http://validator.w3.org/docs/api.html – kapa Jun 30 '11 at 13:58

brymck · Accepted Answer · 2011-06-30T13:30:36.767

NOTE: Read all caveats. The point of this code is to illustrate the nature of the problem, which is that a pure JS solution is inadvisable.

First of all, hopefully what this illustrates is that things that are doable are not always advisable. There are a ton of awesome tools out there that will provide far better error checking, like W3C's validator or add-ins/extensions that utilize it, like Validity for Chrome. Definitely use those.

But anyway, here's a minimalist example. Note that no DOM node holds a reference to its own line number, so you have to get the entire innerHTML property of the documentElement as a string. You match parts of that string, take the substring up to the match position, and then count the number of newlines in it. Obviously, this code could be extensively refactored, but I think the point is clear (also jsFiddle example for those who want it, although the lines will be fubar):

EDIT

I've updated the regex to not match examples like <div>id="a"</div>. Still, if the OP wants something pure JS, he'll have to rely on this or a considerably more complex version with very minor benefits. The bottom line is that there are no associations between DOM nodes and line numbers. You will have to, on your own, figure out where the ID attributes are and then trace them back to their position. This is extremely error-prone. It might make some sense as programming practice but is extremely inadvisable in the real world. The best solution -- which I'm reiterating for the fourth time here -- is an extension or add-in that will just send your page on to a real validator like the W3C's.

The code below is designed to "just work," because there is no good way to do what the OP is asking.

<!DOCTYPE HTML>
<html>
  <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8">
    <title>Test</title>
  </head>
  <body>
    <div id="a"></div>
    <div id="a"></div> <!-- catches this -->
    <div id="b"></div>
    <div>id="a"</div>
    <div id="c"></div>
    <div id="c"></div> <!-- catches this -->
    <span>[id="a"]</span>
    <script>
    var re = /<[^>]+\sid="(.*?)"[^>]*>/g; // match a tag carrying id="..."
    var nl = /\r?\n/g;                    // match LF or CRLF newlines
    var text = document.documentElement.innerHTML;
    var match;
    var ids = {};                         // IDs seen so far
    while ((match = re.exec(text)) !== null) {
      var id = match[1];

      // Check whether this ID was seen before; hasOwnProperty avoids
      // false hits on inherited names such as "constructor"
      if (Object.prototype.hasOwnProperty.call(ids, id)) {
        // match.index is the position of the tag in innerHTML. Count the
        // newlines before it (match() returns null when there are none),
        // then add an offset of 3 -- one for the doctype, one for <html>,
        // and one for the current line
        var newlines = text.substring(0, match.index).match(nl) || [];
        console.log("duplicate match at line " + (newlines.length + 3));
      } else {
        // Remember the ID if it is new
        ids[id] = true;
      }
    }
    </script>
  </body>
</html>

This just seems to work; regex is just not the right tool to use when you're parsing HTML. For example: `<div>id="a"</div>` certainly should not be caught as an error, just like ``. — kapa, Jun 30 '11 at 12:02
@baz Really, -1? I'm pretty sure **the entire point of my answer** was to show how ugly you have to get to extract line numbers from DOM nodes. Anyway, I updated the regex to account for outliers that will likely never crop up in OP's code. Of course, it will still fail on `` or something like that. — brymck, Jun 30 '11 at 12:17
@Bryan [Read this](http://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-rege) You can never create a HTML parser with regex that will not have problems. Impossible. Though there are HTML parsers [available](http://ejohn.org/blog/pure-javascript-html-parser/) that might be better suited. My downvote was for the regex HTML parsing, it will never work. — kapa, Jun 30 '11 at 13:07
@baz I am **100% aware** that you cannot parse HTML with regex with 100% accuracy. But a) the point is not to recreate the DOM, but find one particular attribute with **decent** accuracy and calculate newlines. And b), most obviously, **I gave the OP caveats and told him to use something else**. "There are a ton of awesome tools out there that will provide far better error checking... Definitely use those. ... Obviously this code could be extensively refactored." Absolutely nowhere did I say this was the best solution or even a good one, merely that it was closest thing resembling an answer. — brymck, Jun 30 '11 at 13:24
@Bryan You don't *have to use regex to get line numbers* as you state. As you said, your answer is about the caveats and about using something else. Even so, you give him *"working"* code, which all of us know he or someone else *will* use. Nothing personal with the downvote, but I still think this is a bad answer. — kapa, Jun 30 '11 at 13:33
+1 for full mention of how bad an idea this is and doing it anyway — Radu, Jun 30 '11 at 13:35
@baz @Radu The other solution is `split`ting `innerHTML` on `"\n"`, but that's not really the major problem here as the regex doesn't do anything terribly different. — brymck, Jun 30 '11 at 13:44
@Radu If I had time I would do some research. Thank God regex is not the only tool of a programmer. — kapa, Jun 30 '11 at 13:48
@Bryan If you don't use regex, you certainly have less problems :). — kapa, Jun 30 '11 at 13:49
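
For what it's worth, the `split` variant brymck mentions could look roughly like the sketch below. It is an illustration under stated assumptions, not a recommendation: `duplicateIdLines` is a made-up name, the function works on any HTML string passed to it, it still relies on a regex to spot `id="..."` inside a tag, and it will miss tags that span several lines.

```javascript
// Return [{ id, line }] for every repeated id="..." found in `source`.
// Line numbers are 1-based and relative to the string that was passed in.
// (Hypothetical helper: a sketch of the split-on-newline idea.)
function duplicateIdLines(source) {
  // An id attribute inside a tag; the \s keeps "data-id" from matching
  const idInTag = /<[^>]+\sid="([^"]*)"/g;
  const seen = new Set();
  const duplicates = [];

  source.split(/\r?\n/).forEach((lineText, i) => {
    // matchAll works on a copy of the regex, so no lastIndex bookkeeping
    for (const match of lineText.matchAll(idInTag)) {
      const id = match[1];
      if (seen.has(id)) {
        duplicates.push({ id: id, line: i + 1 });
      } else {
        seen.add(id);
      }
    }
  });
  return duplicates;
}
```

Called as `duplicateIdLines(document.documentElement.innerHTML)`, the line numbers are relative to the serialized `innerHTML`, so the same offset caveat as in the answer applies. Passing in the page source as fetched text (for example via `fetch(location.href)`) should give numbers that match the file, assuming the server returns the same markup.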
