I am trying to count the opening `<tr` tags and compare that with the number of closing `/tr>` tags, to check for template errors in my HTML generation. My code is pretty simple:

var markup = document.documentElement.innerHTML; //to have it as a string
var trstart_results = (markup.match(/<tr/g) || []).length;
var trend_results = (markup.match(/\/tr>/g) || []).length;

The problem is that my HTML source contains 95 occurrences of `<tr` and 97 of `/tr>`, but console.log reports 97 for both.

Does anyone know what is wrong with this code?

mx0
  • 4
    [TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) ---- don't use regex to parse HTML, if you have opening tags, you surely have closing tags, so just count the elements instead – adeneo Aug 25 '17 at 18:12
  • 1
    Why don't you just traverse the DOM to the tr element, and just take the length of its outer HTML? – Tezra Aug 25 '17 at 18:14
  • Well, if the number of opening TRs differs from the number of closing TRs, it means somebody messed up the HTML template :-) And that is exactly what this algo is for. – Jaroslav Huss Aug 25 '17 at 18:19
  • oh, I understand, I think this is a language based miscommunication – Will Barnwell Aug 25 '17 at 18:22
  • If my edit clarification is correct, this is actually a fine use of regex in regards to markup language, OP isn't trying to parse arbitrary ml, just count occurrences of a pattern. – Will Barnwell Aug 25 '17 at 18:27
  • OP, what do you mean by html code? Can you reproduce this bug with a table with 3 rows? Or provide an example giving a result which is incorrect in your opinion? – Will Barnwell Aug 25 '17 at 18:30
  • Exactly as @WillBarnwell said - I am trying to get the real count of "TR" and the real count of "/TR"... If the two counts do not match, there is a problem in the code that needs to be solved. – Jaroslav Huss Aug 25 '17 at 18:30
  • Please provide some html which gives incorrect results – Will Barnwell Aug 25 '17 at 18:32
  • But you're using `innerHTML`, if the HTML isn't valid, and there's missing closing tags, it's a bit late to start validating and editing with clientside code. – adeneo Aug 25 '17 at 18:37
  • without the html content, it is very hard to predict what's the issue. Your regex looks fine to me – Sagar V Aug 25 '17 at 19:07

3 Answers

Your code is absolutely fine, but the browser inserted the missing HTML tags, so `innerHTML` returns the repaired markup rather than your original source. See this Stack Overflow question:

Do browsers automatically insert missing HTML tags?
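
One way to look at the markup before the browser repairs it is to count the tags in the raw response text instead of in `innerHTML`. The following is only a minimal sketch: `countTrTags` and `checkCurrentPage` are hypothetical names, and it assumes that re-requesting `window.location.href` with `fetch` returns the same HTML the server originally sent, which may not hold for pages produced by a POST or by per-request state.

```javascript
// Hypothetical helper: counts opening and closing tr tags in a plain string.
// \b keeps "<tr" from matching other tags such as "<track".
function countTrTags(source) {
  const open = (source.match(/<tr\b/gi) || []).length;
  const close = (source.match(/<\/tr\s*>/gi) || []).length;
  return { open, close, balanced: open === close };
}

// Hypothetical browser-only helper: fetches the unparsed source of the
// current page (assumption: the server returns the same markup again).
async function checkCurrentPage() {
  const response = await fetch(window.location.href);
  const source = await response.text();
  const result = countTrTags(source);
  console.log(result);
  return result;
}
```

Calling `checkCurrentPage()` from the browser console would log the counts for the unparsed source. Note that the patterns also count any `<tr` or `</tr>` text that appears inside comments or scripts.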

ermacmkx

I made a library that does something like this. I was dealing with badly formatted HTML that even html parsers couldn't properly handle. But some tags were always correct.

https://github.com/icodeforlove/balanced.js

balanced.matches({
    source: document.documentElement.innerHTML,
    head: /<tr[^>]*>/,
    open: /<tr[^>]*>/,
    close: '</tr>'
});
Chad Cache

After a lot of tries I finally found a solution that worked for me :-). First of all, I need to get the whole document as a string:

var markup = document.documentElement.outerHTML;

But this is not enough, since the browser corrects your code. So you need to split this string into an array:

var html_arr = markup.split(" ");

Now we can work with the whole array without any browser limitation. So my final code is:

var markup = document.documentElement.outerHTML; // whole document as a string
var html_arr = markup.split(" ");                // split on spaces into tokens
var trcounter = 0;
var trendcounter = 0;
for (var i = 0; i < html_arr.length; i++) {
    // Each token is counted at most once, as either an opening or a closing tag.
    if (/<tr/.test(html_arr[i])) {
        trcounter += 1;
    } else if (/<\/tr>/.test(html_arr[i])) {
        trendcounter += 1;
    }
}
console.log(trcounter);
console.log(trendcounter);

This code is wild: it works on an array of over 8000 entries, so on a very large HTML document it takes around 3 seconds to execute.
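
One caveat with the token-based loop is that each space-separated token bumps at most one counter, so several tags packed into one token are undercounted. Here is a rough sketch comparing it with counting every match in the whole string; `countPerToken` and `countAllMatches` are hypothetical helper names, and the sample string is made up for illustration:

```javascript
// Same idea as the loop above: at most one increment per space-separated token.
function countPerToken(markup) {
  let open = 0;
  let close = 0;
  for (const token of markup.split(" ")) {
    if (/<tr/.test(token)) {
      open += 1;
    } else if (/<\/tr>/.test(token)) {
      close += 1;
    }
  }
  return { open, close };
}

// Counts every occurrence in the whole string, with no splitting.
function countAllMatches(markup) {
  return {
    open: (markup.match(/<tr\b/gi) || []).length,
    close: (markup.match(/<\/tr\s*>/gi) || []).length,
  };
}

// Made-up sample: two rows and only one space in the whole string.
const sample = "<table><tr><td>a</td></tr><tr><td>b c</td></tr></table>";
console.log(countPerToken(sample));   // open: 1, close: 1
console.log(countAllMatches(sample)); // open: 2, close: 2
```

Both functions only see whatever string they are given, so on `outerHTML` they still count the markup as the browser serialized it.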

Anyway, I am so happy I solved this, and also very happy I could share the results with you :-)

Best of luck, guys!