I want to count the number of characters in a string that I pull from a div tag like this:

$(".welcome-msg").html().length;

However, it counts HTML comments inside the tag as characters. So when I would like the result to be 0, I get 99 because of those comments, and I have no way of telling whether the comments are dynamic. Is there an easy way to make sure comments are not counted, or do I have to write a regular expression for that?

Thanks,

bobek
  • 3
    Have you tried using `text()` instead? Or do you need the string to include the html elements and such..? – David Thomas Apr 09 '12 at 16:29
  • No, there's not an easy way to do that. You need an HTML parser. You can't use regular expressions to parse HTML because HTML is not a regular language. – Dan A. Apr 09 '12 at 16:30
  • Thanks David - it works. @Dan A. Yes I can - I am using jquery and html is just a string in this context. – bobek Apr 09 '12 at 16:33
  • 1
    David brings up a good point, if you want to strip any HTML tags, text() would be an easy way to do it. – Dan A. Apr 09 '12 at 16:33
  • @bobek, for the record: it is not a matter of HTML being a string or not. – Alexander Apr 09 '12 at 16:37
  • @bobek are you looking for actual text length or html length without comments? – iambriansreed Apr 09 '12 at 16:41

2 Answers

4

You can filter comments out, but it takes some work. I will show you how to filter them at the first level, which is easy; if they are nested within other tags, you need additional logic.

The key is to use .contents() to get all the nodes within. This includes comment nodes. Then you can filter the comment nodes out by comparing against nodeType (comment nodes have nodeType 8).

So it would be something like this:

$(".welcome-msg").contents().filter(function() { 
  return this.nodeType != 8;
}).appendTo("<div>").parent().html();

That will work for

<div class="welcome-msg">
   <!--Comment --><span>hello</span>
</div>

But not for

<div class="welcome-msg">
    <span><!--Comment -->hello </span> world 
</div>

You would need to iterate through all tags recursively and then it will work for everything.

With regular expressions you would need to be careful about <script> tags and <style> tags.
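
To make the <script> caveat concrete, here is a minimal sketch in plain JavaScript that works on strings only (no jQuery or DOM needed). The helper name stripCommentsNaive and both sample strings are made up for illustration; they are not from the original answer.

```javascript
// Hypothetical helper: removes anything that *looks* like an HTML comment.
function stripCommentsNaive(html) {
  // [\s\S] matches any character, including newlines, so no `s` flag is needed.
  return html.replace(/<!--[\s\S]*?-->/g, "");
}

// An ordinary comment is removed as expected.
const simple = "<span><!-- note -->hello</span>";
console.log(stripCommentsNaive(simple)); // <span>hello</span>

// Inside <script>, "<!--" is script text rather than a comment node,
// but the regex cannot tell the difference and eats part of the script.
const tricky = '<script>var s = "<!-- not a comment -->";</script>hello';
console.log(stripCommentsNaive(tricky)); // <script>var s = "";</script>hello
```

The second call silently changes the script's string literal, which is the kind of pitfall a nodeType-based filter avoids.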

Here is the jsfiddle

Update (Recursive filter)

Doing it recursively is actually quite easy:

http://jsfiddle.net/xYR5p/3/

Made an entire plugin for it:

$.fn.removeComments = function() {
    this.contents().filter(function() {
        return this.nodeType == 8;  
    }).remove();
    
    this.children().each(function() {
       $(this).removeComments(); 
    });
    
    return this;
};


console.log($(".welcome-msg").clone().removeComments().html());
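
For comparison, the same recursion can be sketched without jQuery, relying only on the standard childNodes, nodeType and removeChild members of DOM nodes. The function name removeCommentNodes is an assumption for illustration and is not part of the plugin above.

```javascript
// Hypothetical helper: recursively removes comment nodes below `node`.
// Works on anything exposing childNodes, nodeType and removeChild.
function removeCommentNodes(node) {
  // Copy the child list first, because removing nodes changes it.
  for (const child of Array.from(node.childNodes)) {
    if (child.nodeType === 8) {
      // 8 is Node.COMMENT_NODE
      node.removeChild(child);
    } else if (child.nodeType === 1) {
      // 1 is Node.ELEMENT_NODE: descend into nested tags
      removeCommentNodes(child);
    }
  }
  return node;
}
```

In a browser this could be called on a clone, for example removeCommentNodes(document.querySelector(".welcome-msg").cloneNode(true)).innerHTML.length, so the page itself stays untouched, much like the .clone() call in the plugin usage.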
d_inevitable
3
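var myhtml = $(".welcome-msg").html();
// The `s` (dotAll) flag lets `.` match newlines; it requires an ES2018+ engine.
myhtml = myhtml.replace(/<!--.*?-->/sg, "");
myhtml.length; // `length` is a property, not a method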

Regex taken from the Stack Overflow question "Remove HTML comments with Regex, in Javascript".
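
As a rough sketch of how this could produce the count the question asks for, the snippet below wraps the replace in a small function that works on any HTML string. The name lengthWithoutComments and the ignoreWhitespace option are assumptions for illustration; trimming is included because whitespace between comments would otherwise keep the result above 0, which the question implies is unwanted but does not state.

```javascript
// Hypothetical helper: length of an HTML string once comments are removed.
function lengthWithoutComments(html, ignoreWhitespace = true) {
  // [\s\S] is used instead of `.` with the `s` flag so older engines work too.
  const stripped = html.replace(/<!--[\s\S]*?-->/g, "");
  return (ignoreWhitespace ? stripped.trim() : stripped).length;
}

const onlyComments = "\n  <!-- first -->\n  <!-- second,\n       multi-line -->\n";
console.log(lengthWithoutComments(onlyComments));           // 0
console.log(lengthWithoutComments(onlyComments, false));    // 7 (leftover whitespace)
console.log(lengthWithoutComments("<!-- hi --><b>ok</b>")); // 9
```

With jQuery, the input would be $(".welcome-msg").html(). The caveats raised about regular expressions and <script> or <style> content still apply.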

Ryan
  • 1
    -1 Why `replace` and not just `.text()`. Seems like overkill. – iambriansreed Apr 09 '12 at 16:38
  • 3
    This seems oversimplified to me. What about this as an example:

    Here's some text I want to not be stripped

    – Dan A. Apr 09 '12 at 16:41
  • @DanA. are you being sarcastic? – iambriansreed Apr 09 '12 at 16:45
  • @iambriansreed Why do you think I'm being sarcastic? I'm just trying to point out a pitfall of trying to parse HTML with regular expressions. If you're looking for the definitive (and sarcastic) answer to this, look no further: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Dan A. Apr 09 '12 at 16:52
  • bobek asked about html comments inside a div. Sure .text() is more obvious, but this is a method that would work to do what he's asking. DanA. your comment isn't helpful and improbable, and doesn't address the problem. Why don't you offer a helpful solution, instead of criticizing those who are? – Ryan Apr 09 '12 at 16:55
  • @alightd i have to agree with DanA, because improbable is not good enough and hes being helpful by pointing it out. But, did you get a downvote or something? – d_inevitable Apr 09 '12 at 17:01
  • @alightd I never meant to criticize. I totally agree that your solution would work most of the time. However, if I can construct an example (regardless of how improbable it may be) where your answer fails, I think it's helpful for educational purposes to do so. Also, I'm endorsing the nodeType solution proposed by d_inevitable, so I am not adding my own answer. – Dan A. Apr 09 '12 at 17:08