I'm analysing a chunk of HTML in JavaScript so that I can trim an article down to roughly 30 characters and append something like '...READ MORE'. The problem is that the block of text is spaced out with BR tags. Here's my code:

<head runat="server">
<title></title>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript">
    window.onload = function onLoad() {
        alert("Start");
        var intros = this.document.getElementById('EventContent');
        if (intros) {
            for (i = 0; i < intros.length; i++) {

                var els = intros[i].childNodes;
                if (els) {
                    for (j = 0; j < els.length; j++) {
                        alert(els[j].tagName)
                    }
                }
            }
        }
        alert("Start");
    }

</script>
</head>
<body>
<form id="form1" runat="server">
<div id="EventContent">
<br />


<br />
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
<br />
</div>
</form>
</body>

But the alert with the tag name never displays, although the alerts before and after the loop both fire. What am I doing wrong? How can I catch those line breaks?

windowsgm

2 Answers

There is only one element returned by document.getElementById; don't try to loop over it.

window.onload = function() {
    alert("Start");

    var intro = document.getElementById('EventContent');
    var els = intro.childNodes;

    for (var i = 0; i < els.length; i++) {
        alert(els[i].tagName);
    }

    alert("Stop");
};

And as for your final goal of retrieving only text, try this:

function getText(element) {
    var i, c, r = '';

    for(i = 0; c = element.childNodes[i]; i++) {
        if(c.nodeType === 3) {
            r += c.nodeValue;
        }
    }

    return r.replace(/^\s+|\s+$/g, '');
}
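
To connect this back to the original goal of trimming the text and appending '...READ MORE', here is a minimal sketch. The name truncateText and its parameters are hypothetical (not from the question), and backing up to the last space so that no word is cut in half is an assumption about the desired behaviour.

```javascript
// Hypothetical helper: cut `text` to at most `limit` characters, then append `suffix`.
function truncateText(text, limit, suffix) {
    if (text.length <= limit) {
        return text; // already short enough, nothing to trim
    }
    var cut = text.slice(0, limit);
    // Back up to the last space so the final word is not chopped in half.
    var lastSpace = cut.lastIndexOf(' ');
    if (lastSpace > 0) {
        cut = cut.slice(0, lastSpace);
    }
    return cut + suffix;
}

// In the page from the question this would be called roughly as:
// truncateText(getText(document.getElementById('EventContent')), 30, '...READ MORE');
```

The function works on a plain string, so it can be fed the result of getText without touching the DOM again.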
Ry-
  • Interesting, let's say that I was chopping the text down to 200 characters. So now I require the `BR`s still in the text but I can't chop a `BR` if it happens to be the 200th character. How would you recommend going about that? – windowsgm Aug 27 '12 at 15:07
  • @killianmcc: Convert newlines to spaces and breaks to newlines, truncate, then replace the newlines with breaks again. – Ry- Aug 27 '12 at 16:00
  • Your code is actually perfect. Just came across a problem I have, though, if there is a `<p>` tag as well as `<br>`s. How would I incorporate getting rid of paragraphs in your regex? – windowsgm Aug 28 '12 at 11:36
  • nvm found the answer here, thanks again! :) http://stackoverflow.com/questions/4232961/jquery-remove-a-tag-but-keep-innerhtml – windowsgm Aug 28 '12 at 11:54
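
The recipe from the comments (newlines to spaces, breaks to newlines, truncate, newlines back to breaks) could be sketched roughly as below. The function name truncateKeepingBreaks is hypothetical, and the sketch assumes the HTML contains only text and BR tags; other tags and HTML entities are not handled.

```javascript
// Hypothetical helper following the recipe described in the comments.
// Assumes `html` contains only text and <br> tags.
function truncateKeepingBreaks(html, limit) {
    var text = html
        .replace(/\r?\n/g, ' ')           // 1. source newlines become spaces
        .replace(/<br\s*\/?>/gi, '\n');   // 2. each <br> becomes a single newline character
    var cut = text.slice(0, limit);       // 3. a break is now one character, so it cannot be split
    return cut.replace(/\n/g, '<br />');  // 4. newlines become breaks again
}
```

Because every break counts as exactly one character during the cut, the limit can never land in the middle of a tag.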

Your variable 'intros' is a single element, not an array, so intros.length is undefined and the outer loop never runs. You just need to iterate over intros.childNodes:

<head runat="server">
<title></title>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript">
    window.onload = function onLoad() {
        alert("Start");
        var intros = this.document.getElementById('EventContent');
        if (intros) {
            for (var i = 0; i < intros.childNodes.length; i++) {
                alert(intros.childNodes[i].tagName);
            }
        }
        alert("End");
    }

</script>
</head>
<body>
<form id="form1" runat="server">
<div id="EventContent">
<br />
<br />
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
<br />
</div>
</form>
</body>

That said, if you're using jQuery anyway, you might want to try it this way.

Alternatively, instead of iterating over the child nodes, I'd probably just get the contents of the div (div.innerHTML), use a regex to replace the <br />s with newlines, then use substring to split the first 30 characters off from the rest.
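
A rough JavaScript sketch of the innerHTML approach follows. The function name splitIntro and the shape of its return value are hypothetical, and the regex only targets BR tags; it is an illustration, not the code from the linked answer.

```javascript
// Hypothetical helper: replace <br> tags with newlines, then split the text
// into the first `length` characters and everything after them.
function splitIntro(html, length) {
    var text = html.replace(/<br\s*\/?>/gi, '\n'); // matches <br>, <br/>, <br /> in any case
    return {
        intro: text.substring(0, length), // the part to display
        rest: text.substring(length)      // the part hidden behind 'READ MORE'
    };
}

// In the page from the question this would be called roughly as:
// splitIntro(document.getElementById('EventContent').innerHTML, 30);
```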

Oh, and as suggested above, use a debugger.

Matthew Smith
  • Thanks for the reply, I'm leaning towards your method using innerHTML and substring, but I'm unsure how to apply the 'regex the <br />s' link to JavaScript, as that answer is actually written server side. How would I rewrite that regex expression in JavaScript? – windowsgm Aug 27 '12 at 15:17
  • Click on the link I supplied. I pointed you to another SE question. – Matthew Smith Aug 27 '12 at 23:01