4

I use substr() on a string to output a set of characters from a message. My problem is that if the string contains HTML, the tags end up in the output.

Is it possible to parse the HTML before calling substr(), so that only the text of the message itself is extracted?

Example 1:

var string = "Hi";

alert(string.substr(0, 7));

Alert:

Hi


Example 2:

var string = "<br/>Hi";

alert(string.substr(0, 7));

Alert:

<br/>Hi

Sam Pettersson
  • How does `string` get populated? You could do something like `$(string).text()`, but there's probably a much more efficient way that doesn't involve `jQuery`. – crush Jan 08 '14 at 16:55
  • 1
    Have a look at the accepted answer to this question: http://stackoverflow.com/questions/10585029/parse-a-html-string-with-js – richsilv Jan 08 '14 at 16:56

9 Answers

4

Using jQuery's .text() method, you can extract only the text.

var str = "<br/>Hi";

// Wrap the input in a <p> element first, so the text is returned even if the string isn't wrapped in HTML.
$('<p>' + str + '</p>').text();
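If jQuery is not available, the same idea (let the browser parse the markup, then read only the text) can be sketched with the standard DOMParser API. This is a minimal sketch, not part of the original answer: the function name htmlToText and its optional parser parameter are hypothetical, and the parameter exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: parse an HTML string and return only its text content.
// The optional `parser` argument is an assumption made for testability;
// in a browser it falls back to the built-in DOMParser.
function htmlToText(html, parser) {
  parser = parser || new DOMParser();
  // "text/html" makes the parser build a full HTML document from the string.
  const doc = parser.parseFromString(html, "text/html");
  // textContent concatenates the text nodes and leaves out the tags.
  return doc.body.textContent || "";
}
```

In a browser, htmlToText("<br/>Hi").substr(0, 7) would then give "Hi", the same result as the jQuery version above.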
crush
2

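You can remove the HTML with a regex (see the linked answer at the end of this post) and then do something like this:

// Adds a removeHtml() method to every string; it deletes anything that looks like a tag.
String.prototype.removeHtml = function () {
  return this.replace(/(<([^>]+)>)/ig, "");
};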

var string = "<br/>Hi";

alert(string.removeHtml().substr(0, 7));

Fiddle: http://jsfiddle.net/fx3MJ/

Remove HTML Tags in Javascript with Regex
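The same regex approach can also be written as a plain function, which avoids modifying String.prototype. The following is a minimal sketch; the names stripTags and firstChars are hypothetical and not from the linked answer. It uses slice() in place of substr(), which behaves the same for a start index of 0.

```javascript
// Hypothetical helper: remove anything that looks like an HTML tag.
// Caveat: a regex does not really parse HTML, so a ">" inside an
// attribute value or a comment can confuse it.
function stripTags(html) {
  return String(html).replace(/<[^>]*>/g, "");
}

// Hypothetical helper: strip the tags first, then keep the first `count` characters.
function firstChars(html, count) {
  return stripTags(html).slice(0, count);
}
```

With the strings from the question, firstChars("<br/>Hi", 7) and firstChars("Hi", 7) both return "Hi".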

Hattan Shobokshi
2

Yes, using this:

var string = "<br/>Hi";
var stringStripped = string.replace(/(<([^>]+)>)/ig,"");
alert(stringStripped);

http://jsfiddle.net/hutchonoid/7T56N/

hutchonoid
1

If your input is read from the DOM, you can use jQuery to get just the text. Example:

<p id="foo"><br />Hi</p>

var string = $('#foo').text();

alert(string.substr(0, 7)); //says "Hi"

See the jQuery documentation for reference.

Edit: If your input comes from somewhere else, you can still use jQuery, but there may be a better solution.

var input = "<br />Hi";
var fakeElement = $('<div>' + input + '</div>');

var string = fakeElement.text();

alert(string.substr(0, 7)); //says "Hi"
crush
Johnny5
1

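You could strip the HTML tags with a regex pattern:

// Use a regex literal with the g flag; a pattern passed as a string is matched literally in JavaScript.
var string = string.replace(/<[^>]*>/g, "");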
makallio85
1

If you want to get the text including the HTML tags, use the DOM property .innerHTML on the underlying element (jQuery's equivalent is .html()).

<p id="foo"><br />Hi</p>


$("#foo")[0].innerHTML;

Console: "<br>Hi"

Mephisto07
0

You could leverage jQuery's text() function:

 $('<div>').append('<br />hi').text();

This creates a new element, to which you can append your markup, and then uses text() to extract the text. A little roundabout, but it works.

I'd recommend a regex solution over this.

Mister Epic
0

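Try this:

// Decodes common HTML entities (it does not strip tags); &amp; is decoded last to avoid double-decoding.
var string = "<br/>Hi".replace(/&#39;/g, '\'').replace(/&apos;/g, '\'').replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&quot;/g, '"').replace(/&amp;/g, '&');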
alert(string);
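Entity decoding and tag removal address different parts of the problem, so they can be combined before truncating. The sketch below is only an illustration under stated assumptions: the names ENTITIES, decodeEntities and previewText are hypothetical, and only the handful of entities listed in this answer are handled.

```javascript
// Hypothetical lookup table covering only the entities used in this answer.
const ENTITIES = {
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&apos;": "'",
  "&#39;": "'",
  "&amp;": "&"
};

// Decode in a single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
function decodeEntities(text) {
  return text.replace(/&(?:lt|gt|quot|apos|#39|amp);/g, function (match) {
    return ENTITIES[match];
  });
}

// Strip tags first, then decode entities, then keep the first `length` characters.
function previewText(html, length) {
  const withoutTags = html.replace(/<[^>]*>/g, "");
  return decodeEntities(withoutTags).slice(0, length);
}
```

For example, previewText("<b>Tom &amp; Jerry</b>", 7) returns "Tom & J". Stripping tags before decoding means that an escaped tag such as "&lt;br&gt;" in the message is kept as visible text instead of being removed.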
Belvi Nosakhare
0

Try this

var string = "<br/>Hi";

alert($('<span>'+string+'</span>').text().substr(0, 7));

FIDDLE

Pranav C Balan