
If I create, for example, a page with the following content:

<body>
<p>
ABC DEF<p>GHI</p>
JKL<br>MNO
</p>
</body>

Then I get in the browser:

ABC DEF

GHI

JKL
MNO

But when I use $('body').text(), I get back:

ABC DEFGHI
JKLMNO

Is it possible to add a space between the elements, so that 'DEFGHI' and 'JKLMNO' each become two words instead of one?

Here is a link to a jsfiddle example.

Oliver

1 Answer


Use html() instead of text(), and then replace the <br> and <p> tags with newlines (or spaces).

var text = $('body').html();
var str = text.replace(/<br\s*\/?>/gi, '\r\n'); // <br>, <br/> and <br /> to newline
var str2 = str.replace(/<\/?p(\s[^>]*)?>/gi, '\r\n'); // <p ...> and </p> to newline (does not match <pre>, <param>, ...)
var str3 = str2.replace(/  +/g, ' '); // collapse runs of spaces into one

console.log(str3)

This gives the same structure as the rendered HTML, but as plain text:

ABC DEF

GHI

JKL
MNO

If you replace the tags with a space ' ' instead of '\r\n', you get:

ABC DEF GHI 
JKL MNO
Alon Adler
  • Thank you Alon for your answer. I also tested .html(), but for '&' it returns '&amp;'. So, for example, 'A&B' becomes 'A&amp;B', and I would need to replace all of these entities. Or is there an easier way to keep '&'? – Oliver Oct 05 '20 at 22:47
  • Generally, it is not good practice to [parse HTML with Regex](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags). Your question covers a limited scenario, since you asked about two specific tags. For this kind of task, where only a small set of tags needs to be replaced, [Regex replacement of the tags](https://stackoverflow.com/a/1733489/908624) can be just fine. If that is not the case, I would suggest parsing the HTML with `DOMParser` and using selectors to get what you want, [like this answer](https://stackoverflow.com/a/20767587/908624), for example. – Alon Adler Oct 05 '20 at 23:01
  • Ok I see. Thank you. – Oliver Oct 05 '20 at 23:26
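
One possible way to get spaces between elements without touching the markup string (and so without having to decode '&amp;' by hand) is to walk the already parsed DOM and read the text nodes directly. The following is only a minimal sketch, not part of the answer above: the function name `collectText` and the `BREAK_TAGS` set are hypothetical names chosen for illustration, and the choice of which tags count as a word break is an assumption you would adjust to your page.

```javascript
// Tags that should act as a word break. This set is an illustrative
// assumption; extend it (LI, TD, H1, ...) to suit your markup.
const BREAK_TAGS = new Set(['P', 'BR', 'DIV']);

// Walks a DOM element (or any object with the same nodeType / nodeName /
// nodeValue / childNodes shape) and returns its text with a single space
// wherever a break tag starts or ends.
function collectText(root) {
  const parts = [];

  (function walk(node) {
    if (node.nodeType === 3) {        // 3 = text node: value is already entity-decoded
      parts.push(node.nodeValue);
      return;
    }
    if (node.nodeType !== 1) return;  // 1 = element; skip comments and other node types

    const isBreak = BREAK_TAGS.has(String(node.nodeName).toUpperCase());
    if (isBreak) parts.push(' ');
    for (const child of Array.from(node.childNodes || [])) walk(child);
    if (isBreak) parts.push(' ');
  })(root);

  // Collapse all whitespace runs (including newlines from the source) and trim.
  return parts.join('').replace(/\s+/g, ' ').trim();
}
```

In a browser this could be called as `collectText(document.body)`, or on a string via `collectText(new DOMParser().parseFromString('<p>A&amp;B<br>C</p>', 'text/html').body)`, which returns 'A&B C'. Note that this sketch flattens everything onto one line and would also include the contents of script or style elements if they sit inside the root.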