
I'm grabbing the text from an element in essentially this manner:

var paragraphElement = document.getElementById("para");
var paragraphText = paragraphElement.innerText;

However, I don't know how to grab the line-break characters from the innerText. Or does innerText lose this information, so that I need to extract it another way?

WFDr
  • Hi! I'm afraid it's not clear what you're asking. Are you saying there are line break characters actually in the text, or that you want to know where the line breaks are occurring because of word-wrap? Please update your question with a **runnable** [mcve] demonstrating the problem using Stack Snippets (the `[<>]` toolbar button; [here's how to do one](https://meta.stackoverflow.com/questions/358992/)). – T.J. Crowder Sep 09 '18 at 10:48
  • If you're talking about the newline character, it's `\n`. – Chris Li Sep 09 '18 at 10:49
  • Yes, I want to find the newline characters, but when I console.log my paragraphText variable it doesn't have the \n, it just has plain text. – WFDr Sep 09 '18 at 10:50
  • Yes T.J. I would like to find where the newline character is occurring in the paragraph as it is represented on the page and extract this information. Sorry for my bad wording. – WFDr Sep 09 '18 at 10:51
  • The line-breaks are there (as invisible newline characters). To make them visible in HTML, you need to apply `white-space: pre-wrap;` to the element displaying the text. – connexo Sep 09 '18 at 10:53
  • Thank you connexo. I will try to make a 'stacksnip' with this in it and hopefully this will solve the problem and I will give you a green tick and an up arrow afterwards. – WFDr Sep 09 '18 at 10:55
  • Sorry for this but I cannot upvote you at my current reputation. I would like to use these but I cannot see a newline and I need to convert to ellipsis (I don't need people to do that part I can do it) – WFDr Sep 09 '18 at 11:02
  • Your question asks for something different from what the marked answer answers. Your question is not "how to grab line-break" but "how to grab `<br>` tags" if that answer is correct. – connexo Sep 09 '18 at 11:18
  • Oh? Sorry. It output the \n so I thought it was right. I don't think that answer would work then. My brain is very full right now, is it ok if I take a rest and look at your answers later? I need to understand how it all works. – WFDr Sep 09 '18 at 11:24

2 Answers


innerHTML gets or sets the HTML or XML markup contained within the element.

Are you looking for something like this? If not, please let me know.

// Split the paragraph's markup on <br> tags and print each break as a visible "\n".
var paragraphElement = document.getElementById("para");
var withBreakLines = paragraphElement.innerHTML.split('<br>');
var output = '';
for (var i = 0; i < withBreakLines.length; i++) {
  // "\\n" is a backslash followed by the letter n (visible text), not a real newline.
  output = output + withBreakLines[i] + "\\n" + " ";
}
console.log(output);
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width">
  <title>JS Bin</title>
</head>
<body>
  <p id="para">hfjksdjhiusdfj<br>ginins<br>sdfsdfsd</p>
</body>
</html>
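
If the goal is a string containing real newline characters rather than a printed `\n`, a minimal sketch along the same lines is shown below. The helper name `brToNewline` is hypothetical, and the pattern is an assumption that only covers plain `<br>`, `<br/>` and `<br />` tags; in the page it would be fed `paragraphElement.innerHTML`.

```javascript
// Hypothetical helper: turn <br> tags in a markup string into real newlines.
// Assumes simple markup such as the #para example above.
function brToNewline(html) {
  return html.replace(/<br\s*\/?>/gi, "\n");
}

const markup = "hfjksdjhiusdfj<br>ginins<br/>sdfsdfsd";
const text = brToNewline(markup);

// JSON.stringify shows the newline as a visible escape sequence.
console.log(JSON.stringify(text)); // "hfjksdjhiusdfj\nginins\nsdfsdfsd"
```

The resulting string can then be searched or split on `"\n"` like any other text.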
chintuyadavsara
  • I like this code but I would like to see the \n or the Unicode character on the screen. – WFDr Sep 09 '18 at 10:59
  • Where do you want to see the `\n` ? from the inner text of the element? – chintuyadavsara Sep 09 '18 at 11:01
  • Anywhere that I can see it. I will need to convert it to an ellipsis after and do some other logic so it skips n \ns first (pun initially not intended), but as long as I can see a visual representation of the \n I can take things from there. – WFDr Sep 09 '18 at 11:04
  • So, like, "Paragraph\nbreak\nstatement\n" would be really nice to see. Although not sure if the last \n would really exist? – WFDr Sep 09 '18 at 11:06
  • Thank you ChintuYadavSara. I am sorry for being a bother to you. – WFDr Sep 09 '18 at 11:12
  • Thank you ChintuYadavSara, this is wonderful. Exactly what I needed! – WFDr Sep 09 '18 at 11:12
  • @WFDr is this not the right solution? Why did you remove it as the accepted answer? Let me know if I can help you – chintuyadavsara Sep 09 '18 at 11:26
  • connexo pointed out that the answer is grabbing `<br>` elements, which I didn't notice because I am very full of thoughts at the moment. I think what I need is the Unicode character for newline extracted from the HTML. Sorry for this. – WFDr Sep 09 '18 at 11:27
  • Please do not stress yourself too much ChintuYadavSara – WFDr Sep 09 '18 at 11:36
  • @WFDr did you get the correct answer? If yes, I would have ignored it. And thanks for your concern :) – chintuyadavsara Sep 09 '18 at 11:37
  • The original user was using an old section of CSS Tip and Tricks website to insert ellipses after a specific number of lines, on linebreak. This was using old, deprecated, webkit only CSS. I am trying to fix it for them properly using JavaScript, which is the right way to do it. I got all of the rest of the code in place, the only thing I was stuck on was grabbing the linebreak itself. I have the code on another computer that is my main work laptop. I am on my home computer at the moment. Not sure if that info helps. – WFDr Sep 09 '18 at 11:39
  • At the moment this is grabbing `<br>` elements, which I can do. What I want, ideally, is to be able to see the full string in Unicode (or a similar format) and manipulate each element individually. – WFDr Sep 09 '18 at 11:45

The line-breaks are there (as invisible newline characters). To make them visible in HTML, you need to apply white-space: pre-wrap; to the element displaying the text.

To find the linebreaks, use this regular expression:

/(\r\n|\n|\r)/gm

text.addEventListener('input', e => {
  test.textContent = text.value.replace(/(\r\n|\n|\r)/gm,"…");
})
#test { white-space: pre-wrap; border : 1px solid #999; min-height: 1em; }
<textarea style="width: 400px; height: 120px" id="text"></textarea>
<h2>Text will show up in the subsequent div when you type</h2>
<div id="test"></div>
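
To locate the line breaks instead of replacing them, the same pattern can be used with `String.prototype.matchAll`. This is only a sketch: the helper name `findLineBreaks` and the sample string are assumptions for illustration, not part of the snippet above.

```javascript
// Hypothetical helper: return the index of every line break in a string.
// "\r\n" is listed first so a Windows line ending counts as one break.
function findLineBreaks(text) {
  const positions = [];
  for (const match of text.matchAll(/\r\n|\n|\r/g)) {
    positions.push(match.index);
  }
  return positions;
}

const sample = "first line\nsecond line\r\nthird line";
console.log(findLineBreaks(sample)); // [ 10, 22 ]
```

With the positions known, the string can be sliced or rebuilt however the ellipsis logic requires.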
connexo
  • This is nice but I cannot see the newline? Could it output as a \n or the appropriate Unicode character? – WFDr Sep 09 '18 at 10:59
  • Why do you need to see it? It renders properly without being visible. – connexo Sep 09 '18 at 10:59
  • Added code to make linebreaks visible by replacing them by `\\n\n`. Or do you want the actual linebreaks to be removed completely and simply replaced by a character? – connexo Sep 09 '18 at 11:04
  • I would like to capture the \n and use it in a controller and to help the other user convert the \ns to ellipses. At the moment they are using a deprecated, webkit only hack to do the conversions which isn't good tbh – WFDr Sep 09 '18 at 11:07
  • Modified code so linebreaks get replaced by an ellipsis `…` character. – connexo Sep 09 '18 at 11:07
  • Sure they are. Enter text in the textarea, press enter, more text: `sdfgdfgdfgd…dfgdsgs…sdfgsdfg…` – connexo Sep 09 '18 at 11:15
  • Seems to have been a caching issue. I'll upvote when I hit the right rep. Thank you very much for the time you have spent here. Much appreciated – WFDr Sep 09 '18 at 11:18
  • You asked *how to grab **linebreaks***, why do you pick an answer that replaces `<br>` tags? That's not what you asked for. – connexo Sep 09 '18 at 11:22
  • I am sorry for this connexo. I saw \n in the output and assumed it was really grabbing \ns. I am sorry for this. Maybe what I really need is Unicode output. I have many things to think about at the moment. I am sorry for this. – WFDr Sep 09 '18 at 11:29
  • Your code seems a little complicated for what I need. Could you isolate the part where the linebreak becomes available and manipulable? Is it something to do with your event listener? (also, if you wish to move to chat you can but I don't have enough rep) – WFDr Sep 09 '18 at 11:52
  • Complicated? A single line of code appears complicated to you? – connexo Sep 09 '18 at 12:03
  • Sometimes, yes, strangely. What's the event listener for? I don't understand how listening for text input relates to the query. – WFDr Sep 09 '18 at 12:05
  • Assuming you have the element with the text in a variable named `text`, you can get the text inside the element and replace newline characters with ellipsis using `text.value.replace(/(\r\n|\n|\r)/gm,"…");` – connexo Sep 09 '18 at 12:09
  • So it's the regular expression that exposes the newline character for manipulation? – WFDr Sep 09 '18 at 12:37
  • The regular expression is used to **find** the newline characters. The `String.prototype.replace()` method takes two arguments: the first is a string to replace (or a regular expression that matches the string to replace), and the second is the string to replace it with. The `g` flag tells `replace()` to replace not only the first occurrence, but all of them (the `m` flag only changes how `^` and `$` match and has no effect on this pattern). – connexo Sep 09 '18 at 12:41
  • Yes, I'm aware of the replace function, I just didn't realise that it exposes the newline character when inner text is parsed. Or is it when inner HTML is parsed? Or do you need that CSS as well? It still seems a bit complicated to me. TBH I'd rather know how to grab the raw Unicode, but maybe that's just how my brain works. – WFDr Sep 09 '18 at 12:43
  • The `textContent` property gets you the concatenated text nodes of an element; HTML is stripped out. The `innerHTML` property gets you the content of an element including all HTML tags. On the differences between the non-standard `innerText` and the standard `textContent` read up on https://stackoverflow.com/questions/35213147/difference-between-textcontent-vs-innertext - essentially innerText will only get visible text (that does not have `display: none` or `visibility: hidden`) whereas textContent will get all text nodes. – connexo Sep 09 '18 at 12:45
  • I tried outputting the innerHTML to the console in a JSFiddle and it just outputs the raw string without linebreak character – WFDr Sep 09 '18 at 13:18
  • You need to understand that the linebreak character *is not visible*. It's hidden in the binary representation of the text you see. There is no visible `\n` inside it - `\n` is just the regular expression term that finds you these linebreaks, or that you can add to a text so a text parser like the browser knows where to insert linebreaks. – connexo Sep 09 '18 at 13:19
  • In which case I wish to gain access to the level at which this data is visible. Particularly since Manipulating HTML With Regular Expressions Is Bad™ – WFDr Sep 10 '18 at 20:06
  • Honestly I have no idea what you want. – connexo Sep 11 '18 at 05:47
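
For readers who, like the asker, want to see the otherwise invisible newline in a string, one possible sketch is below. The helper names `showEscapes` and `codePoints` are hypothetical; in the page the input would be something like `paragraphElement.textContent`, which is an assumption about the asker's setup.

```javascript
// Hypothetical helper: JSON.stringify renders control characters
// such as the newline as visible escape sequences.
function showEscapes(text) {
  return JSON.stringify(text);
}

// Hypothetical helper: list the Unicode code point of every character.
function codePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

const s = "a\nb";
console.log(showEscapes(s)); // "a\nb"
console.log(codePoints(s));  // [ 'U+0061', 'U+000A', 'U+0062' ]
```

The newline appears as `U+000A` in the list. The string itself contains that single character, not a backslash followed by `n`.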