13

In TinyMCE, is there any way to get the content as plain text instead of HTML?

Siva Arunachalam

3 Answers

31

Try this:

var myText = tinyMCE.activeEditor.selection.getContent({ format: 'text' });
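Note that `selection.getContent` returns only what is currently selected in the editor. If the goal is the text of the whole document, the editor's own `getContent` takes the same `{ format: 'text' }` option. A minimal sketch of a wrapper covering both cases follows; `getPlainText` and its `selectionOnly` parameter are hypothetical names used only for illustration, not part of TinyMCE.

```javascript
// Hypothetical helper (not part of TinyMCE): returns plain text from an
// editor instance, either for the whole document or for the selection only.
function getPlainText(editor, selectionOnly = false) {
  // Both the editor and its selection object expose getContent(),
  // and both accept the { format: 'text' } option.
  const source = selectionOnly ? editor.selection : editor;
  return source.getContent({ format: 'text' });
}
```

On a page where TinyMCE is loaded, it would be called as `getPlainText(tinyMCE.activeEditor)` for the full content, or `getPlainText(tinyMCE.activeEditor, true)` for the selection.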
luchopintado
7
var rawtext = tinyMCE.activeEditor.getBody().textContent;
jbr
Kapil Bodke
    The problem with this one is that it converts the line break to `''`, in other words to nothing. This causes words to collapse, as in `[...] end of line with not periodAnother paragraph`.
    – jimasun Feb 27 '17 at 16:51
2

I just tried this approach:

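editor.getContent()
   .replace(/<[^>]*>/g, ' ')          // replace every tag, opening or closing, with a space
   .replace(/&nbsp;|&#160;/gi, ' ')   // replace non-breaking-space entities with a space
   .replace(/\s+/g, ' ')              // collapse runs of whitespace into a single space
   .trim();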
  • Replaces both opening and closing HTML tags with a space
  • Replaces the known non-breaking-space entities with a space (add others as needed)
  • Replaces runs of whitespace with a single space

It worked reasonably well, but it is obviously not perfect. I only need an approximation of the plain text for word counting, so I am willing to ignore corner cases such as part of a word being bold or italic (the replacement above turns `<b>a</b><i>x</i>` into two separate words, `a x`, instead of `ax`).
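For word counting, the same replacements can be wrapped in a function that works on any HTML string, which also makes the corner case easy to check outside the editor. This is only a sketch; `approximatePlainText` and `countWords` are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper: approximates plain text from an HTML string
// by applying the same replacements as the chain above.
function approximatePlainText(html) {
  return html
    .replace(/<[^>]*>/g, ' ')          // any tag, opening or closing, becomes a space
    .replace(/&nbsp;|&#160;/gi, ' ')   // non-breaking-space entities become a space
    .replace(/\s+/g, ' ')              // collapse runs of whitespace
    .trim();
}

// Hypothetical helper: counts whitespace-separated words in the approximation.
function countWords(html) {
  const text = approximatePlainText(html);
  return text === '' ? 0 : text.split(' ').length;
}

console.log(approximatePlainText('<p>Hello&nbsp;<b>world</b></p><p>Again</p>')); // "Hello world Again"
console.log(countWords('<b>a</b><i>x</i>')); // 2, the corner case: "a x" rather than "ax"
```

In the editor it would be called as `countWords(editor.getContent())`.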

It is an extension of the approach in "Regular expression to remove HTML tags from a string".

Hope that helps.

Dmitry Frenkel
    Beware regex for parsing HTML. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 The pony, he comes. – MushinNoShin Jan 07 '16 at 13:40