
I have a requirement to convert plain text to and from RTF (Rich Text Format) using JavaScript.

I am looking for one function per conversion, and I do not want to use a library.

Conversion from plain to RTF

The formatting styles and colours are not important. All that matters is that the plain text is converted into valid RTF.

Conversion from RTF to plain

Again, the styles are not important. They can be completely removed. All that is required is that all text data remains (no loss of entered data).

musefan

2 Answers

13

I found a C# answer here which was a good starting point, but I needed a JavaScript solution.

There is no guarantee that these are 100% reliable, but they seem to work well with the data I have tested them on.

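function convertToRtf(plain) {
    // Escape RTF's special characters first so literal \, { and } stay valid.
    plain = plain.replace(/[\\{}]/g, "\\$&");
    // Each line break becomes a paragraph mark.
    plain = plain.replace(/\r?\n/g, "\\par\n");
    return "{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang2057{\\fonttbl{\\f0\\fnil\\fcharset0 Microsoft Sans Serif;}}\n\\viewkind4\\uc1\\pard\\f0\\fs17 " + plain + "\\par\n}";
}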

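function convertToPlain(rtf) {
    // Drop \par and \pard, but leave longer control words that merely start with them for the next step.
    rtf = rtf.replace(/\\pard?(?![A-Za-z])/g, "");
    // Strip simple groups (such as font table entries), stray braces and remaining control words.
    return rtf.replace(/\{\*?\\[^{}]+}|[{}]|\\\n?[A-Za-z]+\n?(?:-?\d+)?[ ]?/g, "").trim();
}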

Here is a working example of them both in action
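
One gap in the plain-to-RTF direction is non-ASCII input. The sketch below is one possible way to cover it, not part of the tested functions above: it writes every non-ASCII UTF-16 code unit as an RTF `\uN?` escape, where `?` is the fallback character for readers that do not understand `\u`. The name `convertToRtfUnicode` is hypothetical, and the header is simply reused from `convertToRtf`.

```javascript
// Hypothetical helper, not part of the original answer.
// Assumes the header's \uc1 setting, so each \uN is followed by one fallback character.
function convertToRtfUnicode(plain) {
    let body = "";
    for (let i = 0; i < plain.length; i++) {
        const ch = plain[i];
        const code = plain.charCodeAt(i);
        if (ch === "\r") {
            // Assumption: \r only appears as part of \r\n, so the \n handles the break.
            continue;
        }
        if (ch === "\\" || ch === "{" || ch === "}") {
            // RTF special characters must be escaped with a backslash.
            body += "\\" + ch;
        } else if (ch === "\n") {
            body += "\\par\n";
        } else if (code < 128) {
            body += ch;
        } else {
            // \uN takes a signed 16-bit decimal, so values above 32767 become negative.
            const signed = code > 32767 ? code - 65536 : code;
            body += "\\u" + signed + "?";
        }
    }
    return "{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang2057{\\fonttbl{\\f0\\fnil\\fcharset0 Microsoft Sans Serif;}}\n\\viewkind4\\uc1\\pard\\f0\\fs17 " + body + "\\par\n}";
}
```

Because the loop walks UTF-16 code units, a character outside the Basic Multilingual Plane is written as two consecutive `\uN?` escapes, one per surrogate.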

musefan
  • or [here](https://github.com/lazygyu/RTF-parser) you've got a javascript RTF parser by lazygyu – Kaiido Apr 28 '15 at 15:48
  • I would give you 100 upvotes for this if I could, thank you! – user3897392 Jul 21 '16 at 19:07
  • This won’t work well for Unicode input (non-English chars, typographic quotation marks, etc). – mirabilos Jun 14 '17 at 14:52
  • @mirabilos: Yeah I would have expected there would be problems with things like that. Unfortunately I don't know enough about RTF format to provide anything more robust – musefan Jun 14 '17 at 15:27
  • rtf.replace(/\\'[0-9a-zA-Z]{2}/g, "").trim(); will remove some of the unwanted unicode characters – Tjad Clark Jun 15 '17 at 17:30
  • @TjadClark those Unicode characters are **wanted** and **needed** to be correctly represented in the output – mirabilos Jun 16 '17 at 11:33
  • I suppose it varies from app to app; that was just a general expression for unicode hex. My application seems to let me ignore that. Thanks for the heads up though. – Tjad Clark Jun 16 '17 at 19:52
0

Adding to musefan's answer so that it also strips hex-escaped characters (`\'hh`):

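function convertToPlain(rtf) {
    rtf = rtf.replace(/\\pard?(?![A-Za-z])/g, "");
    rtf = rtf.replace(/\{\*?\\[^{}]+}|[{}]|\\\n?[A-Za-z]+\n?(?:-?\d+)?[ ]?/g, "");
    // Remove \'hh hex escapes. Note that this discards those characters rather than decoding them.
    return rtf.replace(/\\'[0-9a-fA-F]{2}/g, "").trim();
}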
Tjad Clark