How to check for empty text from a rich text editor?

I have a rich text editor, similar to the one I am typing in now.

By default, the value is set to <br>, so in Java, when I call request.getParameter("desc"), I get the value <br>.

I want to check for an empty string, treating content that consists only of HTML tags, such as <br> or <hr>, as empty.

Is this possible?

Andrew Thompson
Nithy
  • Yes, it is. What have you tried so far? – Marco Forberg May 27 '13 at 10:58
  • I tried `descStr = StringEscapeUtils.escapeHtml(descStr);`, but this converts a `<br>` to `&lt;br&gt;`. I was expecting an empty string or a blank line, so that I could check `descStr.length() == 0` to see whether it is empty, so this didn't work for me. Is there an easy way to fix this? If not, I will have to change the solution.
    – Nithy May 27 '13 at 11:14
  • @Andrew, would you recommend storing the rich text editor's content in a hidden field as the user types and passing that to the backend? I have never tried it; it just came to mind. Or would the result be the same anyway? – Nithy May 27 '13 at 11:26

2 Answers

4

Use an HTML parser like Jsoup.

// Requires: import org.jsoup.Jsoup;
// Parse the submitted HTML and keep only its text content.
String text = Jsoup.parse(html).text();

if (text.isEmpty()) {
    // No text.
}

An additional advantage is that it can also help you sanitize HTML to avoid XSS attacks when a malicious end user enters e.g. a <script> in your text area. You were also checking for that, right?

BalusC
  • Thanks for your quick reply. That was a precise solution, and it's nice to know about that library. Cleaning HTML was the other issue I had and was thinking of posting next, so you saved my night. Thanks again. – Nithy May 28 '13 at 08:01
1

Maybe simple-minded, but you could just remove all tags (this also removes images and buttons).

public static boolean isEmpty(String text) {
    return text.replaceAll("<[^>]+>", "").trim().isEmpty();
}

Maybe add a replaceAll to remove whitespace and line breaks as well.

This assumes that a non-tag < is given as the entity &lt;.
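A minimal sketch of the regex approach above with the whitespace handling folded in. The class name `RichTextBlankCheck`, the method name `isBlankHtml`, the null handling and the treatment of non-breaking spaces are all assumptions for illustration, not part of the original answer. Some editors submit something like `<p>&nbsp;</p>` for an empty field, which plain tag stripping would report as non-empty.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; the class and method names are illustrative only. */
final class RichTextBlankCheck {

    // Anything that looks like a tag: "<", one or more non-">" characters, ">".
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Non-breaking spaces, written as an entity or as the raw character U+00A0.
    private static final Pattern NBSP = Pattern.compile("&nbsp;|&#160;|\\u00A0");

    private RichTextBlankCheck() {
    }

    /** Returns true when the markup contains no visible text once tags are removed. */
    static boolean isBlankHtml(String html) {
        if (html == null) {
            return true; // assumption: a missing request parameter counts as empty
        }
        String withoutTags = TAG.matcher(html).replaceAll("");
        // Turn non-breaking spaces into ordinary spaces so that trim() drops them.
        String normalised = NBSP.matcher(withoutTags).replaceAll(" ");
        // trim() also removes tabs, carriage returns and line breaks at both ends.
        return normalised.trim().isEmpty();
    }
}
```

With the request parameter from the question, this could be called as `RichTextBlankCheck.isBlankHtml(request.getParameter("desc"))`. Like the one-liner above, it still relies on a literal `<` in the text arriving as `&lt;`, and it is not a substitute for sanitizing the HTML.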

Joop Eggen
  • That was interesting. I combined the Jsoup approach suggested by BalusC, to check embedded HTML content, with your suggestion to clean up the tags in the text, and that did the trick. Thanks for this. But I think Jsoup has this built in already, using `Whitelist` or so, and seems to offer a clean way to strip the HTML tags, so I will proceed with that solution. Thanks for your response too. – Nithy May 28 '13 at 08:28
  • Is there a solution for not skipping images? – ItsTheBat Oct 01 '20 at 07:58
  • @Vi_Hari the HTML tags for images (`<img>`) could be kept with a regex negative lookahead on "img", but that does not turn them into a normal image, an RTF image or the like. Either convert to HTML again by prepending `""`, or convert them to RTF (if that is your intended format); https://stackoverflow.com/questions/1490734/programmatically-adding-images-to-rtf-document – Joop Eggen Oct 01 '20 at 12:57
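
The negative lookahead mentioned in the last comment could look roughly like the sketch below, so that content holding only an image no longer counts as empty. The class and method names are hypothetical, and matching the tag name as `img` followed by a word boundary is an assumption about what "keeping images" should mean here.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; the names are illustrative only. */
final class RichTextImageAwareCheck {

    // Matches every tag except <img ...>: the negative lookahead (?!img\b)
    // rejects a "<" that is immediately followed by the tag name "img".
    private static final Pattern NON_IMG_TAG =
            Pattern.compile("<(?!img\\b)[^>]+>", Pattern.CASE_INSENSITIVE);

    private RichTextImageAwareCheck() {
    }

    /** Removes all tags but leaves <img> tags in place. */
    static String stripNonImageTags(String html) {
        return NON_IMG_TAG.matcher(html).replaceAll("");
    }

    /** Returns true when neither text nor an <img> tag remains. */
    static boolean isEmptyExceptImages(String html) {
        return stripNonImageTags(html).trim().isEmpty();
    }
}
```

This only decides whether an image is present. It does not validate the `src` attribute or convert the image to any other format.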