Check for empty text from a rich text editor

Question

How to check for empty text from a rich text editor?

I have a rich text editor, similar to the one I am typing in now.

By default, its value is set to <br>, so in Java, when I call request.getParameter("desc"), I get <br> back.

I want to check for an empty string, treating content that consists only of HTML tags, such as <br> or <hr>, as empty.

Is this possible?

Tried `descStr = StringEscapeUtils.escapeHtml(descStr);`, but this converts a `<br>` to `&lt;br&gt;`. I was expecting an empty string or a blank line, so that I could check `descStr.length() == 0` to see whether it is empty, so this didn't work for me. Is there an easy way to fix this? If not, I will have to change the solution. – Nithy, May 27 '13 at 11:14
@Andrew, would you recommend storing the rich text editor's content in a hidden field as the user types and passing that to the backend? I have never tried it; it just came to mind. Or would the result be the same anyway? – Nithy, May 27 '13 at 11:26

score 4 · Accepted Answer · answered May 27 '13 at 13:56


Use an HTML parser like Jsoup.

String text = Jsoup.parse(html).text();

if (text.isEmpty()) {
    // No text.
}

An additional advantage is that it can also help you sanitize HTML to avoid XSS attacks when a malicious end user enters e.g. a <script> in your text area. You were also checking for that, right?
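As a rough sketch of both points together (the emptiness check and the sanitizing), something like the following could work. It assumes jsoup 1.14.1 or later on the classpath, where the allow-list class is called `Safelist` (older releases call it `Whitelist`). The class and method names `RichTextInput`, `hasNoText` and `sanitize` are made up for illustration, and `Safelist.basic()` is only one possible policy.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class RichTextInput {

    /** Returns true when the submitted HTML contains no visible text. */
    public static boolean hasNoText(String html) {
        if (html == null) {
            return true;
        }
        // text() drops all tags and normalizes whitespace.
        String text = Jsoup.parse(html).text();
        // Depending on the jsoup version, non-breaking spaces (&nbsp;, U+00A0)
        // may survive text(), so they are replaced explicitly before trimming.
        return text.replace('\u00A0', ' ').trim().isEmpty();
    }

    /** Keeps only the simple formatting tags allowed by Safelist.basic(). */
    public static String sanitize(String html) {
        return Jsoup.clean(html, Safelist.basic());
    }
}
```

With this, `hasNoText(request.getParameter("desc"))` would be the check in the servlet, and `sanitize(...)` would run on the value before it is stored or rendered. Note that an input containing only an image counts as "no text" here.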


BalusC


Thanks for your quick reply. That was a precise solution, and it's nice to know about that library. Cleaning HTML was the other issue I had and was going to post about next, so you saved my night. Thanks again. – Nithy May 28 '13 at 08:01

score 1 · Answer 2 · answered May 27 '13 at 12:17


Maybe simple-minded, but you could just remove all tags (this also removes images and buttons).

public static boolean isEmpty(String text) {
    return text.replaceAll("<[^>]+>", "").trim().isEmpty();
}

Maybe add a replaceAll that removes whitespace and line breaks.

Assumes that a non-tag < is given as the entity &lt;.
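A slightly extended sketch of the same regex idea, using only the standard library. The class name `HtmlEmptyCheck` and the extra handling of non-breaking spaces are additions for illustration, on the assumption that the editor may emit `&nbsp;` in otherwise empty content. Like the version above, it relies on a literal < in the text arriving as `&lt;`.

```java
import java.util.regex.Pattern;

public class HtmlEmptyCheck {

    // Any tag-like run: '<', one or more characters other than '>', then '>'.
    private static final Pattern TAGS = Pattern.compile("<[^>]+>");

    // Non-breaking spaces, as named or numeric entities or as the raw character.
    private static final Pattern NBSP =
            Pattern.compile("&nbsp;|&#160;|&#xA0;|\\u00A0", Pattern.CASE_INSENSITIVE);

    public static boolean isEmpty(String text) {
        if (text == null) {
            return true;
        }
        String withoutTags = TAGS.matcher(text).replaceAll("");
        String withoutNbsp = NBSP.matcher(withoutTags).replaceAll(" ");
        // trim() removes leading and trailing spaces, tabs and line breaks.
        return withoutNbsp.trim().isEmpty();
    }
}
```

Other entities such as `&lt;` are left untouched, so text like `a &lt; b` still counts as non-empty.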


Joop Eggen


That was interesting. I combined the Jsoup approach suggested by BalusC to check the embedded HTML content with your suggestion to clean up the tags in the text, and it did the trick. Thanks for this. But Jsoup seems to have this built in already, using `Whitelist` or so, and looks to have a clean way to strip the HTML tags, so I will proceed with that solution. Thanks for your response too. – Nithy May 28 '13 at 08:28
Is there a solution that does not skip images? – ItsTheBat Oct 01 '20 at 07:58
@Vi_Hari the HTML tags for images, `<img>`, could be kept with a regex negative lookahead on "img", but that does not turn them into a normal image, an RTF image or the like. Either, again, a conversion to HTML by prepending `""`, or converting them to RTF (if that is your intended format): https://stackoverflow.com/questions/1490734/programmatically-adding-images-to-rtf-document – Joop Eggen Oct 01 '20 at 12:57
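A minimal sketch of the negative-lookahead idea from the comment above, so that content consisting only of an image is not treated as empty. The class and method names are hypothetical, and the pattern is only a guess at what such a regex might look like.

```java
import java.util.regex.Pattern;

public class KeepImagesCheck {

    // Matches every tag except <img ...>: the lookahead (?!img\b) rejects a
    // match when "img" as a whole word follows '<'. (?i) also covers <IMG>.
    private static final Pattern NON_IMG_TAGS =
            Pattern.compile("(?i)<(?!img\\b)[^>]+>");

    /** Removes all tags but leaves <img> tags in place. */
    public static String stripAllButImages(String html) {
        return NON_IMG_TAGS.matcher(html).replaceAll("");
    }

    /** Empty means: no text and no image left after stripping other tags. */
    public static boolean isEmpty(String html) {
        return html == null || stripAllButImages(html).trim().isEmpty();
    }
}
```

The result still contains the raw `<img>` tags as text, so it is only useful for the emptiness check, not for display.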
