Is there a Java library that can HTML-encode only the content of a document, leaving the tags untouched?

For example, I have

<div>Tél</div>

and I only want

<div>T&eacute;l</div>

instead of

&lt;div&gt;T&eacute;l&lt;/div&gt;

I need this library to encode an entire HTML document. I have tried jsoup, but it has bugs when handling some objects.

Thanks

Gabriel Diaconescu
    Why do you want to convert characters into their HTML entities in the first place? If you're using UTF-8, that should never be necessary. – Pekka Mar 29 '11 at 17:24

1 Answer

It's never a good idea to parse HTML using regex; that's a recipe for disaster.

So first look at this Q&A for HTML parsing in Java: Java HTML Parsing

Once you are able to parse the HTML and get at the inner text, you can encode that text in one of these ways: Is there a JDK class to do HTML encoding (but not URL encoding)?
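If the goal is only to make the output safe for a non-UTF-8 charset, a simpler route may be enough. This is a different technique from parse-then-encode: it replaces every non-ASCII code point with a numeric character reference and leaves ASCII (including `<`, `>` and `&`) alone, so the markup survives. The class and method names below are made up for illustration, and it is only a sketch: it emits `&#233;` rather than `&eacute;` (browsers treat them the same), and it would also rewrite non-ASCII characters inside `<script>` or `<style>` blocks, where references are not decoded.

```java
// Hypothetical helper, not part of any library.
final class ContentEncoder {

    // Replaces each non-ASCII code point with a decimal character reference.
    // ASCII characters, and therefore the tags themselves, are copied as-is.
    static String encodeNonAscii(String html) {
        StringBuilder out = new StringBuilder(html.length() + 16);
        for (int i = 0; i < html.length(); ) {
            int cp = html.codePointAt(i);      // handles surrogate pairs
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is "é"; prints <div>T&#233;l</div>
        System.out.println(encodeNonAscii("<div>T\u00e9l</div>"));
    }
}
```

If you need named entities or must skip script and style content, you are back to a real parser plus an encoder applied to text nodes only.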

anubhava