I am receiving annotated JSON from the backend, which I need to display in the UI. The JSON contains strings that are tagged by position and length within the content. The content may contain characters such as \t and \n, extra whitespace, HTML entities, Unicode escapes, and so on. When I try to display it in HTML, this information is lost: HTML entities are converted to their respective characters, runs of whitespace are collapsed to a single space, and Unicode escapes are converted to the corresponding characters.
I want to display the content as is, because I need to highlight the annotations and I also allow the user to tag things. If they tag text in the displayed HTML, the position and length will differ from those in the original JSON.
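One way to keep positions stable, sketched here under assumptions, is to do all slicing on the original string and convert each slice for display only afterwards. The `Annotation` shape (`position`, `length`) and the `splitByAnnotations` name are hypothetical, since the annotation format is not shown here. Annotations are assumed not to overlap, and offsets are assumed to count JavaScript string units, which may or may not match how the backend counts.

```typescript
// Hypothetical annotation shape: only "position and length" are known,
// so these field names are assumptions.
interface Annotation {
  position: number;
  length: number;
}

interface Segment {
  text: string;         // slice of the ORIGINAL content
  start: number;        // offset of the slice in the original content
  highlighted: boolean;
}

// Split content into plain and highlighted segments.
// Overlapping, empty, or out-of-range annotations are skipped in this sketch.
function splitByAnnotations(content: string, annotations: Annotation[]): Segment[] {
  const sorted = [...annotations].sort((a, b) => a.position - b.position);
  const segments: Segment[] = [];
  let cursor = 0;
  for (const a of sorted) {
    if (a.length <= 0 || a.position < cursor || a.position >= content.length) {
      continue;
    }
    const end = Math.min(a.position + a.length, content.length);
    if (a.position > cursor) {
      segments.push({ text: content.slice(cursor, a.position), start: cursor, highlighted: false });
    }
    segments.push({ text: content.slice(a.position, end), start: a.position, highlighted: true });
    cursor = end;
  }
  if (cursor < content.length) {
    segments.push({ text: content.slice(cursor), start: cursor, highlighted: false });
  }
  return segments;
}
```

Because every segment carries its `start` offset, that value can be kept alongside the rendered element, so that a user's selection inside a segment can be translated back to an offset in the original string.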
Example:
JSON:
{
"content": " \tHi there   how are you?"
}
This is displayed as "Hi there", so if I want to highlight 'how', which is tagged at position 17, I would find it at position 10 or 11 in the UI. Also, if a user wants to tag 'are', it would be tagged at 14, while the server would expect it to be tagged at 21.
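The drift can be reproduced with a small sketch. `collapseLikeHtml` is a hypothetical helper that only approximates how a browser collapses whitespace in normal flow, and the sample string is the decoded form of the content as it appears above, so the exact offsets it yields depend on how many spaces the real string contains and need not equal the figures quoted.

```typescript
// Rough approximation of HTML whitespace collapsing: runs of spaces, tabs
// and line breaks become one space, and a leading or trailing space of the
// block is dropped. Real browser behavior has more cases than this.
function collapseLikeHtml(text: string): string {
  return text.replace(/[ \t\r\n\f]+/g, " ").replace(/^ | $/g, "");
}

// Decoded form of the sample "content" value (\t is a real tab here).
const original = " \tHi there   how are you?";
const displayed = collapseLikeHtml(original);

// The same word sits at different offsets in the two strings.
const originalIndex = original.indexOf("how");
const displayedIndex = displayed.indexOf("how");
console.log({ displayed, originalIndex, displayedIndex });
```

Any offset read from the displayed text is therefore shifted by however many characters were collapsed before it, and that shift varies along the string.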
EDIT:
This is what I have so far:
1) All HTML entities are escaped, for example:
&gt; --> &amp;gt;
so that they are displayed as &gt; in the rendered HTML and not as >
2) \t, \r and \n are escaped, for example:
\t --> \\t
so that it is displayed as \t
3) I can also recognize Unicode escapes and convert them:
\u --> \\u
so that they are displayed as is
But there are other issues, such as extra whitespace, foreign characters, and patterns like \x. I don't think I have a comprehensive list of everything, and sooner or later this approach might break.
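A possible way around maintaining such a list, offered only as a sketch, is to invert the rule: keep printable ASCII and write everything else as an escape. The names `toVisibleText` and `escapeHtml` are hypothetical, and the input is assumed to be the already decoded string. The output is not guaranteed to match the backend's raw JSON text character for character (the backend might, for instance, write a non-ASCII character unescaped), so offsets into the raw text would still need checking.

```typescript
// Make every character of a decoded string visible as plain text.
// Printable ASCII is kept; anything else becomes an escape sequence,
// so no list of special characters has to be kept complete.
function toVisibleText(decoded: string): string {
  const named = new Map<string, string>([
    ["\t", "\\t"],
    ["\n", "\\n"],
    ["\r", "\\r"],
    ["\\", "\\\\"], // keeps a literal backslash distinguishable from an escape
  ]);
  let out = "";
  for (let i = 0; i < decoded.length; i++) {
    const ch = decoded.charAt(i);
    const code = decoded.charCodeAt(i);
    const escaped = named.get(ch);
    if (escaped !== undefined) {
      out += escaped;
    } else if (code >= 0x20 && code <= 0x7e) {
      out += ch; // printable ASCII, including plain spaces
    } else {
      // Everything else is written as \uXXXX (one escape per UTF-16 unit).
      out += "\\u" + ("0000" + code.toString(16)).slice(-4);
    }
  }
  return out;
}

// Escape the characters that are significant in HTML text content,
// so that "&gt;" in the data is shown as "&gt;" and not as ">".
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}
```

Runs of plain spaces are left untouched here, so the browser would still collapse them unless the containing element is styled with the CSS rule `white-space: pre-wrap` (or `pre`).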