I am creating a blog engine that includes a <textarea> which takes the whole article as input.
I then send it via Ajax and store it in the Text type provided by the GAE datastore.
The problem: If a user copies the text from a Word document, I see various random characters on the screen when the article is embedded in the web page. I think this is because the Word file uses XML encoding while my HTML page uses UTF-8 encoding.
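To illustrate the symptom, here is a minimal sketch of how such characters can appear. It assumes the cause is a charset mismatch (UTF-8 bytes decoded as ISO-8859-1) somewhere between the browser and the datastore. That is only a guess at the cause, and the class and method names are made up for the illustration.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Hypothetical helper: encodes the text as UTF-8, then deliberately
    // decodes those bytes with the wrong charset (ISO-8859-1).
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Curly quotes like the ones Word inserts automatically.
        String pasted = "\u201CHello\u201D";
        // Each curly quote is three bytes in UTF-8, so it comes back
        // as three unrelated characters instead of one.
        System.out.println(misdecode(pasted));
    }
}
```

Plain ASCII text survives this round trip unchanged, which would fit the fact that only some characters from the Word document look wrong.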
The question: How do I change the encoding of the input text? Or how can I avoid the XML encoding? Or would changing the encoding of my web page solve this problem?
Points to note: I want this to be automated. I have read (via Google) that you should first paste the text into a simple text editor, which normalizes the encoding, and then copy it from there into the web page. But this option is not feasible for me.
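As a rough idea of what "automated" could mean here, the following sketch replaces the typographic characters Word commonly inserts with plain ASCII equivalents on the server side. It is an assumption-laden workaround, not a known fix: the class name, the method name, and the list of characters are my own choices, and it only helps if the text reaches the server with its characters intact.

```java
public class WordTextCleaner {

    // Hypothetical helper: maps common Word "smart" punctuation to ASCII.
    // The character list is illustrative, not exhaustive.
    static String normalize(String input) {
        return input
                .replace('\u2018', '\'')   // left single quote
                .replace('\u2019', '\'')   // right single quote / apostrophe
                .replace('\u201C', '"')    // left double quote
                .replace('\u201D', '"')    // right double quote
                .replace('\u2013', '-')    // en dash
                .replace('\u2014', '-')    // em dash
                .replace("\u2026", "...")  // horizontal ellipsis
                .replace('\u00A0', ' ');   // non-breaking space
    }

    public static void main(String[] args) {
        String pasted = "\u201CIt\u2019s done\u2026\u201D";
        System.out.println(normalize(pasted));
    }
}
```

This loses the typographic characters rather than preserving them, so it would be a fallback if the encoding itself cannot be fixed.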
Also, I have used Weebly before, and pasting text copied from a Word file worked there. Does anyone know how Weebly manages the encoding conflict?
Answers are expected in Java :)