I'm dealing with text in which UTF-16 code units appear encoded as UTF-8, and I can't work out how to translate from one to the other in R.

For example, take this code point (https://codepoints.net/U+D83D): its UTF-8 representation is the text string "ED A0 BD", and I want to convert that to the text string "D8 3D".

How can I achieve this?

More info on what I want to achieve: stackoverflow.com/questions/35670238/emoji-in-r-utf-8-encoding
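
For reference, here is a minimal sketch of the conversion described above, using base R only. U+D83D is a high surrogate, so "ED A0 BD" is not valid strict UTF-8; it is the three-byte UTF-8 bit pattern (1110xxxx 10xxxxxx 10xxxxxx) applied to a single UTF-16 code unit. The function name `utf8_hex_to_utf16_hex` is hypothetical, and the sketch assumes the input is exactly one three-byte sequence written as space-separated hex.

```r
# Hypothetical helper (not from any package): takes one 3-byte sequence as
# hex text, e.g. "ED A0 BD", and returns the 16-bit code unit it encodes as
# hex text, e.g. "D8 3D". Assumes exactly three space-separated hex bytes.
utf8_hex_to_utf16_hex <- function(hex_string) {
  # Split on whitespace and parse each token as a base-16 integer
  bytes <- strtoi(strsplit(trimws(hex_string), "[[:space:]]+")[[1]], base = 16L)
  stopifnot(length(bytes) == 3L, !anyNA(bytes))

  # Check the bit pattern: lead byte 1110xxxx, continuation bytes 10xxxxxx
  stopifnot(bytes[1] %/% 16L == 14L, all(bytes[2:3] %/% 64L == 2L))

  # Keep the payload bits (4 + 6 + 6) and reassemble the 16-bit value
  unit <- (bytes[1] %% 16L) * 4096L +
          (bytes[2] %% 64L) * 64L +
          (bytes[3] %% 64L)

  # Print as two hex bytes, high byte first
  sprintf("%02X %02X", unit %/% 256L, unit %% 256L)
}

utf8_hex_to_utf16_hex("ED A0 BD")  # "D8 3D"
```

The sketch works on the hex text directly rather than on an R character vector, because R's own encoding functions may reject a lone surrogate as invalid UTF-8.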

Ed.
  • Are you dealing with a text file that you are attempting to read into a dataframe? – IRTFM Jan 27 '17 at 20:34
  • It is indeed a text string that contains the UTF-8 representation of the character. So ED A0 BD is a text string, that comes from a file, yes. – Ed. Jan 28 '17 at 14:31
  • I asked if it were a text file and you are telling me that it is a "string", which is not helpful. If it is an R character vector, that is one thing. If it is a file to be read, that is a different matter. Saying it is "UTF-16 encoded as UTF-8" makes no sense to me. You need to post a link to a file that has this, and it probably needs to include the opening bytes so that its encoding can be ascertained. – IRTFM Jan 28 '17 at 16:59
  • I think it does not matter if it comes from a file. The hex code is in a text string, and I want to convert that "ED A0 BD" which is hex UTF-8 for the codepoint "D8 3D" in UTF-16. Or at least, this is what I understand from here: http://stackoverflow.com/questions/35670238/emoji-in-r-utf-8-encoding – Ed. Jan 29 '17 at 17:31

0 Answers