Is there an easy way to detect the Unicode encoding (UTF-8 or UTF-16) of a text file?

Mike
Richi RM
  • Check this out: https://stackoverflow.com/questions/49546403/php-checking-if-string-is-utf-8-or-utf-16le – Sebastian Thadewald Nov 20 '21 at 18:07
  • If it is UTF-16, it should start with a BOM, and those BOM bytes are invalid in UTF-8. For UTF-16BE or UTF-16LE, Unicode says the encoding must be specified out of band. I recommend not guessing, but checking out-of-band data (headers or similar) instead. On specially crafted data (such as an automated attack), you may guess wrongly, and that could create a security vulnerability. – Giacomo Catenazzi Nov 22 '21 at 08:21
  • Does this answer your question? [Detect charset of string in PHP (UTF-8 or Windows-1256)](https://stackoverflow.com/questions/15188509/detect-charset-of-string-in-php-utf-8-or-windows-1256) – Jakub Nov 22 '21 at 18:51
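
The BOM check described in the comments can be sketched roughly as follows. This is a minimal Python sketch, not a definitive detector: the function name `guess_unicode_encoding` and its return labels are made up for illustration, and the fallback (strict UTF-8 decoding plus a NUL-byte check) is a heuristic assumption that can still be fooled by crafted input, so out-of-band information should take priority when it exists.

```python
import codecs


def guess_unicode_encoding(data: bytes) -> str:
    """Guess the encoding of raw bytes (hypothetical helper, heuristic only).

    Returns "utf-8-sig", "utf-16-le", "utf-16-be", "utf-8", or "unknown".
    """
    # 1. A leading BOM is the strongest in-band hint.
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE (also the start of a UTF-32LE BOM)
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"

    # 2. No BOM. NUL bytes are unusual in UTF-8 text but common in
    #    BOM-less UTF-16, so refuse to guess (assumption, not a rule).
    if b"\x00" in data:
        return "unknown"

    # 3. Strict decoding rejects byte sequences that are not valid UTF-8.
    try:
        data.decode("utf-8", errors="strict")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"
```

To use it on a file, read the whole file in binary mode (`open(path, "rb").read()`) and pass the bytes in. Checking only a prefix can cut a multi-byte UTF-8 sequence in half and produce a false "unknown".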

0 Answers