The issue here is that the BOM is a feature of 'UTF-16', not of 'UTF-16LE'.
Per http://unicode.org/faq/utf_bom.html#gen7:
The BE form uses big-endian byte serialization (most significant byte first), the LE form uses little-endian byte serialization (least significant byte first) and the unmarked form uses big-endian byte serialization by default, but may include a byte order mark at the beginning to indicate the actual byte serialization used.
Note that the option to include a byte order mark applies only to "the unmarked form", meaning 'UTF-16'.
So when you tell iconv that the source encoding is 'UTF-16LE' and the input starts with FF FE, iconv doesn't interpret the FF FE as a redundant indication of the byte order; rather, it interprets it as the character U+FEFF ZERO WIDTH NO-BREAK SPACE, and tries to copy that character to the output.
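For example, here is a minimal way to see that behavior (a sketch assuming GNU iconv, xxd, and a printf that understands \x escapes, such as bash's builtin; the file name bom_test.dat is just for illustration):

# Write a BOM (FF FE) followed by "Hi" encoded as UTF-16LE.
printf '\xff\xfeH\x00i\x00' > bom_test.dat

# With the source declared as UTF-16LE, the FF FE is treated as the
# character U+FEFF and re-encoded into the UTF-8 output as EF BB BF.
iconv -f UTF-16LE -t UTF-8 bom_test.dat | xxd
# the hex dump should start with: ef bb bf 48 69 ("Hi" preceded by the BOM bytes)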
You can fix that by telling iconv that the source encoding is 'UTF-16'; then, when it sees that the input starts with FF FE, it will interpret those bytes as a byte order mark, remove them, and interpret the rest of the input as little-endian.
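Running the same conversion with 'UTF-16' on the sample file above (same assumptions; the exact BOM handling can differ slightly between iconv implementations):

# With the source declared as UTF-16, FF FE is consumed as a byte
# order mark and only the "Hi" is converted.
iconv -f UTF-16 -t UTF-8 bom_test.dat | xxd
# the hex dump should show just: 48 69 ("Hi", with no EF BB BF prefix)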
So, change this:
iconv -f UTF-16LE -t UTF-8 myfile.dat -o myfile.dat_test
to this:
iconv -f UTF-16 -t US-ASCII myfile.dat -o myfile.dat_test
(Note: I've also changed the 'UTF-8' to 'US-ASCII', so that if there are any non-ASCII characters you'll get an explicit error instead of bad output.)
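To see why that helps, the US-ASCII target turns a misread BOM into a hard failure rather than stray bytes in the output (same sample file as above; the exact error message depends on your iconv):

# The misinterpreted BOM (U+FEFF) is not representable in ASCII, so
# iconv stops with a conversion error and a nonzero exit status.
iconv -f UTF-16LE -t US-ASCII bom_test.dat
# With -f UTF-16 instead, the same command simply prints "Hi".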