Using readAsArrayBuffer() will read the source as a "pure" byte range, independent of what the data represents and of its byte order.
Using readAsText() without an encoding argument will take the bytes from the source two at a time and convert each pair into a single UTF-16 (or UCS-2) character, which produces a completely different result, as you noticed.
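To make the effect of pairing bytes concrete, here is a minimal sketch that uses TextDecoder instead of FileReader so it can run outside a browser. The sample digest string and the variable names are assumptions chosen for illustration; they are not taken from the question.

```javascript
// Hypothetical sample: an MD5 hex digest stored as plain ASCII text (32 bytes).
const digestText = "d41d8cd98f00b204e9800998ecf8427e";
const bytes = new TextEncoder().encode(digestText);

// One byte per character: the text comes back unchanged.
const asUtf8 = new TextDecoder("utf-8").decode(bytes);

// Two bytes per character: every pair of ASCII bytes is fused into one
// unrelated UTF-16 code unit, so the string is half as long and unreadable.
const asUtf16 = new TextDecoder("utf-16le").decode(bytes);

console.log(bytes.length);          // 32
console.log(asUtf8 === digestText); // true
console.log(asUtf16.length);        // 16
```

The raw bytes are identical in both cases; only the interpretation differs, which is why the ArrayBuffer and text results do not match.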
If you know the source is, for example, UTF-8 text, you can read it by passing the optional encoding argument: readAsText(blob[, encoding]) (see supported encoding types).
Any common single-byte encoding should suffice in that case, as MD5 signatures written as text are always within the ASCII range. The main issue, then, is that the source needs to be read as single bytes and not as double bytes, as happens with UTF-16/UCS-2.
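As a rough sketch of passing the encoding explicitly, the helper below wraps readAsText() in a Promise. The name readBlobAsText and the file input in the usage comment are hypothetical; FileReader itself is only available in browsers.

```javascript
// Hypothetical helper: read a Blob/File as text with an explicit encoding label.
function readBlobAsText(blob, encoding = "utf-8") {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);  // decoded string
    reader.onerror = () => reject(reader.error);
    reader.readAsText(blob, encoding);             // second argument is the encoding label
  });
}

// Usage in a browser, assuming a hypothetical <input type="file"> element `input`:
// readBlobAsText(input.files[0], "utf-8").then(text => console.log(text));
```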
A different problem could be byte order. In that case an alternative is to read the source as an ArrayBuffer and then use TextDecoder (see example answer) with the correct byte order, i.e. little-endian or big-endian (denoted as "le" and "be", e.g. "utf-16be", in the encoding types linked earlier). A BOM option (ignoreBOM) is also available with this approach.
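A minimal sketch of the ArrayBuffer plus TextDecoder route follows. The helper name decodeUtf16 and the hand-built sample buffers are assumptions for illustration; in practice the buffer would be the result of readAsArrayBuffer().

```javascript
// Hypothetical helper: decode a buffer as UTF-16 with an explicit byte order.
// `options` is passed straight to TextDecoder, e.g. { ignoreBOM: true }.
function decodeUtf16(buffer, bigEndian = false, options = {}) {
  const label = bigEndian ? "utf-16be" : "utf-16le";
  return new TextDecoder(label, options).decode(buffer);
}

// The text "ab" encoded in both byte orders.
const bigEndianBytes = new Uint8Array([0x00, 0x61, 0x00, 0x62]).buffer;
const littleEndianBytes = new Uint8Array([0x61, 0x00, 0x62, 0x00]).buffer;

console.log(decodeUtf16(bigEndianBytes, true));     // "ab"
console.log(decodeUtf16(littleEndianBytes, false)); // "ab"

// Decoding with the wrong byte order swaps the bytes of every code unit,
// so the result is a different string of the same length.
const wrongOrder = decodeUtf16(bigEndianBytes, false);
console.log(wrongOrder === "ab"); // false
```

Picking the label explicitly keeps the byte order under your control instead of relying on whatever default the reader applies.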