I am working with a binary file that I am trying to parse and store as variables using Node.js. I currently have the file in a Buffer. According to the requirements document, this part of the file is UTF-32:
41 00 00 00 55 00 00 00 54 00 00 00 4F 00 00 00
31 00 00 00 45 00 00 00 30 00 00 00 33 00 00 00
38 00 00 00 31 00 00 00 31 00 00 00 36 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
var string_UserName = data.toString('utf-8', 96, 179);
console.log('User Name: ' + string_UserName);
//User Name: A U T O 1 E 0 3 8 1 1 6
var string_UserName = data.toString('utf-16le', 96, 179);
console.log('User Name: ' + string_UserName);
//User Name: A U T O 1 E 0 3 8 1 1 6
var string_UserName = data.toString('utf-32le', 96, 179);
console.log('User Name: ' + string_UserName);
//buffer.js:387
throw new TypeError('Unknown encoding: ' + encoding);
^
TypeError: Unknown encoding: utf-32le
at Buffer.slowToString (buffer.js:387:17)
at Buffer.toString (buffer.js:399:31)
According to the Node documentation for Buffer, there is nothing built in to convert UTF-32 to a string.
Is there an npm module that can extend Buffer to allow this conversion, or do I need to write my own Buffer.prototype.toString() extension to make it possible? If so, does anyone already have one they use?
Here is a hex dump of the first 256 bytes (0x00 to 0xFF) of the file:
AF 03 00 00 D0 00 00 00 16 81 03 1E 0A 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
41 00 00 00 55 00 00 00 54 00 00 00 4F 00 00 00
31 00 00 00 45 00 00 00 30 00 00 00 33 00 00 00
38 00 00 00 31 00 00 00 31 00 00 00 36 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 32 30 31 36 30 31 32 32 31 32 33 35
34 32 30 30 30 30 30 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
After using @vlad's answer:
var Iconv = require('iconv').Iconv;
var convertUTF32 = new Iconv('UTF-32', 'UTF-8');
var string_UserName = convertUTF32.convert(data.slice(96, 179));
console.log('User Name: ' + string_UserName);
I am getting an error from iconv:
C:\nodeCode\node_modules\iconv\lib\iconv.js:145
throw errnoException('EINVAL', 'Incomplete character sequence.');
^
Error: Incomplete character sequence.
at errnoException (C:\nodeCode\node_modules\iconv\lib\iconv.js:169:13)
at Object.convert (C:\nodeCode\node_modules\iconv\lib\iconv.js:145:17)
at Iconv.convert (C:\nodeCode\node_modules\iconv\lib\iconv.js:59:12)
at C:\nodeCode\metaProc.js:49:37
at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:404:3)
at fs.js:312:11
at nextTickCallbackWith0Args (node.js:456:9)
at process._tickCallback (node.js:385:13)
Someone suggested that this was because the length of data.slice(96, 179) is not a multiple of 4, but I don't understand that: 96 is the first byte, and 179 is the last byte of a 4-byte group (176, 177, 178, 179). 180 would be the start of the next group, not the end of this one.
Any help would be appreciated.
Finally figured out what I was doing wrong. @Vlad was mostly right, but I didn't have //TRANSLIT//IGNORE in the target encoding, and I also had to move the end index up by 1, since buffer.slice takes a half-open interval [96, 180), not the inclusive [96, 179] I had assumed.
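The interval arithmetic can be checked with a quick sketch. The zero-filled buffer is only a stand-in for the real file contents, and Buffer.alloc assumes a Node version that provides it:

```javascript
// Stand-in for the file contents: 256 zero bytes (the real data comes from the file).
var data = Buffer.alloc(256);

// buf.slice(start, end) includes start but excludes end.
var tooShort = data.slice(96, 179);   // bytes 96..178
var wholeField = data.slice(96, 180); // bytes 96..179

console.log(tooShort.length, tooShort.length % 4);     // 83 3
console.log(wholeField.length, wholeField.length % 4); // 84 0
```

With an end index of 179 the slice is 83 bytes long, so the last 4-byte character is cut short, which matches the "Incomplete character sequence" error.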
What I really needed was:
var Iconv = require('iconv').Iconv;
var iconv32 = new Iconv('UTF-32LE', 'UTF-8//TRANSLIT//IGNORE');
var string4_UserName = iconv32.convert(data.slice(96,180)).toString('utf-8');
console.log('User Name: ' + string4_UserName);
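For anyone who would rather not pull in iconv, the same field can probably be decoded by hand with buf.readUInt32LE and String.fromCodePoint. This is only a rough sketch: decodeUtf32le is a made-up helper name, it assumes the field is little-endian and NUL-padded (as the dump suggests), and it assumes a Node version that has String.fromCodePoint and Buffer.from:

```javascript
// Hypothetical helper (not part of Node or iconv): decode a UTF-32LE field by hand.
function decodeUtf32le(buf, start, end) {
  var out = '';
  // Walk the half-open range [start, end) four bytes at a time.
  for (var i = start; i + 4 <= end; i += 4) {
    var codePoint = buf.readUInt32LE(i);
    if (codePoint === 0) break; // assumption: the field is padded with NUL characters
    // Throws a RangeError if the value is not a valid code point (> 0x10FFFF).
    out += String.fromCodePoint(codePoint);
  }
  return out;
}

// First three characters of the dump above, followed by one NUL of padding.
var sample = Buffer.from([0x41, 0, 0, 0, 0x55, 0, 0, 0, 0x54, 0, 0, 0, 0, 0, 0, 0]);
console.log(decodeUtf32le(sample, 0, sample.length)); // AUT
```

On the real file the call would be decodeUtf32le(data, 96, 180). Stopping at the first NUL also drops the padding.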