I am parsing a binary file in Node.js and storing its fields as variables. The file is currently held in a Buffer. According to the requirements document, this part of the file is UTF-32:

41 00 00 00  55 00 00 00  54 00 00 00  4F 00 00 00 
31 00 00 00  45 00 00 00  30 00 00 00  33 00 00 00 
38 00 00 00  31 00 00 00  31 00 00 00  36 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00

var string_UserName = data.toString('utf-8', 96, 179);
console.log('User Name: ' + string_UserName);
//User Name: A   U   T   O   1   E   0   3   8   1   1   6

var string_UserName = data.toString('utf-16le', 96, 179);
console.log('User Name: ' + string_UserName);
//User Name: A U T O 1 E 0 3 8 1 1 6

var string_UserName = data.toString('utf-32le', 96, 179);
console.log('User Name: ' + string_UserName);
//buffer.js:387
      throw new TypeError('Unknown encoding: ' + encoding);
      ^
TypeError: Unknown encoding: utf-32le
    at Buffer.slowToString (buffer.js:387:17)
    at Buffer.toString (buffer.js:399:31)

According to the Node documentation for Buffer, there is no built-in encoding for converting UTF-32 data to a string.

Is there an npm module that extends Buffer to allow this conversion, or do I need to write my own Buffer.prototype.toString() override to handle it? If so, does anyone already have one they use?


Here is a hex dump of the first 256 bytes (offsets 0x00 to 0xFF) of the file.

AF 03 00 00  D0 00 00 00  16 81 03 1E  0A 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
41 00 00 00  55 00 00 00  54 00 00 00  4F 00 00 00 
31 00 00 00  45 00 00 00  30 00 00 00  33 00 00 00 
38 00 00 00  31 00 00 00  31 00 00 00  36 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  32 30 31 36  30 31 32 32  31 32 33 35 
34 32 30 30  30 30 30 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 
00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00

After using @vlad's answer

var Iconv = require('iconv').Iconv;
var convertUTF32 = new Iconv('UTF-32', 'UTF-8');
var string_UserName = convertUTF32.convert(data.slice(96, 179));
console.log('User Name: ' + string_UserName);

I am getting an error from ICONV:

C:\nodeCode\node_modules\iconv\lib\iconv.js:145
          throw errnoException('EINVAL', 'Incomplete character sequence.');
          ^

Error: Incomplete character sequence.
    at errnoException (C:\nodeCode\node_modules\iconv\lib\iconv.js:169:13)
    at Object.convert (C:\nodeCode\node_modules\iconv\lib\iconv.js:145:17)
    at Iconv.convert (C:\nodeCode\node_modules\iconv\lib\iconv.js:59:12)
    at C:\nodeCode\metaProc.js:49:37
    at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:404:3)
    at fs.js:312:11
    at nextTickCallbackWith0Args (node.js:456:9)
    at process._tickCallback (node.js:385:13)

Someone suggested that this was because data.slice(96, 179) is not a multiple of 4 bytes long, but I can't understand that, since 96 is the first byte and 179 is the last byte of a 4-byte group (176, 177, 178, 179). 180 would be the start of the next group, not the end.

Any help would be appreciated.


Finally figured out what I was doing wrong. @Vlad was mostly right, but I didn't have the //TRANSLIT//IGNORE in there, and I also had to shift the end index by 1, since buffer.slice takes a half-open interval [96, 180), not an inclusive one [96, 179] as I had thought.

What I really needed was:

var Iconv = require('iconv').Iconv;
var iconv32 = new Iconv('UTF-32LE', 'UTF-8//TRANSLIT//IGNORE');
var string4_UserName = iconv32.convert(data.slice(96,180)).toString('utf-8');
console.log('User Name: ' + string4_UserName);
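
The half-open behaviour of slice described above can be checked without the real file. This is a minimal sketch that uses a zero-filled 256-byte Buffer as a hypothetical stand-in for data; only the lengths matter here, not the contents.

```javascript
// Hypothetical stand-in for the file contents (zero-filled, not the real file).
var data = Buffer.alloc(256);

// slice(start, end) excludes the byte at index `end`.
var tooShort = data.slice(96, 179); // bytes 96..178
var correct = data.slice(96, 180);  // bytes 96..179

console.log(tooShort.length, tooShort.length % 4); // 83 3
console.log(correct.length, correct.length % 4);   // 84 0
```

An 83-byte slice ends partway through a 4-byte UTF-32 unit, which is consistent with the "Incomplete character sequence" error.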
shaun

1 Answer

Yes, there is an iconv npm module often used for buffer decoding.

According to its README, you can try:

var Iconv  = require('iconv').Iconv;
var iconv = new Iconv('UTF-32LE', 'UTF-8');
var string_UserName = iconv.convert(data.slice(96, 179)).toString('utf8');
console.log('User Name: ' + string_UserName);

Let me know if it works for you.

Vlad Holubiev
  • When I try to npm install iconv, it tells me I can't find Python. > node-gyp rebuild C:\nodeCode\node_modules\iconv>if not defined npm_config_node_gyp (node "C:\Program Files\nodejs\node_modules\npm\bin\node-gyp-bin\\..\..\node_modules\node-gyp\bin\node-gyp.js" rebuild ) else (node rebuild ) gyp ERR! configure error gyp ERR! stack Error: Can't find Python executable "python", you can set the PYTHON env variable. I was trying to find something that could be done without having to bring in a compiler. – shaun Mar 29 '16 at 21:26
  • @shaun take a look at http://stackoverflow.com/a/21366601/2727317 I guess you're using Windows – Vlad Holubiev Mar 29 '16 at 21:30
  • After modifying my copy of VS 2015 to include C++, I finally got npm install iconv to complete. However, when I use `Iconv = require('iconv').Iconv; var convertUTF32 = new Iconv('UTF-32LE', 'UTF-8'); var string_UserName = convertUTF32.convert(data.slice(96, 179)); console.log('User Name: ' + string_UserName);` all I get is an error: `Error: Incomplete character sequence.` I get the same error when I change it to `var string_UserName = convertUTF32.convert(data.slice(96, 179)).toString('utf8');` – shaun Mar 30 '16 at 14:36
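
For anyone who wants to avoid the native build step mentioned in the comments, the decoding can also be sketched in plain JavaScript. This is not part of iconv or the Node.js Buffer API: decodeUtf32le is a hypothetical helper name, and it assumes the field is little-endian, has no byte order mark, and is padded with NUL code points. The sample buffer is rebuilt from the bytes shown in the question's hex dump.

```javascript
// Decode a UTF-32LE byte range [start, end) of a Buffer into a string.
// decodeUtf32le is a hypothetical helper, not a Node.js or iconv API.
function decodeUtf32le(buf, start, end) {
  if ((end - start) % 4 !== 0) {
    throw new RangeError('UTF-32 byte range must be a multiple of 4 bytes');
  }
  var out = '';
  for (var i = start; i < end; i += 4) {
    var codePoint = buf.readUInt32LE(i);
    if (codePoint === 0) break; // assumption: the field is NUL-padded
    // String.fromCodePoint throws a RangeError for values above 0x10FFFF.
    out += String.fromCodePoint(codePoint);
  }
  return out;
}

// Rebuild the 84-byte user name field from the question's hex dump.
var sample = Buffer.alloc(84);
'AUTO1E038116'.split('').forEach(function (ch, i) {
  sample.writeUInt32LE(ch.charCodeAt(0), i * 4);
});

console.log('User Name: ' + decodeUtf32le(sample, 0, sample.length));
// User Name: AUTO1E038116
```

Against the real file, the equivalent call would be decodeUtf32le(data, 96, 180), using the same half-open end index as data.slice(96, 180).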