Read byte value string characters

Question

I am trying to implement a communication protocol in JavaScript (the language is mandatory). The communication goes over TCP/IP using WebTCP.

Every packet is divided this way: content size in big endian + content.

So for a packet of size 28, let's say {"type": "success", "id": 0}, the packet sent will be 0, 0, 0, 28, '{', '"', 't', etc.
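
To make the framing concrete, here is a minimal sketch of building one such packet as raw bytes. `framePacket` is a hypothetical helper name, not part of WebTCP, and the sketch assumes a runtime that provides `TextEncoder` and typed arrays.

```javascript
// Hypothetical helper: prefix a text payload with its byte length,
// written as a 4-byte big-endian unsigned integer.
function framePacket(text) {
    var payload = new TextEncoder().encode(text); // UTF-8 bytes of the content
    var packet = new Uint8Array(4 + payload.length);
    // DataView writes big-endian unless the littleEndian flag is passed
    new DataView(packet.buffer).setUint32(0, payload.length);
    packet.set(payload, 4);
    return packet;
}

var framed = framePacket('{"type": "success", "id": 0}');
console.log(Array.from(framed.slice(0, 5))); // [0, 0, 0, 28, 123], 123 being '{'
```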

I have no problem sending packets in pure JavaScript using this syntax.

The problem is, because of WebTCP, the packet I get on my end is always a string, and when the size of a packet is between 128 and 255 (I'm guessing, I just know that it is greater than 128), the size is read wrong. I think I know where the problem is:

Here is my function which extracts the data.

function extractData() {
    // return empty string if not enough bytes to read the size of next packet
    if (!buffer || buffer.length < 4) { return ""; }

    // read size
    var size = [];
    for (var idx = 0 ; idx < 4 ; ++idx) {
        size.push(buffer.charCodeAt(idx)); // pretty sure the problem comes from here.
    }

    // size from big endian to machine endian
    size = ntohl(size);

    // return empty string if the buffer does not have the complete packet yet
    if (buffer.length < 4 + size) { return ""; }

    // copy the packet content into ret
    var ret = buffer.substring(4, size + 4);

    // the buffer removes the packet returned
    buffer = buffer.substring(size + 4);

    // return the packet content
    return ret;
}

buffer is a global variable that is filled every time data is received. ntohl is a function I got from http://blog.couchbase.com/starting-membase-nodejs (without the i offset) which takes a 4-byte array and returns an integer.

So the line at fault would be size.push(buffer.charCodeAt(idx));. I'm guessing the charCodeAt function overflows when the character code is greater than an ASCII value (0-127). From printing on the server side (which works, I tried in Python and C++), the size sent is 130, and on the JavaScript side the size array contains [0, 0, 0, 65533] (or something like this, I don't remember the exact number). With a size of 30 I get [0, 0, 0, 30], so I know that this is supposed to work.
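
A note on the number itself: 65533 is 0xFFFD, the Unicode replacement character, which a UTF-8 decoder substitutes for bytes that are not valid UTF-8, and a lone byte 130 (0x82) is one of those. That suggests, though nothing in this thread confirms it, that the bytes are decoded as UTF-8 somewhere between the socket and `buffer`, in which case `charCodeAt` only reports what the decoder already produced. The sketch below reproduces the symptom with the standard `TextDecoder`, which stands in for whatever decoding WebTCP or its bridge actually performs (an assumption).

```javascript
var header = new Uint8Array([0, 0, 0, 130]);

// Collect the character codes of a string into an array.
function charCodes(str) {
    var codes = [];
    for (var i = 0; i < str.length; ++i) { codes.push(str.charCodeAt(i)); }
    return codes;
}

// Decoded as UTF-8: 130 (0x82) is a stray continuation byte, so the decoder
// replaces it with U+FFFD before charCodeAt ever sees it.
var asUtf8 = new TextDecoder("utf-8").decode(header);
console.log(charCodes(asUtf8)); // [0, 0, 0, 65533]

// Mapping each byte to one character (a "binary string") keeps the values.
var asBinary = String.fromCharCode.apply(null, header);
console.log(charCodes(asBinary)); // [0, 0, 0, 130]
```

If this is what happens, the original byte is lost at decoding time and cannot be recovered on the JavaScript side; the data would have to arrive as bytes or in a byte-preserving encoding.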

I have two questions:

How can I extract the raw integer value of a char in a string?
Is there an easy way to turn the first 4 bytes into something like a bytearray, using only JS and jQuery?
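
For the second question, one possible sketch: copy the first four character codes into a `Uint8Array` and read them with a `DataView`. This assumes the string reaches the page as a "binary string" in which every character code is between 0 and 255, which does not hold if the data was already decoded as UTF-8. `readSizeHeader` is a hypothetical name, and no jQuery is needed.

```javascript
// Hypothetical helper: read a 4-byte big-endian size from the start of a
// binary string. Returns null while fewer than 4 characters are available.
function readSizeHeader(binaryString) {
    if (binaryString.length < 4) { return null; }
    var bytes = new Uint8Array(4);
    for (var i = 0; i < 4; ++i) {
        var code = binaryString.charCodeAt(i);
        // A code above 255 means the string no longer holds raw bytes.
        if (code > 255) {
            throw new RangeError("not a raw byte at index " + i + ": " + code);
        }
        bytes[i] = code;
    }
    // getUint32 reads big-endian unless the littleEndian flag is passed
    return new DataView(bytes.buffer).getUint32(0);
}

console.log(readSizeHeader("\x00\x00\x00\x82rest")); // 130
```

The range check makes a corrupted header fail loudly instead of silently producing a wrong size.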

Thanks.

Thanks but `charCodeAt` is precisely what I don't want to use because it overflows when the value is > 127. — Telz, May 04 '15 at 16:13
See [MDN on charCodeAt](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt), it can handle values up to 65535; what matters is how exactly you transfer your char. It's possible that the output gets converted to pure ASCII at some other point before reaching charCodeAt. — Etheryte, May 04 '15 at 16:19
http://stackoverflow.com/questions/17191945/conversion-between-utf-8-arraybuffer-and-string — garryp, May 04 '15 at 16:48
[This](http://stackoverflow.com/questions/10070777/characters-with-ascii-128-are-not-correctly-read-in-javascript) seems to be the same problem, whereas in my case I don't care for "utf-8" I only want the uchar value. — Telz, May 04 '15 at 17:06

score 0 · Answer 1 · answered May 04 '15 at 16:41

Well, according to the definition of ntohl, it expects an array and an integer.

exports.ntohl = function(b, i) {
  return ((0xff & b[i + 0]) << 24) |
         ((0xff & b[i + 1]) << 16) |
         ((0xff & b[i + 2]) << 8) |
         ((0xff & b[i + 3]));
}

where b is an array and i is an integer.

You are only passing a single parameter in your code sample.

If I pass 0 as the offset, it returns the expected value; if I pass no offset, it returns 0.

Try your code with ntohl(size, 0);

Here is a fiddle demonstrating this: http://jsfiddle.net/3jxrr3k7/
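
The effect of the missing offset can also be checked in isolation. The sketch below copies the function from above as a plain function (dropping `exports.` so it runs outside a Node module). The last two lines go beyond the original post: they illustrate that the result is a signed 32-bit integer unless it is passed through `>>> 0`.

```javascript
// Same body as the function above, without the `exports.` prefix.
function ntohl(b, i) {
  return ((0xff & b[i + 0]) << 24) |
         ((0xff & b[i + 1]) << 16) |
         ((0xff & b[i + 2]) << 8) |
         ((0xff & b[i + 3]));
}

console.log(ntohl([0, 0, 0, 130], 0)); // 130

// Without the offset, i is undefined, every index is NaN, each lookup
// yields undefined, and 0xff & undefined is 0.
console.log(ntohl([0, 0, 0, 130]));    // 0

// << and | work on signed 32-bit integers, so a top byte of 0x80 or more
// gives a negative number; >>> 0 reinterprets the result as unsigned.
console.log(ntohl([255, 0, 0, 0], 0));       // -16777216
console.log(ntohl([255, 0, 0, 0], 0) >>> 0); // 4278190080
```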

answered May 04 '15 at 16:41

eddyrolo

I edited the function in order not to have the `i` offset (just `b[0]`, `b[1]` etc.). – Telz May 04 '15 at 16:49
What output are you getting on the size of the packet? – eddyrolo May 04 '15 at 16:52
After using `charCodeAt` I have an array like `[0, 0, 0, 30]` and after using ntohl I get 30. But when on server side I send something like `[0,0,0,130]` I read `[0,0,0,65533]` with `charCodeAt`. – Telz May 04 '15 at 16:55
There is a possibility that the data received is wrong because it goes through WebTCP though. – Telz May 04 '15 at 16:57
Also keep in mind the low-order byte is the last element of the array. – eddyrolo May 04 '15 at 17:43
Yeah that's what I'm trying to get into the array `size`, but buffer is a string, not an array, and I can't get the raw value of the byte. – Telz May 04 '15 at 21:44
