
I am constructing a data packet to be sent over NSStream to a server. I am trying to separate two pieces of data with a '§' (ascii code 167). This is the way the server is built, so I need to stay within those bounds...

unichar asciiChar = 167;  //yields @"§"
[self setSepString:[NSString stringWithCharacters:&asciiChar length:1]]; 

sendData=[NSString stringWithFormat:@"USER User%@Pass", sepString];

NSLog(sendData);

const uint8_t *rawString=(const uint8_t *)[sendData UTF8String];

[oStream write:rawString maxLength:[sendData length]];  

So the final outcome should look like this.. and it does when sendData is first constructed:

USER User§Pass 

however, when it is received on the server side, it looks like this:

//not a direct copy and paste. The 'mystery character' may not be exact
USER UserˤPas

...the separator string has become two bytes long, and the last letter is getting cropped from the command. I believe this to be caused by the UTF-8 conversion.

Can anyone shed some light on this for me?

Any help would be greatly appreciated!

Dutchie432
  • "'§' (ascii code 167)" There's no such thing as ascii code 167. ASCII, by definition, only defines 128 characters (i.e. only goes up to 127). Your character may be 167 in some encoding, such as Latin-1. – user102008 Apr 29 '11 at 22:08

2 Answers


The correct UTF-8 encoding for this character is the two-byte sequence 0xC2 0xA7, which is what you're getting. (Fileformat.info is invaluable for this stuff.) The character is part of the Latin-1 (ISO 8859-1) set, so you almost certainly want to use NSISOLatin1StringEncoding rather than NSUTF8StringEncoding in order to get the single byte 167 (0xA7). Look at NSString -dataUsingEncoding:.
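
To make the byte counts concrete, here is a minimal sketch in plain C of how code point 167 is encoded in UTF-8 versus Latin-1. The helper names (utf8_encode_small, latin1_encode, utf8_byte_count) are hypothetical and not part of any Apple API. It also suggests why the last letter goes missing: NSString's length counts UTF-16 code units (14 for "USER User§Pass"), while the UTF-8 buffer holds 15 bytes, so passing length as maxLength: stops one byte short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point below U+0800 as UTF-8.
   Returns the number of bytes written, or 0 if cp is outside that range. */
size_t utf8_encode_small(uint32_t cp, uint8_t out[2])
{
    if (cp < 0x80) {                 /* ASCII: one byte, value unchanged */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                /* U+0080..U+07FF: lead byte + continuation */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    return 0;
}

/* Hypothetical helper: Latin-1 stores U+0000..U+00FF as the identical byte.
   Returns 1 on success, 0 if cp has no Latin-1 representation. */
size_t latin1_encode(uint32_t cp, uint8_t out[1])
{
    if (cp > 0xFF) {
        return 0;
    }
    out[0] = (uint8_t)cp;
    return 1;
}

/* Total UTF-8 byte count for an array of code points (all assumed < U+0800). */
size_t utf8_byte_count(const uint32_t *cps, size_t n)
{
    uint8_t tmp[2];
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += utf8_encode_small(cps[i], tmp);
    }
    return total;
}
```

For code point 167, utf8_encode_small writes 0xC2 0xA7 while latin1_encode writes the single byte 0xA7, which is the one-byte separator the server expects.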

Rob Napier
  • but [NSStream write:] specifically asks for UTF... am I missing something? – Dutchie432 Jun 16 '09 at 18:12
  • NSStream does not have a write method; I assume you mean -[NSOutputStream write:maxLength:]. It accepts a const uint8_t * and a length. The const uint8_t * is not a UTF-8 string, it's an array of bytes. – Jared Oberhaus Jun 16 '09 at 20:36
  • Perfect - this worked: const uint8_t *rawString=(const uint8_t *)[sendData cStringUsingEncoding:NSISOLatin1StringEncoding]; – Dutchie432 Jun 16 '09 at 20:41
  • Glad it worked. Just be careful that your username and password are, and always will be, 100% Latin-1, and not anything else (Cyrillic, Chinese, Japanese, Korean, etc.). Also, be sure that your username and password never contain the Latin-1 character 167. – Jared Oberhaus Jun 17 '09 at 20:41
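
The two checks in that last comment can be sketched in plain C, assuming each field arrives as UTF-8 bytes. The function name utf8_field_to_latin1 is hypothetical. It converts a field to Latin-1 and refuses anything above U+00FF, anything malformed, and the separator character 167 itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a UTF-8 field to Latin-1 bytes.
   Returns the number of bytes written, or -1 if the field is malformed,
   contains a code point above U+00FF, contains the separator (167),
   or does not fit in the output buffer. */
long utf8_field_to_latin1(const uint8_t *in, size_t in_len,
                          uint8_t *out, size_t cap)
{
    size_t i = 0;
    size_t n = 0;

    while (i < in_len) {
        uint32_t cp;
        uint8_t b = in[i];

        if (b < 0x80) {
            /* ASCII maps straight through. */
            cp = b;
            i += 1;
        } else if ((b == 0xC2 || b == 0xC3) && i + 1 < in_len &&
                   (in[i + 1] & 0xC0) == 0x80) {
            /* Lead bytes 0xC2 and 0xC3 cover exactly U+0080..U+00FF. */
            cp = ((uint32_t)(b & 0x1F) << 6) | (uint32_t)(in[i + 1] & 0x3F);
            i += 2;
        } else {
            return -1;  /* above U+00FF, or not valid UTF-8 */
        }

        if (cp == 0xA7) {
            return -1;  /* a field must never contain the separator */
        }
        if (n >= cap) {
            return -1;  /* output buffer too small */
        }
        out[n++] = (uint8_t)cp;
    }
    return (long)n;
}
```

A caller would run this on the username and on the password separately, and only assemble the packet when both calls succeed.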

What you have, and what you want to transmit, is not really a UTF-8 string, and technically it is not US-ASCII either, because that is only 7 bits. You want to transmit an arbitrary array of bytes, laid out according to the protocol that you're working with. The two fields of the byte array, username and password, might themselves be UTF-8 strings, but with the single-byte 167 separator the packet as a whole cannot be a UTF-8 string.

Here are some options I see:

  • Construct the uint8_t* byte array using at least two different NSString objects plus the 167 code. This will be necessary if the username or password can possibly contain non-ASCII characters.
  • Use the NSString method getBytes:maxLength:usedLength:encoding:options:range:remainingRange: and set encoding to NSASCIIStringEncoding. If you do this you must validate elsewhere that your username and password are US-ASCII only.
  • Use the NSString method getCString:. However, that's been deprecated because you cannot specify the encoding you want; getCString:maxLength:encoding: is the variant that takes an explicit encoding.
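
The first option can be sketched in plain C as follows, assuming the username and password have already been turned into byte arrays by whichever encoding the server expects. The names build_packet, is_us_ascii and SEPARATOR_BYTE are hypothetical, and the "USER " prefix is taken from the question's format string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEPARATOR_BYTE 0xA7u  /* 167, the single-byte field separator */

/* Hypothetical helper: returns 1 if every byte is 7-bit US-ASCII, else 0. */
int is_us_ascii(const uint8_t *bytes, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] > 0x7F) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical helper: writes "USER " + user + 0xA7 + pass into out.
   Returns the packet length in bytes, or 0 if out is too small. */
size_t build_packet(uint8_t *out, size_t cap,
                    const uint8_t *user, size_t user_len,
                    const uint8_t *pass, size_t pass_len)
{
    static const char prefix[] = "USER ";
    const size_t prefix_len = sizeof prefix - 1;  /* drop the trailing NUL */
    const size_t total = prefix_len + user_len + 1 + pass_len;
    uint8_t *p = out;

    if (total > cap) {
        return 0;
    }
    memcpy(p, prefix, prefix_len);
    p += prefix_len;
    memcpy(p, user, user_len);
    p += user_len;
    *p++ = (uint8_t)SEPARATOR_BYTE;  /* raw byte, never passed through UTF-8 */
    memcpy(p, pass, pass_len);
    return total;
}
```

The value returned by build_packet is the byte count to hand to the stream's write call, so the length always matches the buffer regardless of how many characters the original strings contained.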
Jared Oberhaus