I'm struggling to transmit long encrypted strings over the network and get them to come out correctly on the server. For example, I have this encrypted string on the client:

wcWSERZCh8Xm1hpbNo1kSD1LvFmpuUr4wmq9hQUWeK0vYcLeFPGwFR/sBTES1A4rPV6eyp9nzEEU9uKkiFSTdP+SPOSqUf6evjf3WRHrXMRe81lIrHuRyk0iRwoNe5uIk+VlpR41kETmznXa4+gELmf53r7oayRkkffnIPDmpO+WbgE0VL3PQeOsXB01tWJyDiBIsz5WJiiEIm3ZoJW/sw==

As you can see, it has a few characters that will not transmit over the network without some URL encoding (+ and /, most notably). I'm not entirely sure if there could be other characters that could arise in other situations, so I want to make sure that my solution is 'universally' correct. I am using this line:

NSString *escapedString = [cipherString stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLHostAllowedCharacterSet]];

which I found in a highly reviewed answer.

However, I'm still having trouble decrypting this on the server side, so I printed out the results on the client immediately before sending, and I see this:

wcWSERZCh8Xm1hpbNo1kSD1LvFmpuUr4wmq9hQUWeK0vYcLeFPGwFR%2FsBTES1A4rPV6eyp9nzEEU9uKkiFSTdP+SPOSqUf6evjf3WRHrXMRe81lIrHuRyk0iRwoNe5uIk+VlpR41kETmznXa4+gELmf53r7oayRkkffnIPDmpO+WbgE0VL3PQeOsXB01tWJyDiBIsz5WJiiEIm3ZoJW%2Fsw==

Why are the '+' signs still there? Am I using the wrong allowed character set? Which character set should I use to guarantee that I correctly escape all problematic characters?

If it helps, here is the code that I am using to encrypt the plain text string. When it is done, I base64 encode the results before sending across the network:

- (NSData *)phpEncryptCleartext : (NSData *)cleartext
{
    NSData *cleartextPadded = [self phpPadData:cleartext];

    CCCryptorStatus ccStatus        = kCCSuccess;
    size_t          cryptBytes      = 0;    // Number of bytes moved to buffer.
    NSMutableData  *cipherTextData  = [NSMutableData dataWithLength:cleartextPadded.length];

    ccStatus = CCCrypt(kCCEncrypt,
                       kCCAlgorithmAES128,
                       0,
                       _sessionKey.bytes,
                       kCCKeySizeAES128,
                       _iv.bytes,
                       cleartextPadded.bytes,
                       cleartextPadded.length,
                       cipherTextData.mutableBytes,
                       cipherTextData.length,
                       &cryptBytes);

    if (ccStatus == kCCSuccess) {
        cipherTextData.length = cryptBytes;
    }
    else {
        NSLog(@"kEncryptionError code: %d", ccStatus); // Add error handling
        cipherTextData = nil;
    }

    return cipherTextData;
}

Thanks for any advice!

AndroidDev
  • I haven't checked the docs, but if it does not escape the `+` (which from my POV would be OK), you can do your own string replacement that changes each plus sign to its percent-escaped equivalent. – qwerty_so Apr 22 '15 at 18:45
  • Maybe you should try `URLFragmentAllowedCharacterSet`? – qwerty_so Apr 22 '15 at 18:51
  • `URLFragmentAllowedCharacterSet` created other problems. I can easily do my own string replacement, but I'm concerned about just putting out fires as they arise, since then I can't be totally confident that I have seen all of the possible problematic characters. I'm assuming that this has been dealt with many times before, and am hoping that there is a built in method that will handle it reliably. For now, though, I'll probably have to handle it as you suggest. Thanks! – AndroidDev Apr 22 '15 at 19:45
  • What do you mean by "characters that will not transmit over the network"? What protocol? There's nothing inherent in network communication that will prevent any characters from transmitting. Given that you've base64-encoded your data, that should be pretty safe in most contexts. That's the point of base64. – Ken Thomases Apr 22 '15 at 20:20
  • I tried diving into this but even getting the list of chars bound to those charsets seems to be difficult (aka impossible for me to find out). As I say: If it shall be good, do it yourself ;-) – qwerty_so Apr 22 '15 at 22:18
  • the content of a POST will transmit all data. There is not even a need to Base64 encode the data. But if you are posting as a form then the data will need to escape some characters. – zaph Apr 22 '15 at 22:32
  • You have what looks like Base64 encoded data, the characters in Base64 encoding are: A–Z, a–z, 0–9, /, + and = for padding. There may be linefeed characters depending on the Base64 encoding options. – zaph Apr 22 '15 at 22:41
  • @KenThomases - I'm using HTTP. The problem is that characters such as `+` get interpreted at the server end using their unescaped value, so a `+` gets interpreted as a space. – AndroidDev Apr 22 '15 at 22:59
  • Thanks, @Zaph. Knowing that helps a lot since I then only need to escape three characters, and the `=` seems to be received without a problem. – AndroidDev Apr 22 '15 at 23:00

1 Answer

Swift version here.

To escape characters, use stringByAddingPercentEncodingWithAllowedCharacters:

NSString *URLEscapedString =
[string stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
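
As the comments on this answer point out, `+` is itself a member of `URLQueryAllowedCharacterSet`, so this call leaves it untouched. A minimal sketch that illustrates the behaviour (the helper name `EscapeWithQuerySet` is hypothetical and exists only for this example):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): percent-encodes a string
// using the stock query character set.
static NSString *EscapeWithQuerySet(NSString *s) {
    return [s stringByAddingPercentEncodingWithAllowedCharacters:
               [NSCharacterSet URLQueryAllowedCharacterSet]];
}
```

For example, `EscapeWithQuerySet(@"a b+c/d=")` encodes only the space and yields `a%20b+c/d=`; the `+`, `/` and `=` pass through unchanged.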

The following are the useful character sets. Next to each name are the printable ASCII characters that are NOT in the set, that is, the characters that will be percent-encoded:

URLFragmentAllowedCharacterSet  "#%<>[\]^`{|}
URLHostAllowedCharacterSet      "#%/<>?@\^`{|}
URLPasswordAllowedCharacterSet  "#%/:<>?@[\]^`{|}
URLPathAllowedCharacterSet      "#%;<>?[\]^`{|}
URLQueryAllowedCharacterSet     "#%<>[\]^`{|}
URLUserAllowedCharacterSet      "#%/:<>?@[\]^`

Or create your own character set from just the characters that you need to escape:

NSCharacterSet *customCharacterset = [[NSCharacterSet characterSetWithCharactersInString:@"your characters"] invertedSet];

Creating a characterset combining all of the above:

NSCharacterSet *URLFullCharacterSet = [[NSCharacterSet characterSetWithCharactersInString:@" \"#%/:<>?@[\\]^`{|}"] invertedSet];

Creating a character set for Base64:

NSCharacterSet *URLBase64CharacterSet = [[NSCharacterSet characterSetWithCharactersInString:@"/+=\n"] invertedSet];
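
A minimal sketch of how such a set might be applied to a Base64 string like the one in the question. The helper name `PercentEscapeBase64` is hypothetical, not part of Foundation, and the sketch assumes the input contains only Base64 characters:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): percent-escapes the Base64
// characters that are special in URLs and form bodies ('/', '+', '=')
// as well as any linefeeds.
static NSString *PercentEscapeBase64(NSString *base64) {
    // Build a set of the characters to escape, then invert it so that
    // every other character is "allowed" and passes through unchanged.
    NSCharacterSet *allowed =
        [[NSCharacterSet characterSetWithCharactersInString:@"/+=\n"] invertedSet];
    return [base64 stringByAddingPercentEncodingWithAllowedCharacters:allowed];
}
```

For example, `PercentEscapeBase64(@"a+b/c==")` yields `a%2Bb%2Fc%3D%3D`, so a server that decodes `+` as a space no longer sees a bare `+`.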

Note: stringByAddingPercentEncodingWithAllowedCharacters will also percent-encode non-ASCII characters, using their UTF-8 byte sequences.
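
A small sketch of the non-ASCII case, using one of the stock sets (the helper name `EscapeWithPathSet` is hypothetical and chosen only for this illustration):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): percent-encodes a string
// using the stock path character set.
static NSString *EscapeWithPathSet(NSString *s) {
    return [s stringByAddingPercentEncodingWithAllowedCharacters:
               [NSCharacterSet URLPathAllowedCharacterSet]];
}
```

For example, `EscapeWithPathSet(@"caf\u00e9")` yields `caf%C3%A9`: the single character U+00E9 becomes the two percent-encoded bytes of its UTF-8 representation.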

Example to verify the characters in the set:

// Prints the printable ASCII characters that are NOT in the given set,
// i.e. the characters that would be percent-encoded.
void characterInSet(NSCharacterSet *set) {
    NSMutableString *characters = [NSMutableString new];
    NSCharacterSet *invertedSet = set.invertedSet;
    for (unichar c = 32; c < 127; c++) {
        if ([invertedSet characterIsMember:c]) {
            // Append the character directly rather than reading raw bytes from an int.
            [characters appendFormat:@"%C", c];
        }
    }
    printf("characters not in set: '%s'\n", [characters UTF8String]);
}
zaph
  • Excellent answer. Are we now supposed to encode each part of the URL differently, or is there a one-size-fits-all approach? Also, how did you determine exactly which characters each character set includes? The Apple docs don't elaborate and are, as always, vague. Thanks – RyanTCB Aug 22 '15 at 10:35
  • I added creating a `NSCharacterSet` containing all of the Apple encodings. I don't remember exactly how I determined the characters, probably empirically (trying the set of all ASCII characters). – zaph Aug 22 '15 at 11:45
  • I'd like to edit your post and append this, but I hate it when people edit my posts, so if you're OK with it, let me know in a comment. You can get the char set of what will be encoded by using invert: `func printSet(set: NSCharacterSet) { let iSet = set.invertedSet for i: UInt32 in 32..<127 { let us = UnicodeScalar(i) let c = Character(us) if iSet.longCharacterIsMember(i) { print(String(c), " will be encoded") } } }` – David H Mar 04 '16 at 22:24
  • @DavidH Added here and to [Swift version](http://stackoverflow.com/a/24552028/451475). – zaph Mar 08 '16 at 18:04
  • But the function at the end actually prints out the characters NOT in the set. And the list of characters you have for each of the character sets above are also the inverse of the actual set. – shim Jun 10 '16 at 20:15
  • Thanks, fixed the text. – zaph Jun 11 '16 at 00:57
  • Your `URLQueryAllowedCharacterSet` misses the `+` character. Check it out, it is in the set. – Mecki Aug 04 '16 at 13:16
  • Yes, `+` is in the `URLQueryAllowedCharacterSet`; the answer actually shows the characters that are not in the character sets, which is what is really interesting. I have updated the answer to make it clearer that the characters presented are not in the sets. – zaph Aug 04 '16 at 14:32
  • Still does not escape `+`, as it is an allowed character in all of those sets. So do I really have to create my own set? Seriously?? Hard to believe. – Torge May 03 '17 at 10:51