74

I would like to get the percent-encoded string for these specific characters. How can I do that in Objective-C?

Reserved characters after percent-encoding
!   *   '   (   )   ;   :   @   &   =   +   $   ,   /   ?   #   [   ]
%21 %2A %27 %28 %29 %3B %3A %40 %26 %3D %2B %24 %2C %2F %3F %23 %5B %5D

Percent-encoding wiki

Please test with this string and see if it works:

myURL = @"someurl/somecontent"

I would like the string to look like:

myEncodedURL = @"someurl%2Fsomecontent"

I already tried stringByAddingPercentEscapesUsingEncoding: with NSASCIIStringEncoding, but it does not work: the result is still the same as the original string. Please advise.

Hoang Pham

8 Answers

145

I've found that both stringByAddingPercentEscapesUsingEncoding: and CFURLCreateStringByAddingPercentEscapes() are inadequate. The NSString method misses quite a few characters, and the CF function only lets you say which (specific) characters you want to escape. The proper specification is to escape all characters except a small set.

To fix this, I created an NSString category method to properly encode a string. It will percent-encode everything EXCEPT [a-zA-Z0-9.-_~] and will also encode spaces as + (according to this specification). It will also properly handle encoding Unicode characters.

- (NSString *) URLEncodedString_ch {
    NSMutableString * output = [NSMutableString string];
    const unsigned char * source = (const unsigned char *)[self UTF8String];
    size_t sourceLen = strlen((const char *)source); // strlen returns size_t, so avoid narrowing to int
    for (size_t i = 0; i < sourceLen; ++i) {
        const unsigned char thisChar = source[i];
        if (thisChar == ' '){
            [output appendString:@"+"];
        } else if (thisChar == '.' || thisChar == '-' || thisChar == '_' || thisChar == '~' || 
                   (thisChar >= 'a' && thisChar <= 'z') ||
                   (thisChar >= 'A' && thisChar <= 'Z') ||
                   (thisChar >= '0' && thisChar <= '9')) {
            [output appendFormat:@"%c", thisChar];
        } else {
            [output appendFormat:@"%%%02X", thisChar];
        }
    }
    return output;
}
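
For callers that need %20 rather than + for spaces, the same byte-by-byte loop can be sketched as a standalone function. This is only an illustration under stated assumptions: the name PercentEncodeUnreserved is hypothetical (it is not part of Foundation or of the category above), and it walks an NSData copy of the UTF-8 bytes instead of a C string.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper, not a Foundation API: percent-encodes every UTF-8 byte
// except the RFC 3986 unreserved set [A-Za-z0-9-._~]. Spaces become %20.
static NSString *PercentEncodeUnreserved(NSString *input) {
    NSData *utf8 = [input dataUsingEncoding:NSUTF8StringEncoding];
    const unsigned char *bytes = utf8.bytes;
    NSMutableString *output = [NSMutableString stringWithCapacity:utf8.length];
    for (NSUInteger i = 0; i < utf8.length; i++) {
        unsigned char c = bytes[i];
        BOOL unreserved = (c >= 'a' && c <= 'z') ||
                          (c >= 'A' && c <= 'Z') ||
                          (c >= '0' && c <= '9') ||
                          c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            [output appendFormat:@"%c", c];     // copy the byte as-is
        } else {
            [output appendFormat:@"%%%02X", c]; // two uppercase hex digits
        }
    }
    return output;
}
```

Called with the string from the question, PercentEncodeUnreserved(@"someurl/somecontent") returns someurl%2Fsomecontent.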
Community
Dave DeLong
  • 7
    Note that encoding spaces as + instead of %20 is according to x-www-form-urlencoded, not to oauth-1,3.6 (the spec link you posted). – Jano Aug 02 '11 at 12:28
  • 6
    @mihir of course. This is just a conversion method. If you don't want an entire string encoded, don't pass in the entire string... – Dave DeLong Sep 03 '11 at 15:27
  • 3
    Yeah, I got your point. I had actually implemented everything and then realized this issue, so now I need to encode every parameter individually ... very cumbersome ... :( – Mihir Mehta Sep 07 '11 at 06:36
  • As @Jano points out, + encoding spaces is not ideal here and is reserved for the query portion of the URL. See http://stackoverflow.com/questions/2678551/when-to-encode-space-to-plus-and-when-to-20 – levigroker Aug 17 '12 at 21:48
  • @zaitsman the license for all content on this site is in the footer: cc-wiki with attribution required. – Dave DeLong May 19 '13 at 21:12
  • I'm getting mixed messages from stack overflow, is `CFURLCreateStringByAddingPercentEscapes` valid or not? I've heard that `NSString *encodedString = (__bridge_transfer NSString *)CFURLCreateStringByAddingPercentEscapes(NULL, (__bridge CFStringRef)originalString, NULL, (CFStringRef)@":!*();@/&?#[]+$,='%’\"", kCFStringEncodingUTF8);` does indeed work because you specify the characters to encode. – Awesome-o Feb 25 '14 at 22:35
  • @Dave DeLong: Why are you casting UTF8 representation of the `self` to the `unsigned` type? Why is `const char *` not enough (or not good)? – Aleksa Apr 20 '14 at 18:03
  • ```const unsigned char * source = (const unsigned char *)[self UTF8String];``` It is not clear. "self" is the source string you wanna encode – Yuchao Zhou Jun 01 '16 at 15:09
  • @mihirmehta Of course it includes "http://" in the escaping. That's exactly what it should do. This is meant for fully escaping the input so it can be passed in a url, for example passing a url in a query parameter. – devios1 May 18 '18 at 20:09
  • Just realized that comment is almost 7 years old. He's probably not going to get the message, and has (hopefully) learned long ago the folly of his previous ways. – devios1 May 18 '18 at 20:15
107

The iOS 7 SDK now has a better alternative to stringByAddingPercentEscapesUsingEncoding: that does let you specify that you want all characters escaped except certain allowed ones. It works well if you are building up the URL in parts:

NSString * unescapedQuery = [[NSString alloc] initWithFormat:@"?myparam=%d", numericParamValue];
NSString * escapedQuery = [unescapedQuery stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
NSString * urlString = [[NSString alloc] initWithFormat:@"http://ExampleOnly.com/path.ext%@", escapedQuery];

Although it's less often that the other parts of the URL will be variables, there are constants in the NSURLUtilities category for those as well:

[NSCharacterSet URLHostAllowedCharacterSet]
[NSCharacterSet URLUserAllowedCharacterSet]
[NSCharacterSet URLPasswordAllowedCharacterSet]
[NSCharacterSet URLPathAllowedCharacterSet]
[NSCharacterSet URLFragmentAllowedCharacterSet]

[NSCharacterSet URLQueryAllowedCharacterSet] includes all of the characters allowed in the query part of the URL (the part starting with the ? and before the # for a fragment, if any) including the ? and the & or = characters, which are used to delimit the parameter names and values. For query parameters with alphanumeric values, any of those characters might be included in the values of the variables used to build the query string. In that case, each part of the query string needs to be escaped, which takes just a bit more work:

NSMutableCharacterSet * URLQueryPartAllowedCharacterSet; // possibly defined in class extension ...

// ... and built in init or on first use
URLQueryPartAllowedCharacterSet = [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy];
[URLQueryPartAllowedCharacterSet removeCharactersInString:@"&+=?"]; // %26, %2B, %3D, %3F

// then escape variables in the URL, such as values in the query and any fragment:
NSString * escapedValue = [anUnescapedValue stringByAddingPercentEncodingWithAllowedCharacters:URLQueryPartAllowedCharacterSet];
NSString * escapedFrag = [anUnescapedFrag stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLFragmentAllowedCharacterSet]];
NSString * urlString = [[NSString alloc] initWithFormat:@"http://ExampleOnly.com/path.ext?myparam=%@#%@", escapedValue, escapedFrag];
NSURL * url = [[NSURL alloc] initWithString:urlString];

The value being escaped (anUnescapedValue above) could even be an entire URL, such as for a callback or redirect:

NSString * escapedCallbackParamValue = [anAlreadyEscapedCallbackURL stringByAddingPercentEncodingWithAllowedCharacters:URLQueryPartAllowedCharacterSet];
NSURL * callbackURL = [[NSURL alloc] initWithString:[[NSString alloc] initWithFormat:@"http://ExampleOnly.com/path.ext?callback=%@", escapedCallbackParamValue]];

Note: Don't use NSURL initWithScheme:(NSString *)scheme host:(NSString *)host path:(NSString *)path for a URL with a query string because it will add more percent escapes to the path.
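
Applied to the string from the question, the same method can be sketched as follows. Note that "/" is a member of URLQueryAllowedCharacterSet, so that set leaves someurl/somecontent unchanged; to get %2F the allowed set has to exclude it. The helper name EncodeURLComponent is hypothetical, and restricting the allowed set to the RFC 3986 unreserved characters is an assumption about what the caller wants, not something Foundation provides as a constant.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: allows only the RFC 3986 unreserved characters,
// so "/", "?", "&", "=", "+" and the other reserved characters are all escaped.
static NSString *EncodeURLComponent(NSString *value) {
    NSCharacterSet *unreserved = [NSCharacterSet characterSetWithCharactersInString:
        @"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"];
    return [value stringByAddingPercentEncodingWithAllowedCharacters:unreserved];
}
```

With this sketch, EncodeURLComponent(@"someurl/somecontent") yields someurl%2Fsomecontent, which is the result the question asks for.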

Chris Nolet
Rob at TVSeries.com
  • This should be the leading answer. It's up to date with iOS7 utilities, and it correctly notes that different parts of the URL should be escaped differently. – algal Apr 16 '14 at 19:54
  • 1
    There is just the '+' missing so you might want to add `[URLQueryPartAllowedCharacterSet removeCharactersInRange:NSMakeRange('+', 1)];` but otherwise this code is perfect for escaping strings, thank you ! – Quentin G. Apr 18 '14 at 16:59
  • 9
    There's a less wordy alternative: `URLQueryPartAllowedCharacterSet = [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy]; [URLQueryPartAllowedCharacterSet removeCharactersInString:@"?&=@+/'"];` I also tested 100,000 iterations on a Mac and found this consistently slightly faster than calling `removeCharactersInRange:` multiple times. – robotspacer Jul 10 '14 at 21:59
  • how about the `%`? shouldn't it be encoded as well if it is in the value? – njzk2 Nov 12 '15 at 16:41
5

NSString's stringByAddingPercentEscapesUsingEncoding: looks like what you're after.

EDIT: Here's an example using CFURLCreateStringByAddingPercentEscapes instead. originalString can be either an NSString or a CFStringRef.

CFStringRef newString = CFURLCreateStringByAddingPercentEscapes(kCFAllocatorDefault, originalString, NULL, CFSTR("!*'();:@&=+$,/?#[]"), kCFStringEncodingUTF8);

Please note that this is untested. You should have a look at the documentation page to make sure you understand the memory allocation semantics for CFStringRef, the idea of toll-free bridging, and so on.

Also, I don't know (off the top of my head) which of the characters specified in the legalURLCharactersToBeEscaped argument would have been escaped anyway (due to being illegal in URLs). You may want to check this, although it's perhaps better just to be on the safe side and directly specify the characters you want escaped.

I'm making this answer a community wiki so that people with more knowledge about CoreFoundation can make improvements.
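
As a hedged illustration of the memory semantics and toll-free bridging mentioned above, here is a minimal sketch that assumes the file is compiled with ARC. The wrapper name CFPercentEncode is hypothetical. The function follows the Create rule, so the caller owns the returned CFStringRef; __bridge_transfer hands that ownership to ARC. Also note that CFURLCreateStringByAddingPercentEscapes is marked deprecated in more recent SDKs.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical wrapper; assumes ARC.
static NSString *CFPercentEncode(NSString *original) {
    CFStringRef escaped = CFURLCreateStringByAddingPercentEscapes(
        kCFAllocatorDefault,
        (__bridge CFStringRef)original,     // toll-free bridge, no ownership change
        NULL,                               // no characters are exempted from escaping
        CFSTR("!*'();:@&=+$,/?#[]"),        // legal URL characters that should still be escaped
        kCFStringEncodingUTF8);
    // Transfer the +1 reference from the Create function to ARC.
    return (__bridge_transfer NSString *)escaped;
}
```

For the string in the question, CFPercentEncode(@"someurl/somecontent") returns someurl%2Fsomecontent.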

Jano
David
  • Which NSStringEncoding do I have to use to make all those characters above work correctly? Have you tried with a string? – Hoang Pham Aug 06 '10 at 12:16
  • Hmm, there doesn't seem to be an `NSStringEncoding` value for what you want. You could try `CFURLCreateStringByAddingPercentEscapes` (http://developer.apple.com/mac/library/documentation/CoreFoundation/Reference/CFURLRef/Reference/reference.html#//apple_ref/c/func/CFURLCreateStringByAddingPercentEscapes) instead - it lets you directly specify the characters to escape. – David Aug 06 '10 at 12:29
  • ...oh, and by the way: `NSString` and `CFStringRef` are "toll-free bridged", meaning that they can be passed interchangeably to each other's functions. – David Aug 06 '10 at 12:30
5
NSString *encodedString = [myString stringByAddingPercentEscapesUsingEncoding:NSASCIIStringEncoding];

It won't replace your string inline; it'll return a new string. That's implied by the fact that the method starts with the word "string". It's a convenience method to instantiate a new instance of NSString based on the current NSString.

Note that the new string will be autoreleased, so don't call release on it when you're done with it.

Dan Ray
5

Following the RFC3986 standard, here is what I'm using for encoding URL components:

// https://tools.ietf.org/html/rfc3986#section-2.2
let rfc3986Reserved = NSCharacterSet(charactersInString: "!*'();:@&=+$,/?#[]")
let encoded = "email+with+plus@example.com".stringByAddingPercentEncodingWithAllowedCharacters(rfc3986Reserved.invertedSet)

Output: email%2Bwith%2Bplus%40example.com
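
For readers working in Objective-C, the same idea might be sketched like this. The helper name EncodeExcludingRFC3986Reserved is hypothetical. One caveat of inverting the reserved set: characters that are not in the reserved list, such as a space or %, end up in the allowed set, so this approach on its own may leave them unescaped.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: escapes the RFC 3986 reserved characters by allowing
// everything that is NOT in the reserved list.
static NSString *EncodeExcludingRFC3986Reserved(NSString *value) {
    NSCharacterSet *reserved =
        [NSCharacterSet characterSetWithCharactersInString:@"!*'();:@&=+$,/?#[]"];
    return [value stringByAddingPercentEncodingWithAllowedCharacters:[reserved invertedSet]];
}
```

With the input email+with+plus@example.com this produces email%2Bwith%2Bplus%40example.com, matching the Swift output above.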

Eneko Alonso
  • Hi Alonso, I just need to encode an email address. How can I do it? For example, my mail address is m+2@qq.com. If I encode the @, our server reports an error. Can you help me with this? Thanks – mmm2006 Jan 14 '19 at 03:40
2

If you are using the ASI HttpRequest library in your Objective-C program, which I cannot recommend highly enough, then you can use the "encodeURL" helper API on its ASIFormDataRequest object. Unfortunately, the API is not static, so it may be worth creating an extension in your project using its implementation.

The code, copied straight from the ASIFormDataRequest.m for encodeURL implementation, is:

- (NSString*)encodeURL:(NSString *)string
{
    NSString *newString = NSMakeCollectable([(NSString *)CFURLCreateStringByAddingPercentEscapes(kCFAllocatorDefault, (CFStringRef)string, NULL, CFSTR(":/?#[]@!$ &'()*+,;=\"<>%{}|\\^~`"), CFStringConvertNSStringEncodingToEncoding([self stringEncoding])) autorelease]);
    if (newString) {
        return newString;
    }
    return @"";
}

As you can see, it is essentially a wrapper around CFURLCreateStringByAddingPercentEscapes that takes care of all the characters that should be properly escaped.

bhavinb
0

Before I noticed Rob's answer, which appears to work well and is preferred as it's cleaner, I went ahead and ported Dave's answer to Swift. I'll leave it here in case anyone is interested:

public extension String {

    // For performance, I've replaced the char constants with integers, as char constants don't work in Swift.

    var URLEncodedValue: String {
        let output = NSMutableString()
        guard let source = self.cStringUsingEncoding(NSUTF8StringEncoding) else {
            return self
        }
        let sourceLen = source.count

        var i = 0
        while i < sourceLen - 1 {
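            // CChar is signed; reinterpret it as UInt8 so bytes >= 0x80 format as two hex digits
            let thisChar = UInt8(bitPattern: source[i])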
            if thisChar == 32 {
                output.appendString("+")
            } else if thisChar == 46 || thisChar == 45 || thisChar == 95 || thisChar == 126 ||
                (thisChar >= 97 && thisChar <= 122) ||
                (thisChar >= 65 && thisChar <= 90) ||
                (thisChar >= 48 && thisChar <= 57) {
                    output.appendFormat("%c", thisChar)
            } else {
                output.appendFormat("%%%02X", thisChar)
            }

            i++
        }

        return output as String
    }
}
Ben Baron
0

In Swift 4:

 var str = "someurl/somecontent"

 let percentEncodedString = str.addingPercentEncoding(withAllowedCharacters: .alphanumerics)
lebelinoz
garg