23

I have a URL string in the following format.

http://myserver.com/_layouts/feed.aspx?xsl=4&amp;web=%2F&amp;page=dda3fd10-c776-4d69-8c55-2f1c74b343e2&amp;wp=476f174a-82df-4611-a3df-e13255d97533

I want to replace &amp; with & in the above URL. My result should be:

http://myserver.com/_layouts/feed.aspx?xsl=4&web=%2F&page=dda3fd10-c776-4d69-8c55-2f1c74b343e2&wp=476f174a-82df-4611-a3df-e13255d97533

Can someone post me the code to get this done?

Thanks

Quinn Taylor
nbojja

4 Answers

114

Check out my NSString category for HTML. Here are the methods available:

// Strips HTML tags & comments, removes extra whitespace and decodes HTML character entities.
- (NSString *)stringByConvertingHTMLToPlainText;

// Decode all HTML entities using GTM.
- (NSString *)stringByDecodingHTMLEntities;

// Encode all HTML entities using GTM.
- (NSString *)stringByEncodingHTMLEntities;

// Minimal unicode encoding will only cover characters from table
// A.2.2 of http://www.w3.org/TR/xhtml1/dtds.html#a_dtd_Special_characters
// which is what you want for a unicode encoded webpage.
- (NSString *)stringByEncodingHTMLEntities:(BOOL)isUnicode;

// Replace newlines with <br /> tags.
- (NSString *)stringWithNewLinesAsBRs;

// Remove newlines and white space from string.
- (NSString *)stringByRemovingNewLinesAndWhitespace;
Michael Waterfall
14
// NSString is immutable: the method returns a new string, so keep the result.
urlString = [urlString stringByReplacingOccurrencesOfString:@"&amp;" withString:@"&"];
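
If the call is needed in several places, it could be wrapped in a category on NSString. This is only a minimal sketch: the category and method names below are made up for illustration and are not part of Foundation, and it handles nothing but &amp;.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category; the names are illustrative, not part of Foundation.
@interface NSString (AmpersandDecoding)
- (NSString *)stringByDecodingAmpersands;
@end

@implementation NSString (AmpersandDecoding)
- (NSString *)stringByDecodingAmpersands {
    // Handles only "&amp;"; every other entity is left untouched.
    return [self stringByReplacingOccurrencesOfString:@"&amp;" withString:@"&"];
}
@end
```

It would then be called as `NSString *decoded = [urlString stringByDecodingAmpersands];`.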
Chuck
  • I did the same...but is there any builtin way to do this... – nbojja Jul 01 '09 at 07:22
  • 2
    @nbojja How much more built in do you want? If you're that concerned, add a method that does this as a category on NSString and then it's built in. – Abizern Jun 18 '11 at 15:16
  • 12
    @Abizern: Many languages have built-in methods to encode and decode HTML entities, Obj-C lacks this and many other things programmers take for granted since 2002. Searching and replacing is a poor substitute, because you will have to spend quite some time to know you get all the entities. – Henrik Erlandsson Feb 20 '12 at 09:56
  • Superb answer thank you :) – Supertecnoboff Mar 19 '14 at 17:50
  • @Abizern OP was likely concerned with a generalized HTML decoding solution, as just "&" is likely not the only character this may occur with. – Albert Renshaw Nov 23 '22 at 23:11
8

There is no built-in function for this in the iPhone SDK. You should file a bug that you want the functionality. In the normal Mac OS X SDK you can either load the fragment into an NSAttributedString as HTML and ask it to hand back a plain string, or use CFXMLCreateStringByUnescapingEntities().

@interface NSString (LGAdditions)
- (NSString *) stringByUnescapingEntities;
@end

@implementation NSString (LGAdditions)
- (NSString *) stringByUnescapingEntities {
  CFStringRef retvalCF = CFXMLCreateStringByUnescapingEntities(kCFAllocatorDefault, (CFStringRef)self, NULL);
  return [NSMakeCollectable(retvalCF) autorelease];
}
@end
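
Under ARC the autorelease call above will not compile. A possible ARC variant, assuming the same Mac OS X-only CFXMLCreateStringByUnescapingEntities() call, is sketched below; the category and method names are hypothetical and were chosen only to avoid clashing with the listing above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical ARC variant of the category above (Mac OS X only).
@interface NSString (LGAdditionsARC)
- (NSString *)stringByUnescapingEntitiesARC;
@end

@implementation NSString (LGAdditionsARC)
- (NSString *)stringByUnescapingEntitiesARC {
    // __bridge casts without transferring ownership; ARC keeps managing self.
    CFStringRef retvalCF = CFXMLCreateStringByUnescapingEntities(kCFAllocatorDefault, (__bridge CFStringRef)self, NULL);
    // The Create function returns an owned reference; hand that ownership to ARC.
    return CFBridgingRelease(retvalCF);
}
@end
```

Note that self is bridged to CFStringRef, not to CFAllocatorRef.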
dgatwood
Louis Gerbarg
  • This doesn't work with Automatic Reference Counting (ARC) {sigh} – mpemburn Aug 16 '12 at 16:15
  • @mpemburn did you try : ` CFStringRef retvalCF = CFXMLCreateStringByUnescapingEntities(kCFAllocatorDefault, (__bridge CFAllocatorRef)self, NULL); return (NSString *)CFBridgingRelease(retvalCF);` – Cœur Jul 19 '13 at 09:13
  • It shouldn't be bridged to CFAllocatorRef, but rather CFStringRef. That was wrong in the original code listing, too. – dgatwood Feb 13 '17 at 19:32
4

For iOS the following code should work for numeric codes. It should be relatively easy to extend to the likes of &amp; ...

-(NSString*)unescapeHtmlCodes:(NSString*)input {

    NSRange rangeOfHTMLEntity = [input rangeOfString:@"&#"];
    if( NSNotFound == rangeOfHTMLEntity.location ) {
        return input; // nothing to decode
    }

    NSMutableString* answer = [[NSMutableString alloc] init];
    [answer autorelease];

    NSScanner* scanner = [NSScanner scannerWithString:input];
    [scanner setCharactersToBeSkipped:nil]; // we want all white-space

    while( ![scanner isAtEnd] ) {

        // Must start as nil: the scanner leaves it untouched when nothing is scanned.
        NSString* fragment = nil;
        [scanner scanUpToString:@"&#" intoString:&fragment];
        if( nil != fragment ) { // nil when an entity comes first, e.g. '&#38; B'
            [answer appendString:fragment];
        }

        if( ![scanner isAtEnd] ) { // implicitly we scanned to the next '&#'

            [scanner scanString:@"&#" intoString:NULL]; // skip over '&#'

            int htmlCode;
            if( [scanner scanInt:&htmlCode] ) {
                // unichar with %C keeps codes above 127 intact (up to U+FFFF)
                [answer appendFormat:@"%C", (unichar)htmlCode];
                // skip over ';' only if it is really there
                [scanner scanString:@";" intoString:NULL];
            } else {
                // not a decimal code: keep the '&#' as literal text
                [answer appendString:@"&#"];
            }
        }
    }

    return answer;
}

Some unit-test code ...

-(void)testUnescapeHtmlCodes {

NSString* expected = @"A & B";
NSString* actual = [self unescapeHtmlCodes:@"A &#38; B"];
STAssertTrue( [expected isEqualToString:actual], @"actual = %@", actual );

expected = @"& B";
actual = [self unescapeHtmlCodes:@"&#38; B"];    
STAssertTrue( [expected isEqualToString:actual], @"actual = %@", actual );

expected = @"A &";
actual = [self unescapeHtmlCodes:@"A &#38;"];
STAssertTrue( [expected isEqualToString:actual], @"actual = %@", actual );

}
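
As a rough sketch of the extension to named entities such as &amp; mentioned above: the helper below assumes that only the five predefined XML entities matter. The function name and the table are illustrative and do not come from any SDK.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes only the five predefined XML entities.
static NSString *UnescapeNamedEntities(NSString *input) {
    // Flat list of (entity, replacement) pairs. "&amp;" must come last so
    // that "&amp;lt;" decodes to "&lt;" and is not decoded a second time.
    NSArray *pairs = [NSArray arrayWithObjects:
        @"&lt;",   @"<",
        @"&gt;",   @">",
        @"&quot;", @"\"",
        @"&apos;", @"'",
        @"&amp;",  @"&",
        nil];

    NSString *result = input;
    for (NSUInteger i = 0; i + 1 < [pairs count]; i += 2) {
        result = [result stringByReplacingOccurrencesOfString:[pairs objectAtIndex:i]
                                                   withString:[pairs objectAtIndex:i + 1]];
    }
    return result;
}
```

The ordering is the one design choice worth noting: decoding &amp; first would turn doubly escaped text into a second entity and then decode it again. Any entity outside this table is left as-is.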