4

I am writing an iOS 5 app. Say I have the following string:

100 - PARK STREET / JAMES PLACE

I would like to extract the two road names from this string in the most efficient (and elegant) way possible. I have tried various combinations of [string componentsSeparatedByString...] and similar methods, but this gets messy very quickly. It also requires a large number of conditional statements to handle a situation such as the following:

100 - BI-CENTENNIAL DRIVE / JAMES PLACE

since that contains an embedded hyphen, which [string componentsSeparatedByString:@"-"] would also split on, so the road name would then need to be reassembled.

There are also situations where the string may have a slightly different format, such as:

100- BI-CENTENNIAL DRIVE / JAMES PLACE

(lack of a space between the number and hyphen)

100-BI-CENTENNIAL DRIVE /JAMES PLACE

(no spaces around the first hyphen at all, combined with no space between the slash and the second road name)

However, we can always assume that there will only be one slash in the string which separates the two road names.

The road names should also be stripped of any leading and trailing spaces.

I figured that this entire process would be possible in a more efficient and elegant manner using an NSScanner but unfortunately I don't have the necessary experience with this class to make it work. Any suggestions would be greatly appreciated.

Skoota
  • Have you read the documentation on NSScanner? There is some great example code here: https://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/Strings/Articles/Scanners.html#//apple_ref/doc/uid/20000147-BCIEFGHC – sosborn Apr 20 '12 at 08:36
  • Yes, but I still cannot get my head around how the NSScanner operates, particularly to split a string in this way. In particular, I would really like to see how this can be done efficiently by someone who has a (much) better understanding of NSScanners than me. – Skoota Apr 20 '12 at 08:41
  • With the scanner you aren't splitting a string, you are going through it one character at a time and reacting accordingly. Ken's example is perfect. I suggest trying it out just to get a feel for how it works. For me it is an easier solution than REGEX but they both work fine. – sosborn Apr 20 '12 at 11:58

4 Answers

3

You could also use a regular expression.

Note that in the block I use capture groups, via [result rangeAtIndex:i].
Index 1 is the house number, index 2 is the first street and index 3 is the second street.

#import <Foundation/Foundation.h>

int main (int argc, const char * argv[])
{

    @autoreleasepool {
        NSArray *streets = [NSArray arrayWithObjects:@"100 - PARK STREET / JAMES PLACE", @"100 - BI-CENTENNIAL DRIVE / JAMES PLACE", @"100- BI-CENTENNIAL DRIVE / JAMES PLACE", @"100-BI-CENTENNIAL DRIVE /JAMES PLACE", nil];

        NSString *text = [streets componentsJoinedByString:@" "];
        NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(\\d+) {0,1}- {0,1}(\\D+) *\\/ *(\\D+)" options:NSRegularExpressionCaseInsensitive error:nil];

        [regex enumerateMatchesInString:text options:0 
                                  range:NSMakeRange(0, [text length]) 
                             usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) 
        {
            for (int i = 1; i< [result numberOfRanges] ; i++) {
                NSLog(@"%@", [text substringWithRange:[result rangeAtIndex:i]]);
            }
        }];
    }
    return 0;
}

output:

100
PARK STREET 
JAMES PLACE 
100
BI-CENTENNIAL DRIVE 
JAMES PLACE 
100
BI-CENTENNIAL DRIVE 
JAMES PLACE 
100
BI-CENTENNIAL DRIVE 
JAMES PLACE

edit in response to the comments

int main (int argc, const char * argv[])
{

    @autoreleasepool {
        NSArray *streets = [NSArray arrayWithObjects:@"100 - PARK STREET / JAMES PLACE", @"100 - BI-CENTENNIAL DRIVE / JAMES PLACE", @"100- BI-CENTENNIAL DRIVE / JAMES PLACE", @"100-BI-CENTENNIAL DRIVE /JAMES PLACE",@"100 - PARK STREET", nil];

        NSRegularExpression *regex1 = [NSRegularExpression regularExpressionWithPattern:@"(\\d+) *- *([^\\/]+) *$" options:NSRegularExpressionCaseInsensitive error:nil];
        NSRegularExpression *regex2 = [NSRegularExpression regularExpressionWithPattern:@"(\\d+) *- *([^\\/]+) *\\/ *([^\\/]+) *$" options:NSRegularExpressionCaseInsensitive error:nil];
        for (NSString *text in streets) {                        
            // The options: argument of numberOfMatchesInString: takes NSMatchingOptions, so pass 0.
            NSRegularExpression *regex = ([regex1 numberOfMatchesInString:text options:0 range:NSMakeRange(0, [text length])]) ? regex1 : regex2;
            [regex enumerateMatchesInString:text options:0 
                                      range:NSMakeRange(0, [text length]) 
                                 usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) 
             {
                 for (int i = 1; i< [result numberOfRanges] ; i++) {
                     NSLog(@"%@", [text substringWithRange:[result rangeAtIndex:i]]);
                 }

             }];
        }
    }
    return 0;
}

second edit

int main (int argc, const char * argv[])
{

    @autoreleasepool {
        NSArray *streets = [NSArray arrayWithObjects:   @"100 - PARK STREET / JAMES PLACE", 
                                                        @"100 - BI-CENTENNIAL DRIVE / JAMES PLACE", 
                                                        @"100- BI-CENTENNIAL DRIVE / JAMES PLACE", 
                                                        @"100-BI-CENTENNIAL DRIVE /JAMES PLACE",
                                                        @"100 - PARK STREET",
                                                        @"100 - PARK STREET / ",
                                                        @"100 - PARK STREET/ ",
                                                        @"100 - PARK STREET/",
                            nil];

        NSRegularExpression *regex1 = [NSRegularExpression regularExpressionWithPattern:@"(\\d+) *- *([^\\/]+) *$" options:NSRegularExpressionCaseInsensitive error:nil];
        NSRegularExpression *regex2 = [NSRegularExpression regularExpressionWithPattern:@"(\\d+) *- *([^\\/]+) *\\/ *([^\\/]*) *$" options:NSRegularExpressionCaseInsensitive error:nil];
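        for (NSString *rawText in streets) {

            // Trim into a new variable; fast-enumeration variables cannot be reassigned under ARC.
            NSString *text = [rawText stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
            NSLog(@"\n>%@<", text);
            // The options: argument of numberOfMatchesInString: takes NSMatchingOptions, so pass 0.
            NSRegularExpression *regex = ([regex1 numberOfMatchesInString:text options:0 range:NSMakeRange(0, [text length])]) ? regex1 : regex2;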
            [regex enumerateMatchesInString:text options:0 
                                      range:NSMakeRange(0, [text length]) 
                                 usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) 
             {
                 for (int i = 1; i< [result numberOfRanges] ; i++) {
                     NSLog(@"%@", [text substringWithRange:[result rangeAtIndex:i]]);
                 }

             }];
        }
    }
    return 0;
}
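
As a possible variation on the two-pattern approach above, the second street can be made an optional group so that one pattern covers every case listed. This is only a sketch: the helper name AddressParts and the pattern are assumptions, not part of the code above. A group that did not take part in the match reports a range whose location is NSNotFound, so it has to be checked before calling substringWithRange:.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the answer above): returns the house number
// followed by one or two road names, or nil if `address` does not match.
static NSArray *AddressParts(NSString *address)
{
    // Lazy quantifiers (+? and *?) leave surrounding spaces to the \s* parts,
    // so the captured names come back already trimmed.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:
            @"^\\s*(\\d+)\\s*-\\s*([^/]+?)\\s*(?:/\\s*([^/]*?)\\s*)?$"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:address
                          options:0
                            range:NSMakeRange(0, [address length])];
    if (match == nil) {
        return nil;
    }

    NSMutableArray *parts = [NSMutableArray array];
    for (NSUInteger i = 1; i < [match numberOfRanges]; i++) {
        NSRange range = [match rangeAtIndex:i];
        // Skip a group that did not participate (no slash at all) and an
        // empty second road (slash with nothing after it).
        if (range.location != NSNotFound && range.length > 0) {
            [parts addObject:[address substringWithRange:range]];
        }
    }
    return parts;
}
```

Called with any of the strings in the streets array, this should give an array of two or three elements, with no blank third element for inputs such as @"100 - PARK STREET/ ".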
vikingosegundo
  • Thanks, this works perfectly :) Didn't consider using a Regular Expression instead of a NSScanner. – Skoota Apr 20 '12 at 10:10
  • One other edge case has appeared. How could the regex be modified to cope with a string like this: `100 - PARK STREET`. In other words, no slash at all - just a single street? – Skoota Apr 20 '12 at 10:20
  • My first advice when it comes to regex would be "consider whether it is really the tool for the job" (see: http://regex.info/blog/2006-09-15/247), but yours seems to be such a case. – vikingosegundo Apr 20 '12 at 10:20
  • Something along the lines of this? `(\\d+) {0,1}- {0,1}(\\D+)` – Skoota Apr 20 '12 at 10:21
  • What would be your recommended way of using the two regex? Running the first (to check for two streets) and if nothing is returned then run the second (to check for one street only)? – Skoota Apr 20 '12 at 10:24
  • it depends on various variables. is it one string? does a line contain several addresses? or is it a array of lines? – vikingosegundo Apr 20 '12 at 10:27
  • It's one single string, that contains one single address. For example: `NSString *address = @"100 - PARK STREET / JAMES PLACE"` or `NSString *address = @"100 - PARK STREET"` (for the single street variant). – Skoota Apr 20 '12 at 10:30
  • Thanks - that works great. Is there a way to prevent it from returning a blank third element in the situation of `NSString *address = @"100 - PARK STREET / "` or `NSString *address = @"100 - PARK STREET/ "` and also still make it work for `NSString *address = @"100 - PARK STREET/"` (at the moment that returns nothing)? – Skoota Apr 20 '12 at 11:20
  • By the way, thanks for your continued help. I always struggle with regex so this is teaching me a fair bit :) – Skoota Apr 20 '12 at 11:20
  • i guess only masochists do not struggle with regex :). check my new edit, on how I would try to normalize a bit. also the second regex now allows the last capture group to be empty, * instead of + – vikingosegundo Apr 20 '12 at 11:38
  • Thanks again :) This works great and I think that pretty much exhausts all my use-cases! Up vote for you. – Skoota Apr 20 '12 at 11:52
1

Just coded in my browser:

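NSString* line = @"100- BI-CENTENNIAL DRIVE / JAMES PLACE";
NSScanner* scanner = [NSScanner scannerWithString:line];
NSString* number;
if (![scanner scanUpToString:@"-" intoString:&number])
    /* handle parse failure */;
// scanUpToString: stops before the hyphen, so step over it.
[scanner scanString:@"-" intoString:NULL];
NSString* firstRoad;
if (![scanner scanUpToString:@"/" intoString:&firstRoad])
    /* handle parse failure */;
// Likewise, step over the slash before taking the rest of the line.
[scanner scanString:@"/" intoString:NULL];
NSString* secondRoad = [line substringFromIndex:[scanner scanLocation]];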

There may be additional whitespace to trim from the resulting strings.
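
Putting the scanning and the trimming together, a minimal sketch might look like the following. The function name RoadNamesFromLine is hypothetical, and the choice to return nil when the line does not start with a number and a hyphen is an assumption about how a parse failure should be reported. It relies on the scanner's default behaviour of skipping whitespace before each scan.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the answer above): returns the road names
// found in `line`, trimmed of surrounding whitespace, or nil if the line
// does not start with "<number> -".
static NSArray *RoadNamesFromLine(NSString *line)
{
    NSScanner *scanner = [NSScanner scannerWithString:line];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceCharacterSet];

    // House number, then the separating hyphen.
    if (![scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet]
                             intoString:NULL] ||
        ![scanner scanString:@"-" intoString:NULL]) {
        return nil;
    }

    NSMutableArray *roads = [NSMutableArray array];
    while (![scanner isAtEnd]) {
        NSString *road = nil;
        // Reads up to the next slash, or to the end if there is none.
        if ([scanner scanUpToString:@"/" intoString:&road]) {
            road = [road stringByTrimmingCharactersInSet:whitespace];
            if ([road length] > 0) {
                [roads addObject:road];
            }
        }
        // Step over the slash, if there is one.
        [scanner scanString:@"/" intoString:NULL];
    }
    return roads;
}
```

Because the hyphen is only consumed once, right after the number, a hyphen inside a road name such as BI-CENTENNIAL DRIVE is left alone.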

Ken Thomases
0

This looks like a job for NSRegularExpression.

I think an R.E. something like

^[0-9]+ *- *(.*)$

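will match what you want. The capture group holds everything after the number and the first hyphen, so the two road names would still need to be split on the slash.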

JeremyP
-1

Here's another example of using this horrible little NSScanner class.

Supposing you had a string, containing four values, and wanted to convert them into a CGRect:

NSString* stringToParse = @"10, 20, 600, 150";             
CGRect rect = [self stringToCGRect:stringToParse];

NSLog(@"Rectangle: %.0f, %.0f, %.0f, %.0f", rect.origin.x, rect.origin.y, rect.size.width, rect.size.height);

To do this, you'd write a nasty little function like this:

-(CGRect)stringToCGRect:(NSString*)stringToParse
{
    NSLog(@"Parsing the string: %@", stringToParse);
    int x, y, wid, hei;

    NSString *subString;
    NSScanner *scanner = [NSScanner scannerWithString:stringToParse];
    [scanner scanUpToCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:nil];
    [scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:&subString];
    x = [subString integerValue];

    [scanner scanUpToCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:nil];
    [scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:&subString];
    y = [subString integerValue];

    [scanner scanUpToCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:nil];
    [scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:&subString];
    wid = [subString integerValue];

    [scanner scanUpToCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:nil];
    [scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:&subString];
    hei = [subString integerValue];

    CGRect rect = CGRectMake(x, y, wid, hei);
    return rect;
}
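
For comparison, the same parsing can be sketched more compactly by telling the scanner to skip commas as well as whitespace and reading the numbers with scanInteger:. The function name RectFromString and the choice to return CGRectZero on failure are assumptions. Unlike the method above, this version accepts only commas and whitespace between the values rather than any run of non-digit characters.

```objc
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Hypothetical helper: reads four integers separated by commas and/or
// whitespace. Returns CGRectZero if fewer than four can be read.
static CGRect RectFromString(NSString *stringToParse)
{
    NSScanner *scanner = [NSScanner scannerWithString:stringToParse];

    // Skip commas in addition to the default whitespace and newlines.
    NSMutableCharacterSet *skipped =
        [NSMutableCharacterSet whitespaceAndNewlineCharacterSet];
    [skipped addCharactersInString:@","];
    [scanner setCharactersToBeSkipped:skipped];

    NSInteger values[4];
    for (int i = 0; i < 4; i++) {
        if (![scanner scanInteger:&values[i]]) {
            return CGRectZero;
        }
    }
    return CGRectMake(values[0], values[1], values[2], values[3]);
}
```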

Forgive my negativity, but I'm tired, it's 10:30pm, and I despise having to write Objective-C code like this, knowing full well that in any Microsoft development environment from the past 15 years, this would've taken one line of code.

Grrrr....

Mike Gledhill