Given these input strings:

@"bonus pay savings            2.69 F";
@"brick and mortar             0.15-B";

the desired output is:

[@"bonus pay savings", @"2.69 F"];
[@"brick and mortar", @"0.15-B"];

I tried this approach:

NSString * str = @"bonus pay savings            2.69 F";
NSArray * arr = [str componentsSeparatedByString:@"   "];
NSLog(@"Array values are : %@",arr);

The drawback of my approach is that it uses three spaces as the delimiter, whereas the number of spaces can vary. How can this be accomplished? Thank you.

rob mayoff
as diu

5 Answers

A simple solution using a regular expression.

It replaces every run of two or more ({2,}) whitespace characters (\\s) with a random UUID string, which is very unlikely to occur in the input, and then splits the string on that UUID.

NSString *separator = [NSUUID UUID].UUIDString; 
NSString *string = @"bonus pay savings            2.69 F";
NSString *collapsedString =  [string stringByReplacingOccurrencesOfString:@"\\s{2,}"
                                                      withString:separator
                                                         options:NSRegularExpressionSearch
                                                           range:NSMakeRange(0, [string length])];
NSArray *output = [collapsedString componentsSeparatedByString:separator];
NSLog(@"%@", output);
vadian

You can use NSRegularExpression to split your string. Let's make a category on NSString:

NSString+asdiu.h

@interface NSString (asdiu)

- (NSArray<NSString *> *)componentsSeparatedByRegularExpressionPattern:(NSString *)pattern error:(NSError **)errorOut;

@end

NSString+asdiu.m

@implementation NSString (asdiu)

- (NSArray<NSString *> *)componentsSeparatedByRegularExpressionPattern:(NSString *)pattern error:(NSError **)errorOut {
    NSRegularExpression *rex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:errorOut];
    if (rex == nil) { return nil; }

    NSMutableArray<NSString *> *components = [NSMutableArray new];
    __block NSUInteger start = 0;
    [rex enumerateMatchesInString:self options:0 range:NSMakeRange(0, self.length) usingBlock:^(NSTextCheckingResult * _Nullable result, NSMatchingFlags flags, BOOL * _Nonnull stop) {
        NSRange separatorRange = result.range;
        NSRange componentRange = NSMakeRange(start, separatorRange.location - start);
        [components addObject:[self substringWithRange:componentRange]];
        start = NSMaxRange(separatorRange);
    }];
    [components addObject:[self substringFromIndex:start]];
    return components;
}

@end

You can use it like this:

NSArray<NSString *> *inputs = @[@"bonus pay savings            2.69 F", @"brick and mortar             0.15-B"];
for (NSString *input in inputs) {
    NSArray<NSString *> *fields = [input componentsSeparatedByRegularExpressionPattern:@"\\s\\s+" error:nil];
    NSLog(@"fields: %@", fields);
}

Output:

2018-06-15 13:38:13.152725-0500 test[23423:1386429] fields: (
    "bonus pay savings",
    "2.69 F"
)
2018-06-15 13:38:13.153140-0500 test[23423:1386429] fields: (
    "brick and mortar",
    "0.15-B"
)
rob mayoff

If you can assume that you only have 2 fields in the input string, I would use a limited split method like this one that always returns an array of 2 items, and then "trim" spaces off the second item using stringByTrimmingCharactersInSet.
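The limited split method referred to above is not reproduced here, so the following is only a minimal sketch of the idea. It assumes that the first run of two spaces marks the boundary between the two fields, and the helper name SplitIntoTwoFields is hypothetical.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `line` at the first run of two spaces and
// returns exactly two items, or nil if the line contains no such run.
static NSArray<NSString *> *SplitIntoTwoFields(NSString *line) {
    // Assumption: two consecutive spaces never occur inside a field.
    NSRange gap = [line rangeOfString:@"  "];
    if (gap.location == NSNotFound) { return nil; }

    NSString *first = [line substringToIndex:gap.location];
    // The remainder may still begin with the rest of the padding, so trim it.
    NSString *rest = [line substringFromIndex:NSMaxRange(gap)];
    NSString *second = [rest stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    return @[first, second];
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", SplitIntoTwoFields(@"bonus pay savings            2.69 F"));
        NSLog(@"%@", SplitIntoTwoFields(@"brick and mortar             0.15-B"));
    }
    return 0;
}
```

Because only the first gap is used as the split point, any later runs of spaces stay inside the second item, which is what keeps the result at two items.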

battlmonstr

@vadian and @robmayoff have both provided good solutions based on regular expressions (REs). In both cases the RE matches the gaps to find where to break your string. For comparison, it is also possible to approach the problem the other way and use an RE to match the parts you are interested in. The RE:

\S+(\h\S+)*

will match the text you are interested in. It is made up as follows:

\S          - match any non-space character, \S excludes both horizontal
              (e.g. spaces, tabs) and vertical space (e.g. newlines)
\S+         - one or more non-space characters, i.e. a "word" of sorts
\h          - a single horizontal space character (if you wish matches to
              span lines use \s - any horizontal *or* vertical space)
\h\S+       - a space followed by a word
(\h\S+)*    - zero or more space separated words
\S+(\h\S+)* - a word followed by zero or more space separated words

With this simple regular expression you can use matchesInString:options:range: to obtain an array of NSTextCheckingResult objects, one for each match in your input; or you can use enumerateMatchesInString:options:range:usingBlock: to have a block called with each match.
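As a rough illustration of the first option, the sketch below calls matchesInString:options:range: directly and collects the matched substrings. The helper name FieldsInString is hypothetical, and the pattern is the one described above with its backslashes doubled for an Objective-C string literal.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every substring of `input` matched by
// \S+(\h\S+)*, i.e. each run of words separated by single horizontal spaces.
static NSArray<NSString *> *FieldsInString(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\S+(\\h\\S+)*" options:0 error:NULL];
    if (regex == nil) { return nil; }

    // One NSTextCheckingResult per match, in order of appearance.
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];

    NSMutableArray<NSString *> *fields = [NSMutableArray arrayWithCapacity:matches.count];
    for (NSTextCheckingResult *match in matches) {
        [fields addObject:[input substringWithRange:match.range]];
    }
    return fields;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", FieldsInString(@"bonus pay savings            2.69 F"));
        NSLog(@"%@", FieldsInString(@"brick and mortar             0.15-B"));
    }
    return 0;
}
```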

As an example, here is a solution following @robmayoff's approach:

@interface NSString (componentsMatchingRegularExpression)

- (NSArray<NSString *>*) componentsMatchingRegularExpression:(NSString *)pattern;

@end

@implementation NSString (componentsMatchingRegularExpression)

- (NSArray<NSString *>*) componentsMatchingRegularExpression:(NSString *)pattern
{
   NSError *errorReturn;
   NSRegularExpression *regularExpression = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&errorReturn];

   if (!regularExpression)
      return nil;

   NSMutableArray *matches = NSMutableArray.new;
   [regularExpression enumerateMatchesInString:self
                                       options:0
                                         range:NSMakeRange(0, self.length)
                                    usingBlock:^(NSTextCheckingResult * _Nullable result, NSMatchingFlags flags, BOOL * _Nonnull stop)
                                              {
                                                 [matches addObject:[self substringWithRange:result.range]];
                                              }
   ];

   return matches.copy; // non-mutable copy
}

@end

Whether it is better to match what you wish to keep or what you wish to remove is subjective; take your pick.

CRD

Regular expressions are fine for this, and the solutions given using them are perfectly good. For completeness, you can also do this with NSScanner, which is often faster than a regex and is worth getting used to if you need to do more complicated text parsing.

NSString *str = @"bonus pay savings            2.69 F";
NSScanner *scanner = [NSScanner scannerWithString:str];
scanner.charactersToBeSkipped = nil; // default is to skip whitespace and newlines
while (!scanner.isAtEnd) {
    NSString *name = nil;
    NSString *value = nil;
    // scan up to two spaces, this would be the name
    [scanner scanUpToString:@"  " intoString:&name];

    // scan the two spaces and any extra whitespace
    [scanner scanCharactersFromSet:[NSCharacterSet whitespaceCharacterSet] intoString:nil];

    // scan to the end of the line, this is the value
    [scanner scanUpToString:@"\n" intoString:&value];

    // consume the line break, if any, so the next name does not start with it
    [scanner scanCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:nil];

    NSLog(@"%@", @[name ?: @"", value ?: @""]);
}
Nima Yousefi