1

I would like to check whether an NSString contains any digit (0-9) more than 5 times. I do not need to know which digit or how many times; I simply want it to return TRUE or FALSE for whether any digit occurs more than 5 times in the string. I would like it to be as efficient as possible.

I have given it some thought, and the long way of going about it would be to place all 10 digits (again 0-9) in an array and then loop through each digit, comparing it to the string. If any digit has more than 5 matches within the string, set a flag that will return true.
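The approach described above can be sketched roughly as follows. This is a minimal illustration in plain C (which also compiles inside an Objective-C file), assuming the string is available as a NUL-terminated UTF-8 buffer, for example from `-[NSString UTF8String]`. The function name `anyDigitExceedsNaive` and the `threshold` parameter are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: returns true if any ASCII digit occurs more than
// `threshold` times in the NUL-terminated UTF-8 string `s`.
// It scans the string once per digit, i.e. up to ten passes.
static bool anyDigitExceedsNaive(const char *s, unsigned threshold)
{
    for (char digit = '0'; digit <= '9'; ++digit) {
        unsigned matches = 0;
        for (size_t i = 0; s[i] != '\0'; ++i) {
            if (s[i] == digit && ++matches > threshold) {
                return true;  // this digit already exceeds the limit
            }
        }
    }
    return false;
}
```

Called as `anyDigitExceedsNaive(myString.UTF8String, 5)`, this reads the string up to ten times in the worst case, which is the cost the question is hoping to avoid.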

Can anyone tell me if there is a "better" or more efficient way of going about this problem?

Thank you!

Teddy13
  • Are you allowed to have for example 4444123 (more than 5 digits but no digit appears more than 5 times) or is it simply not more than 5 digits in the string? – David Rönnqvist Aug 14 '13 at 10:28
  • @DavidRönnqvist Sorry for not clarifying David. The string can be unlimited length but no digit should appear more than 5 times. Thanks! – Teddy13 Aug 14 '13 at 10:30
  • Please refer to http://stackoverflow.com/questions/4663438/objective-c-find-numbers-in-string – Gobi M Aug 14 '13 at 11:12

4 Answers

3

This may not be the "best" way of doing things, but it was the most fun way of doing it for me, and it takes good advantage of Foundation by using character sets, counted sets and block-based string enumeration.

// Your string
NSString *myString = @"he11o 12345 th1s 55 1s 5 very fun 55 1ndeed.";

// A set of all numeric characters
NSCharacterSet *numbers = [NSCharacterSet characterSetWithCharactersInString:@"0123456789"];
NSUInteger digitThreshold = 5;

// An empty counted set
NSCountedSet *numberOccurances = [NSCountedSet new];

// Loop over all the substrings as composed characters
// this will not have the same issues with e.g. Chinese characters as
// using a C string would. (Available since iOS 4)
[myString enumerateSubstringsInRange:NSMakeRange(0, myString.length)
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
                              // Check if substring is part of numeric set of characters
                              if ([substring rangeOfCharacterFromSet:numbers].location != NSNotFound) {
                                  [numberOccurances addObject:substring];
                                  // Check if that number has occurred more than 5 times
                                  if ([numberOccurances countForObject:substring] > digitThreshold) {
                                      *stop = YES;
                                      // Do something here based on that fact
                                      // digitThreshold is an NSUInteger, so cast it and use %lu
                                      NSLog(@"%@ occurred more than %lu times", substring, (unsigned long)digitThreshold);
                                  }
                              }
                          }];

If you don't let it stop, it will continue to count the occurrences of every digit in the string.

If you log the counted set, it looks like this (the numbers within square brackets are the counts):

<NSCountedSet: 0xa18d830> (3 [1], 1 [6], 4 [1], 2 [1], 5 [6])
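
As a rough cross-check of the counts above, the same tally can be sketched without Foundation. The following plain C (also valid in an Objective-C file) assumes a NUL-terminated UTF-8 buffer; `tallyDigits` is a hypothetical name, and the fixed array stands in for the counted set.

```c
#include <stddef.h>

// Hypothetical helper: counts how often each ASCII digit occurs in the
// NUL-terminated UTF-8 string `s`. counts[d] receives the tally for digit d.
static void tallyDigits(const char *s, unsigned counts[10])
{
    for (int d = 0; d < 10; ++d) {
        counts[d] = 0;
    }
    for (size_t i = 0; s[i] != '\0'; ++i) {
        // Bytes of multi-byte UTF-8 sequences are >= 0x80 and never match
        if (s[i] >= '0' && s[i] <= '9') {
            counts[s[i] - '0'] += 1;
        }
    }
}
```

For the sample string this gives six occurrences each of 1 and 5 and one each of 2, 3 and 4, which agrees with the logged set.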
David Rönnqvist
2

This code tries to be as performant as possible.

BOOL checkDigits(NSString *string)
{
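    NSUInteger length = [string length];

    // get the raw UTF-16 code units, hopefully without a copy
    const UniChar *characters = CFStringGetCharactersPtr((__bridge CFStringRef)string);
    NSMutableData *characterData = nil;
    if (characters == NULL) {
        // fall back to copying the code units; dataUsingEncoding:NSUTF16StringEncoding
        // would prepend a byte-order mark and shift every index by one
        characterData = [NSMutableData dataWithLength:length * sizeof(UniChar)];
        [string getCharacters:characterData.mutableBytes range:NSMakeRange(0, length)];
        characters = characterData.bytes;
    }

    // initialize 10 individual counters for digits
    int digitCount[10] = {0};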

    // loop over the characters once
    for (NSUInteger i = 0; i != length; ++i) {
        UniChar c = characters[i];

        // UTF-16 encodes ASCII digits as their values
        if (c >= '0' && c <= '9') {
            int idx = c - '0';
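            // a digit seen 5 times before is now on its sixth occurrence,
            // i.e. it appears "more than 5 times"
            if (digitCount[idx] == 5)
                return YES;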
            digitCount[idx] += 1;
        }
    }

    // keep the NSData object alive until here
    [characterData self];

    return NO;
}
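
For illustration, the counting loop can also be isolated from Foundation and given an explicit limit. This is only a sketch in plain C, not the answer's own code: `anyDigitExceeds` and its parameters are hypothetical names, and the caller is assumed to supply the string's UTF-16 code units (for instance copied out with `-getCharacters:range:`).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: single pass over UTF-16 code units with ten counters.
// Returns true as soon as any ASCII digit has occurred more than `threshold` times.
static bool anyDigitExceeds(const uint16_t *units, size_t length, unsigned threshold)
{
    unsigned digitCount[10] = {0};
    for (size_t i = 0; i < length; ++i) {
        uint16_t c = units[i];
        // Surrogate halves lie in 0xD800-0xDFFF, so they never look like digits
        if (c >= '0' && c <= '9') {
            if (++digitCount[c - '0'] > threshold) {
                return true;
            }
        }
    }
    return false;
}
```

Passing the limit as a parameter keeps the "more than 5" rule in one place, and the early return means the scan stops at the first digit that crosses it.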
Nikolai Ruhe
  • 1
    @RobvanderVeer I understand his algorithm differently: He's iterating over the string for each possible digit while this proposal has ten counters for each digit and traverses the string only once. – Nikolai Ruhe Aug 14 '13 at 10:57
  • @RobvanderVeer You are correct, I had something similar in mind but the code works very well and I guess you cannot go wrong going back to basics. Also based on the other answers, it seems this is as efficient as it gets :). Thanks Nikolai! – Teddy13 Aug 14 '13 at 10:57
  • This will get the right answer most of the time, but will start failing if you use more exotic unicode characters in your string. Code points != characters. I can provide a (much slower) answer that does the right thing if you're interested, although as I said, @NikolaiRuhe 's answer will be correct most of the time. – robbie_c Aug 14 '13 at 11:20
  • @robbie_c Can you provide an example of where the code fails? I was under the impression that UTF-16 code fragments distinctively identify ASCII characters. Or are you talking about non-ASCII digits? – Nikolai Ruhe Aug 14 '13 at 11:25
  • @robbie_c Thanks robbie but I am able to clean the string prior to running this code with some simple regex magic hence that situation should never happen for me. I am more interested in performance. Thank you though! – Teddy13 Aug 14 '13 at 11:29
  • @Teddy13 It seems that if performance is your concern you should rather skip the regex cleaning and use a proper digit-counting-algorithm instead. – Nikolai Ruhe Aug 14 '13 at 11:31
  • 1
    @NikolaiRuhe does emoji and e.g. Chinese character work in UTF16? (It's an honest question) – David Rönnqvist Aug 14 '13 at 11:42
  • 1
    @DavidRönnqvist Yes. Emoji do not belong to the "BMP" (code points U+0000 to U+FFFF) and are encoded using surrogate pairs. The resulting UTF-16 code fragments are in the ranges 0xD800 to 0xDBFF and 0xDC00 to 0xDFFF. These ranges do not clash with ASCII, so the above code should not be affected. – Nikolai Ruhe Aug 14 '13 at 11:47
  • OP never specified that the string was restricted to ASCII. You are right that the BMP (and therefore ASCII) characters can be represented as one UTF-16 code unit, and that the 2-byte parts of the surrogate pairs outside this will start with (IIRC) b110110 and b110111 and so not clash with the ASCII representations. I was thinking more along the lines of compositional characters. – robbie_c Aug 14 '13 at 11:59
  • @robbie_c The string is not restricted to ASCII (yet the digits are). What kinds of compositional characters would cause trouble? – Nikolai Ruhe Aug 14 '13 at 12:00
  • Hmm I retract my initial comment :) – robbie_c Aug 15 '13 at 09:02
0

Use the following code to check whether the string contains 0-9:

NSUInteger count = 0, length = [yourString length];
NSRange range = NSMakeRange(0, length);
while (range.location != NSNotFound)
{
    range = [yourString rangeOfString:@"hello" options:0 range:range];
    if (range.location != NSNotFound)
    {
        range = NSMakeRange(range.location + range.length, length - (range.location + range.length));
        count++;
    }
}
Divya Bhaloidiya
  • 3
    Won't that match 6 times for the string? "123456" even though each digit only appears once? – David Rönnqvist Aug 14 '13 at 10:35
  • Thanks Divya, but I believe @DavidRönnqvist is right. I am not looking for the number of digits in a string but whether any of the digits appear more than 5 times in that string – Teddy13 Aug 14 '13 at 10:37
  • @DivyaBhalodiya You're still misunderstanding the question. It's about counting individual digits, not numbers. – Nikolai Ruhe Aug 14 '13 at 10:51
-1
NSString *materialnumber = [[self.documentItemsArray objectAtIndex:indexPath.row] getMATERIAL_NO];
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"SELF MATCHES '[0-9*]+'"];
if ([predicate evaluateWithObject:materialnumber])
{
    materialnumber = [NSString stringWithFormat:@"%d", [materialnumber intValue]];
}
NHS
  • Thanks for the quick reply! I am not familiar with the NSPredicate class, but I will look into it to learn more. Can you tell me if "SELF MATCHES" is supposed to be that way? And I assume materialnumber is the string that I am checking against? Thanks so much! – Teddy13 Aug 14 '13 at 10:26
  • 3
    I don't think that regex does what the OP is asking for. Wouldn't that match for 123456, even though that is fine in the OP's case? – David Rönnqvist Aug 14 '13 at 10:32
  • I have tested the code and @DavidRönnqvist is correct. I am looking to see if any digit is contained more than 5 times within a string, not the total number of digits within a string. Thank you for your attempt NHS, much appreciated – Teddy13 Aug 14 '13 at 10:40