4

I'm trying to have my NSScanner attempt to scan the following regexp: [a-zA-Z_][a-zA-Z0-9_]*, but am having difficulty.

I can try to read [a-zA-Z_] first, then try to append [a-zA-Z0-9_].

I'm wondering if there is an easier / more efficient way of doing this. Please let me know, thanks.


Clarification: I'm not trying to execute a regular expression. I'm just trying to read a string that looks like the above regexp. Something that looks similar to C-style variables. Basically, any alphanumeric word, but must not start with a number.


Clarification 2: I'm trying to have the scanner read the following ([] indicates the tokens read):

  • "test 3" as [test, 3]
  • "test3" as [test3]
  • "3test" as [3, test]
  • "_3test" as [_3test]
  • "_3 test" as [_3, test]
  • "_ 3 3test" as [_, 3, 3, test]
  • "_ 3 test3" as [_, 3, test3]

AWF4vk
  • Please post your code. Also, why are you trying to scan a regular expression string? There may be a better way to do what you're trying to do. – Rob Keniger Jul 06 '11 at 05:41
  • Are you trying to parse a regular expression, or do the same thing the regular expression would do? – Peter Hosey Jul 06 '11 at 08:02
  • @Peter: I'm just trying to accomplish scanning (same thing the regular expression would do); figured the regular expression would explain what I'm trying to scan better. – AWF4vk Jul 06 '11 at 12:33

2 Answers

3

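You'll need two character sets:

  • one for the first character: letters plus the underscore (for example, letterCharacterSet with "_" added)
  • one for the remaining characters: letters, digits, and the underscore (for example, alphanumericCharacterSet with "_" added)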

Using these will enable you to match all letters and numbers in Unicode, not just the English alphabet and digit set. If you really do want only those much-smaller sets, they're easy enough to construct using characterSetWithCharactersInString: and/or characterSetWithRange:. If you use the latter method, you'll need to make an NSMutableCharacterSet and union another character set into it.
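
If only the ASCII sets from the regexp are wanted, they could be built roughly as follows. This is a minimal sketch: the helper names IdentifierStartSet and IdentifierBodySet are made up for illustration, while the NSMutableCharacterSet calls are standard Foundation methods.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: [a-zA-Z_], the characters allowed in first position.
static NSCharacterSet *IdentifierStartSet(void) {
    NSMutableCharacterSet *set =
        [NSMutableCharacterSet characterSetWithRange:NSMakeRange('a', 26)];
    [set addCharactersInRange:NSMakeRange('A', 26)];
    [set addCharactersInString:@"_"];
    return set;
}

// Hypothetical helper: [a-zA-Z0-9_], the characters allowed after the first.
static NSCharacterSet *IdentifierBodySet(void) {
    NSMutableCharacterSet *set =
        [NSMutableCharacterSet characterSetWithRange:NSMakeRange('0', 10)];
    // Union the start set into the digit set, as described above.
    [set formUnionWithCharacterSet:IdentifierStartSet()];
    return set;
}
```

Starting from a mutable set built with characterSetWithRange: lets the other ranges be added directly, without creating a separate immutable set first.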

Once you have your character sets, it's easy to scan only the characters within one set, then those within the other, and then, if you want, concatenate one string onto the other.
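
A minimal sketch of that two-step scan, using the Unicode-wide sets. ScanIdentifier is a hypothetical helper, not an NSScanner method. It assumes skipping should be turned off for the duration of the scan, so that whitespace between the two steps cannot be swallowed and "test 3" is not read as "test3".

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: scans one identifier at the scanner's current
// location. Returns nil if the next character cannot start an identifier.
static NSString *ScanIdentifier(NSScanner *scanner) {
    // First character: letters plus underscore.
    NSMutableCharacterSet *startSet = [NSMutableCharacterSet letterCharacterSet];
    [startSet addCharactersInString:@"_"];
    // Remaining characters: letters, digits, and underscore.
    NSMutableCharacterSet *bodySet = [NSMutableCharacterSet alphanumericCharacterSet];
    [bodySet addCharactersInString:@"_"];

    // Disable skipping while scanning, then restore the caller's setting.
    NSCharacterSet *savedSkip = scanner.charactersToBeSkipped;
    scanner.charactersToBeSkipped = nil;

    NSString *head = nil;
    NSString *tail = nil;
    NSString *result = nil;
    if ([scanner scanCharactersFromSet:startSet intoString:&head]) {
        // The tail is optional; tail stays nil if nothing matches.
        [scanner scanCharactersFromSet:bodySet intoString:&tail];
        result = tail ? [head stringByAppendingString:tail] : head;
    }

    scanner.charactersToBeSkipped = savedSkip;
    return result;
}
```

With a scanner over "test 3", the helper should return "test" and stop at the space. The caller can then skip the whitespace and read the number with scanInteger: or a similar method. With a scanner over "3test" it returns nil, so the caller can fall back to scanning a number first.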

Peter Hosey
  • This won't solve my problem. Take, for instance the following string: "this 3 will fail". According to your method there is no way to determine whether "this 3" is equal to "this3". I will chomp on the space without knowing whether I'll be getting an integer, and whether it is separated by a space (or other skipped character). – AWF4vk Jul 06 '11 at 15:25
  • @David: That has nothing to do with the question you asked, which did not include any spaces in the regular expression. Nonetheless, you can do that with NSScanner, too, with different character sets. Don't forget to set the scanner's skip character set to the empty character set if you don't want it to automatically eat spaces. – Peter Hosey Jul 06 '11 at 16:51
  • The point is I want it to function like that regular expression. I do not want it to skip white space, but I also don't want it to read "test 3" if there's a space between the "test" and "3". However, with your explanation, it will do just that. – AWF4vk Jul 06 '11 at 17:13
1

It depends on what you want to do. If you want to use regular expressions, I have heard about this framework: RegexKit.

It lets you apply regular expressions more easily to your strings, arrays, dictionaries, etc.

jlink