6

I am making an app in Swift and I need to capture 8 digits from a string. Here's the string: index.php?page=index&l=99182677

My pattern is: &l=(\d{8,})

And here's my code:

var yourAccountNumber = "index.php?page=index&l=99182677"
let regex = try! NSRegularExpression(pattern: "&l=(\\d{8,})", options: NSRegularExpressionOptions.CaseInsensitive)
let range = NSMakeRange(0, yourAccountNumber.characters.count)
let match = regex.matchesInString(yourAccountNumber, options: NSMatchingOptions.Anchored, range: range)

Firstly, I don't understand what the NSMatchingOptions values mean. Apple's official documentation doesn't make .Anchored, .ReportProgress, etc. clear to me. Could anyone enlighten me on this?

Then, when I print(match), the variable turns out to be empty ([]).

I am using Xcode 7 Beta 3, with Swift 2.0.

nhgrif
Anthony

4 Answers

8

ORIGINAL ANSWER

Here is a function you can leverage to get captured group texts:

import Foundation

extension String {
    func firstMatchIn(string: NSString!, atRangeIndex: Int!) -> String {
        var error : NSError?
        let re = NSRegularExpression(pattern: self, options: .CaseInsensitive, error: &error)
        let match = re.firstMatchInString(string, options: .WithoutAnchoringBounds, range: NSMakeRange(0, string.length))
        return string.substringWithRange(match.rangeAtIndex(atRangeIndex))
    }
}

And then:

var result = "&l=(\\d{8,})".firstMatchIn(yourAccountNumber, atRangeIndex: 1)

The 1 in atRangeIndex: 1 extracts the text captured by the (\d{8,}) capture group (index 0 would be the whole match).

NOTE1: If you plan to extract 8, and only 8 digits after &l=, you do not need the , in the limiting quantifier, as {8,} means 8 or more. Change to {8} if you plan to capture just 8 digits.
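To make the difference between {8} and {8,} concrete, here is a minimal sketch. It is written against current Swift and Foundation naming rather than the Swift 2 names used elsewhere in this thread, and both the helper `firstGroup` and the 10-digit sample value are made up for illustration.

```swift
import Foundation

// Hypothetical helper: returns the text of capture group 1 from the first match,
// or nil if the pattern is invalid or nothing matches.
// The pattern is assumed to contain at least one capture group.
func firstGroup(_ pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []),
          let match = regex.firstMatch(in: text, options: [],
                                       range: NSRange(text.startIndex..., in: text)),
          let range = Range(match.range(at: 1), in: text) else {
        return nil
    }
    return String(text[range])
}

// Made-up input with 10 digits after "&l=".
let longer = "index.php?page=index&l=9918267712"

let exactlyEight = firstGroup("&l=(\\d{8})", in: longer)   // stops after 8 digits
let eightOrMore = firstGroup("&l=(\\d{8,})", in: longer)   // greedy: takes all the digits

print(exactlyEight ?? "no match")
print(eightOrMore ?? "no match")
```

With {8} the capture is cut off after the eighth digit, so if longer account numbers may appear later, {8,} is the safer choice.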

NOTE2: Avoid NSMatchingAnchored if your expected match is not at the beginning of the search range. From the documentation:

Specifies that matches are limited to those at the start of the search range.
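The effect of the anchored option on the string from the question can be checked with a short sketch. It assumes current Swift and Foundation naming (`.anchored`, `matches(in:options:range:)`) rather than the Swift 2 spellings used in the question.

```swift
import Foundation

let yourAccountNumber = "index.php?page=index&l=99182677"

// try! is tolerable here only because the pattern is a known-good literal.
let regex = try! NSRegularExpression(pattern: "&l=(\\d{8,})", options: [])

// NSRange counts UTF-16 code units, so derive it from the string's own indices.
let fullRange = NSRange(yourAccountNumber.startIndex..., in: yourAccountNumber)

// .anchored: a match is only accepted if it begins at the start of the search range.
// The string starts with "index.php", not "&l=", so nothing is found.
let anchored = regex.matches(in: yourAccountNumber, options: .anchored, range: fullRange)

// No options: the engine scans forward through the range and finds "&l=99182677".
let unanchored = regex.matches(in: yourAccountNumber, options: [], range: fullRange)

print(anchored.count)    // 0
print(unanchored.count)  // 1
```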

NOTE3: Speaking of "simplest" things, I'd advise avoiding look-arounds whenever you do not need them. Look-arounds usually come at some cost to performance, and if you are not going to capture overlapping text, I'd recommend using capture groups.

UPDATE FOR SWIFT 2

I have come up with a function that will return all matches with all capturing groups (similar to preg_match_all in PHP). Here is a way to use it for your scenario:

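import Foundation

func regMatchGroup(pattern: String, text: String) -> [[String]] {
    do {
        var resultsFinal = [[String]]()
        let regex = try NSRegularExpression(pattern: pattern, options: [])
        // Work on an NSString so that NSRange offsets (UTF-16 units) line up.
        let nsString = text as NSString
        let results = regex.matchesInString(text,
            options: [], range: NSMakeRange(0, nsString.length))
        for result in results {
            // Index 0 is the whole match; indices 1... are the capture groups.
            var internalString = [String]()
            for i in 0..<result.numberOfRanges {
                let range = result.rangeAtIndex(i)
                if range.location == NSNotFound {
                    // The group did not participate in this match.
                    internalString.append("")
                } else {
                    internalString.append(nsString.substringWithRange(range))
                }
            }
            resultsFinal.append(internalString)
        }
        return resultsFinal
    } catch let error as NSError {
        print("invalid regex: \(error.localizedDescription)")
        // Return no matches, so a caller checking count > 0 never indexes into an empty row.
        return []
    }
}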
// USAGE:
let yourAccountNumber = "index.php?page=index&l=99182677"
let matches = regMatchGroup("&l=(\\d{8,})", text: yourAccountNumber)
if (matches.count > 0) // If we have matches....
{ 
    print(matches[0][1]) //  Print the first one, Group 1.
}
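For readers on a newer toolchain, the same "all matches, all groups" idea might look roughly like the sketch below. It assumes current Swift and Foundation naming, and the function name `allMatchGroups` is made up for illustration. It is not a drop-in replacement for the Swift 2 function above.

```swift
import Foundation

// Hypothetical modern counterpart of regMatchGroup: one inner array per match,
// holding the whole match at index 0 followed by each capture group.
func allMatchGroups(pattern: String, in text: String) -> [[String]] {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return []   // invalid pattern: report no matches
    }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, options: [], range: fullRange).map { match in
        (0..<match.numberOfRanges).map { index in
            // Range(_:in:) yields nil for a group that did not participate in the match.
            Range(match.range(at: index), in: text).map { String(text[$0]) } ?? ""
        }
    }
}

let groups = allMatchGroups(pattern: "&l=(\\d{8,})",
                            in: "index.php?page=index&l=99182677")
if let first = groups.first, first.count > 1 {
    print(first[1])   // group 1 of the first match
}
```

Returning an empty array for an invalid pattern keeps the `groups.first` check safe, at the cost of not telling the caller why nothing matched.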
Wiktor Stribiżew
  • Is there a good reason to make this an extension and not just a method? – zaph Jul 12 '15 at 14:56
  • @zaph: It only depends on your style, I think :) – Wiktor Stribiżew Jul 12 '15 at 14:58
  • OK, my style is different, it is to make things as simple as possible. ;-) Kent Beck: “Do The Simplest Thing That Could Possibly Work” . – zaph Jul 12 '15 at 15:01
  • 1
    That one is interesting too. I don't know which one to choose now. But I guess it depends on my style. :) Thanks ! – Anthony Jul 12 '15 at 15:15
  • 3
    @zaph you seem to produce this quote whenever someone uses an extension method. I fail to see the link between free functions (as opposed to extensions) and simplicity. Are you saying Beck doesn’t advocate methods, only free functions that take an explicit `this`? – Airspeed Velocity Jul 12 '15 at 15:15
  • The link is additional code. IMO there should be a benefit for extra code. In some cases there are useful benefits. I do believe that making a method also an extension does not make it simpler. It does make a future developer wonder why it was made an extension. IMO there should be a need to make it an extension. – zaph Jul 12 '15 at 15:24
  • About the `{8,}` I did it on purpose because more digits can be present in the future. About `NOTE3`, what do you mean by look-arounds? – Anthony Jul 12 '15 at 16:30
  • 1
    [Look-arounds](http://www.regular-expressions.info/lookaround.html) are "zero-width assertions"... Or just look-aheads or look-behinds. – Wiktor Stribiżew Jul 12 '15 at 16:33
  • 3
    Extensions are not "more complicated" than top level functions. The use case is that you can simply type a dot on a value and scroll through all the methods and extensions for its type. Therefore your extension is more discoverable and more likely to be used. If I had a top-level function whose primary purpose was to work on one piece of data passed in, I would likely move it into an extension on that data type. Also, the ability to chain extensions as you would methods often makes for more readable code. – Rikki Gibson Jul 12 '15 at 17:54
1

It may be easier just to use the NSString method instead of NSRegularExpression.

var yourAccountNumber = "index.php?page=index&l=99182677"
println(yourAccountNumber) // index.php?page=index&l=99182677

let regexString = "(?<=&l=)\\d{8,}+"
let options :NSStringCompareOptions = .RegularExpressionSearch | .CaseInsensitiveSearch
if let range = yourAccountNumber.rangeOfString(regexString, options:options) {
    let digits = yourAccountNumber.substringWithRange(range)
    println("digits: \(digits)")
}
else {
    print("Match not found")
}

The (?<=&l=) is a look-behind: the digits must be preceded by &l=, but &l= itself is not part of the match.

In detail:

Look-behind assertion. True if the parenthesized pattern matches text preceding the current input position, with the last character of the match being the input character just before the current position. Does not alter the input position. The length of possible strings matched by the look-behind pattern must not be unbounded (no * or + operators.)

In general, worrying about the performance of a look-behind without measurements to back it up is just premature optimization. That being said, there may be other valid reasons for and against look-arounds in regular expressions.
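As an illustration of "precedes but is not part of", the sketch below compares the look-behind pattern with the same pattern without it. It assumes current Swift naming (`range(of:options:)`, `.regularExpression`) rather than the older spellings in the answer's code, and the variable names are made up for the example.

```swift
import Foundation

let yourAccountNumber = "index.php?page=index&l=99182677"
let searchOptions: String.CompareOptions = [.regularExpression, .caseInsensitive]

// Look-behind: "&l=" must come right before the digits but is excluded from the match.
let viaLookBehind = yourAccountNumber
    .range(of: "(?<=&l=)\\d{8,}", options: searchOptions)
    .map { String(yourAccountNumber[$0]) }

// Without the look-behind, the prefix is part of the matched range.
let withPrefix = yourAccountNumber
    .range(of: "&l=\\d{8,}", options: searchOptions)
    .map { String(yourAccountNumber[$0]) }

print(viaLookBehind ?? "no match")   // 99182677
print(withPrefix ?? "no match")      // &l=99182677
```

Because `range(of:options:)` returns only the range of the whole match, the look-behind is what lets this approach skip the prefix without a capture group.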

ICU User Guide: Regular Expressions

zaph
  • Works well, thanks. However in Swift 2.0, `println()` has been replaced by `print()`. Thanks ! – Anthony Jul 12 '15 at 15:11
  • `let options :NSStringCompareOptions = .RegularExpressionSearch | .CaseInsensitiveSearch` would also be `let options: NSStringCompareOptions = [.RegularExpressionSearch, .CaseInsensitiveSearch]` in 2.0 – Airspeed Velocity Jul 12 '15 at 15:37
  • I’d also suggest `let digits = range.map { yourAccountNumber[$0] }`, rather than the force-unwrap, and then some unwrapping technique depending on the use case – Airspeed Velocity Jul 12 '15 at 15:39
  • Unfortunately Xcode beta 3 is crashing for me so I can't use Swift 2.0. Yes, force unwrapping is really bad, I used it only in the sample code, I have updated the answer. – zaph Jul 12 '15 at 15:49
1

For Swift 2, you can use this extension of String:

import Foundation

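extension String {
    // Treats self as the pattern and returns the text of capture group `atRangeIndex`
    // from the first match in `string`, or "" if the pattern is invalid or nothing matches.
    func firstMatchIn(string: NSString!, atRangeIndex: Int!) -> String {
        do {
            let re = try NSRegularExpression(pattern: self, options: NSRegularExpressionOptions.CaseInsensitive)
            // firstMatchInString returns nil when there is no match, so unwrap it safely.
            guard let match = re.firstMatchInString(string as String, options: .WithoutAnchoringBounds, range: NSMakeRange(0, string.length)) else {
                return ""
            }
            return string.substringWithRange(match.rangeAtIndex(atRangeIndex))
        } catch {
            return ""
        }
    }
}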

You can get the account number with:

var result = "&l=(\\d{8,})".firstMatchIn(yourAccountNumber, atRangeIndex: 1)
FelixSFD
0

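Replace NSMatchingOptions.Anchored with NSMatchingOptions() (no options). With .Anchored, a match is only reported if it begins at the start of the search range, and your string does not start with &l=.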

vadian