I have this text from which I need to extract only the digits on the first line:

+1-415-655-0001 US TOLL

Access code: 197 703 792

The regex I have, \d+, just extracts all the digits:

var noteStr = "1-415-655-0001 US TOLL\n\nAccess code: 197 703 792"

findCodeInWord()

func findCodeInWord() -> String?
{
    let regex = try! NSRegularExpression(pattern: "\\d+", options: [])

    var items = [String]()
    // NSRange counts UTF-16 code units, so measure the string with utf16.count
    regex.enumerateMatchesInString(noteStr, options: [], range: NSMakeRange(0, noteStr.utf16.count)) { result, flag, stop in
        guard let match = result else {
            // result is nil
            return
        }

        let range = match.rangeAtIndex(0)
        let matchStr = (noteStr as NSString).substringWithRange(range)
        items.append(matchStr) // collect each match so the joined result is not empty
    }
    }
    return items.joinWithSeparator("")

}

But this returns all the digits. I only want it to return 14156550001.

Tony
  • Use `[^0-9]+` as your regex and the replacement string as empty. I don't know Swift but apparently there's a function for that. See http://stackoverflow.com/questions/28503449/swift-replace-substring-regex – shawnt00 Apr 11 '16 at 20:19
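
The replacement idea from the comment above can be sketched as follows. This is a minimal sketch, assuming Swift 5 and Foundation; the helper name `digitsByStrippingNonDigits` is hypothetical, and restricting the replacement to the first line is an addition here, since the comment itself only describes removing non-digits.

```swift
import Foundation

// Hypothetical helper: keeps only the ASCII digits found on the first line.
func digitsByStrippingNonDigits(fromFirstLineOf text: String) -> String {
    // Take everything up to the first line break (assumption: the number is on that line).
    let firstLine = text.components(separatedBy: .newlines).first ?? ""
    // Replace every run of non-digit characters with the empty string.
    return firstLine.replacingOccurrences(of: "[^0-9]+",
                                          with: "",
                                          options: .regularExpression)
}

let commentSample = "1-415-655-0001 US TOLL\n\nAccess code: 197 703 792"
print(digitsByStrippingNonDigits(fromFirstLineOf: commentSample))
```

Splitting off the first line before replacing keeps the regex itself trivial, at the cost of a second pass over the string.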

2 Answers

"I have this text in which I need to extract only the digits on the first line"

While regex is very useful at times, for a task as simple as extracting only the digit characters from a given string, Swift's native pattern matching is a useful tool. It is appropriate here because the UnicodeScalar values of the digits 0 through 9 are consecutive:

var noteStr = "1-415-655-0001 US TOLL\n\nAccess code: 197 703 792"

/* since we're using native pattern matching, let's use a native method
   also when extracting the first row (even if this is somewhat simpler
   if using Foundation bridged NSString methods)                            */
if let firstRowChars = noteStr.characters.split("\n").first,
    case let firstRow = String(firstRowChars) {

    // pattern matching for number characters
    let pattern = UnicodeScalar("0")..."9"
    let numbers = firstRow.unicodeScalars
        .filter { pattern ~= $0 }
        .reduce("") { String($0) + String($1) }

    print(numbers) // 14156550001

    /* Alternatively use .reduce with an inline if clause directly:
    let numbers = firstRow.unicodeScalars
        .reduce("") { pattern ~= $1 ? String($0) + String($1) : String($0)} */
}
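
For readers on a newer toolchain, the same filtering idea might look roughly like this. It is a sketch assuming Swift 5 naming (`split(separator:)`, `Unicode.Scalar`), not the code from the answer above, and the helper name `asciiDigitsOnFirstLine` is hypothetical.

```swift
// Hypothetical helper: same range-based filter, written against Swift 5 names.
func asciiDigitsOnFirstLine(of text: String) -> String {
    // split(separator:) drops empty pieces by default, so `first` is the first non-empty line.
    guard let firstLine = text.split(separator: "\n").first else { return "" }

    // The scalars "0" through "9" are consecutive, so a closed range covers them all.
    let digits: ClosedRange<Unicode.Scalar> = "0"..."9"

    // Keep only the scalars inside the range and rebuild a String from them.
    let kept = firstLine.unicodeScalars.filter { digits.contains($0) }
    return String(String.UnicodeScalarView(kept))
}

let filterSample = "1-415-655-0001 US TOLL\n\nAccess code: 197 703 792"
print(asciiDigitsOnFirstLine(of: filterSample))
```

Filtering on unicode scalars rather than on `Character` keeps the range check a plain numeric comparison.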
dfrib

You can extract these numbers with a single regex based on the \G operator, capturing the digits into Group 1:

\G(?:[^\d\n\r]*(\d+))

The regex only captures into Group 1 the digit sequences (one or more digits, with (\d+)) that are on the first line. This works because the \G operator matches at the beginning of the string and then at the end of each successful match, while the [^\d\n\r]* character class matches zero or more characters other than a digit, LF, or CR.

Thus, when it starts matching, 1 is found and captured, then - is matched with [^\d\n\r]*, then 415 is matched and captured, and so on. When \n is encountered, [^\d\n\r]* cannot consume it and (\d+) cannot match it, so the attempt anchored at \G fails. At every later position the \G anchor itself fails, so the whole regex search stops at the first line.

Swift:

let regex = try! NSRegularExpression(pattern: "\\G(?:[^\\d\n\r]*(\\d+))", options: [])
...
let range = match.rangeAtIndex(1)
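
Put together, a complete version of this approach could look like the following. It is a sketch assuming Swift 5 Foundation naming (`matches(in:options:range:)` and `range(at:)` rather than the Swift 2 `rangeAtIndex`); the function name `firstLineDigits` is hypothetical, and the pattern uses the regex escapes \n and \r instead of literal line-break characters.

```swift
import Foundation

// Hypothetical helper: joins the Group 1 captures of the \G-anchored pattern.
func firstLineDigits(in text: String) -> String {
    // \G ties each match to the end of the previous one (or to the start of
    // the string), so matching cannot resume after the first line break.
    let pattern = "\\G[^\\d\\n\\r]*(\\d+)"
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return ""
    }

    // NSRange is measured in UTF-16 units; this initializer does the conversion.
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)

    return regex.matches(in: text, options: [], range: fullRange)
        .compactMap { Range($0.range(at: 1), in: text) } // Group 1 holds the digits
        .map { String(text[$0]) }
        .joined()
}

let regexSample = "1-415-655-0001 US TOLL\n\nAccess code: 197 703 792"
print(firstLineDigits(in: regexSample))
```

Reading Group 1 rather than Group 0 matters here, because Group 0 also includes the separators consumed by [^\d\n\r]*.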
Wiktor Stribiżew
  • Thanks, Wiktor! This was exactly what I was looking for. +1 for explaining it as well. Very helpful! – Tony Apr 12 '16 at 13:32