I'm trying to parse a Localizable.strings file for a small project in Swift on macOS. I just want to retrieve all the keys and values in the file and store them in a dictionary.

To do so, I used a regex with the NSRegularExpression Cocoa class.

Here is what such a file looks like:

"key 1" = "Value 1";
"key 2" = "Value 2";
"key 3" = "Value 3";

Here is my code, which is supposed to get the keys and values from the file contents loaded into a String:

static func getDictionaryFormText(text: String) -> [String: String] {
    var dict: [String : String] = [:]
    let exp = "\"(.*)\"[ ]*=[ ]*\"(.*)\";"

    for line in text.components(separatedBy: "\n") {
        let match = self.matches(for: exp, in: line)
        // Following line can be uncommented when working
        //dict[match[0]] = match[1]
        print("(\(match.count)) matches = \(match)")
    }
    return dict
}

static func matches(for regex: String, in text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex)
        let nsString = text as NSString
        let results = regex.matches(in: text, range: NSRange(location: 0, length: nsString.length))
        return results.map { nsString.substring(with: $0.range) }
    } catch let error as NSError {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

When I run this code with the Localizable example above, here is the output:

(1) matches = ["\"key 1\" = \"Value 1\";"]
(1) matches = ["\"key 2\" = \"Value 2\";"]
(1) matches = ["\"key 3\" = \"Value 3\";"]

It looks like the match doesn't stop after the first " occurrence. When I try the same expression \"(.*)\"[ ]*=[ ]*\"(.*)\"; on regex101.com, the output is correct though. What am I doing wrong?

Edgar
  • Your output seems to be showing the right result. Maybe you are misunderstanding the specification of `NSRegularExpression` or `NSTextCheckingResult`. What output do you expect with your code for that input text? – OOPer Sep 26 '16 at 14:51
  • [*Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.* - Jamie Zawinski](http://regex.info/blog/2006-09-15/247) –  Sep 26 '16 at 15:14
  • You're right, I actually fixed this problem after posting. I was only accessing the first result because I misunderstood how `NSTextCheckingResult` is designed. I'll update my question. – Edgar Sep 26 '16 at 15:21

3 Answers

Your function (from "Swift extract regex matches") returns only the text matched by the entire pattern. If you are interested in the individual capture groups, you have to access them with rangeAt() (renamed range(at:) in Swift 4), as for example in "Convert a JavaScript Regex to a Swift Regex" (not yet updated for Swift 3).
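As a rough illustration of the capture-group approach, the sketch below runs the regex over the whole text and reads groups 1 and 2 of each match. It assumes Swift 4 or later (`range(at:)`, `NSRange(_:in:)`); the function name `keyValuePairs(in:)` and the lazy `.*?` quantifiers are my own choices, not part of the question's code, and escaped quotes inside keys or values are not handled.

```swift
import Foundation

// Hypothetical helper: collects key/value pairs from .strings-style text.
func keyValuePairs(in text: String) -> [String: String] {
    // Group 1 is the key, group 2 is the value. Lazy quantifiers stop at the
    // first closing quote instead of the last one on the line.
    let pattern = "\"(.*?)\"\\s*=\\s*\"(.*?)\";"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [:] }

    var result: [String: String] = [:]
    let fullRange = NSRange(text.startIndex..., in: text)
    for match in regex.matches(in: text, range: fullRange) {
        // range(at: 0) would be the whole match; 1 and 2 are the capture groups.
        guard let keyRange = Range(match.range(at: 1), in: text),
              let valueRange = Range(match.range(at: 2), in: text) else { continue }
        result[String(text[keyRange])] = String(text[valueRange])
    }
    return result
}

let sample = "\"key 1\" = \"Value 1\";\n\"key 2\" = \"Value 2\";"
let pairs = keyValuePairs(in: sample)
print(pairs)
```

Because `.` does not match line breaks by default, the text does not need to be split into lines first.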

However, there is a much simpler solution: .strings files actually use one of the possible property list formats and can be read directly into a dictionary. Example:

if let url = Bundle.main.url(forResource: "Localizable", withExtension: "strings"),
    let stringsDict = NSDictionary(contentsOf: url) as? [String: String] {
    print(stringsDict)
}

Output:

["key 1": "Value 1", "key 2": "Value 2", "key 3": "Value 3"]
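If the file contents are already in memory as a String, the same property-list parsing can probably be reached through `PropertyListSerialization`. This is only a sketch: `parseStrings(_:)` is a hypothetical name, the sample text is made up, and it assumes Foundation on an Apple platform, where the old-style (OpenStep) format can be read. I have not checked the behaviour on other platforms.

```swift
import Foundation

// Hypothetical helper: parses .strings-style text as an old-style property list.
func parseStrings(_ text: String) -> [String: String] {
    guard let data = text.data(using: .utf8),
          let plist = try? PropertyListSerialization.propertyList(from: data,
                                                                  options: [],
                                                                  format: nil),
          let dict = plist as? [String: String] else {
        return [:] // not parseable as a [String: String] property list
    }
    return dict
}

// Made-up sample content; comments are skipped by the property-list parser.
let sample = """
/* Greeting shown on launch */
"key 1" = "Value 1";
"key 2" = "Value 2";
"""

let parsed = parseStrings(sample)
print(parsed)
```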
Martin R
  • Thanks Martin, it's fixed. I'll post my code in case it helps other folks. I didn't know about the NSDictionary solution though, that's awesome ! – Edgar Sep 26 '16 at 15:25
  • Wow, it's actually awesome that it works like that. Sadly, I haven't found a way to preserve the `context` (the comments above the lines) using this :/ – Jakub Nov 07 '19 at 14:36
  • @Jakub: Yes, the comments are ignored when reading the file as a property list. I am not aware of a workaround (other than parsing the file “manually”). – Martin R Nov 07 '19 at 14:39

For anyone interested, I got the original function working. I needed it for a small command-line script where NSDictionary(contentsOf: URL) wasn't working.

import Foundation

// Returns the capture groups (not the full match) of the first match in `text`.
func matches(for regex: String, in text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex)
        let nsString = text as NSString
        guard let result = regex.firstMatch(in: text, options: [], range: NSRange(location: 0, length: nsString.length)) else {
            return [] // pattern does not match the string
        }
        return (1 ..< result.numberOfRanges).map {
            nsString.substring(with: result.range(at: $0))
        }
    } catch let error as NSError {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

func getParsedText(text: String) -> [(key: String, text: String)] {
    var dict: [(key: String, text: String)] = []
    let exp = "\"(.*)\"[ ]*=[ ]*\"(.*)\";"

    for line in text.components(separatedBy: "\n") {
        let match = matches(for: exp, in: line)
        if match.count == 2 {
            dict.append((key: match[0], text: match[1]))
        }
    }
    return dict
}

Call it using something like this:

let text = try! String(contentsOf: url, encoding: .utf8)

let stringDict = getParsedText(text: text)
thecoolwinter

Parsing directly into a dictionary is a really nice solution, but if you also want to parse the comments, you can use a small library I made for this, csv2strings.

import libcsv2strings

// Optional chaining yields an optional, so the result is declared as StringsFile?
let contents: StringsFile? = StringsFileParser(stringsFilePath: "path/to/Localizable.strings")?.parse()

It parses the file into a StringsFile model:

/// Top level model of a Apple's strings file
public struct StringsFile {
    let entries: [Translation]

    /// Model of a strings file translation item
    public struct Translation {
        let translationKey: String
        let translation: String
        let comment: String?
    }
}
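For readers who would rather not add a dependency, a similar shape can be approximated with NSRegularExpression. The sketch below is not how csv2strings works internally; `Entry` and `parseEntries(in:)` are hypothetical names. It assumes Swift 5 (raw string literals), `/* ... */` comments placed directly above an entry, and no escaped quotes inside keys or values.

```swift
import Foundation

// Hypothetical model, loosely mirroring key / value / optional comment.
struct Entry {
    let key: String
    let value: String
    let comment: String?
}

func parseEntries(in text: String) -> [Entry] {
    // Optional group 1: a /* ... */ comment directly before the entry
    // (the lookahead keeps it from running across a closing "*/").
    // Group 2: the key. Group 3: the value.
    let pattern = #"(?:/\*\s*((?:(?!\*/)[\s\S])*?)\s*\*/\s*)?"([^"]*)"\s*=\s*"([^"]*)";"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }

    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match -> Entry? in
        guard let keyRange = Range(match.range(at: 2), in: text),
              let valueRange = Range(match.range(at: 3), in: text) else { return nil }
        // Group 1 is absent (NSNotFound) when the entry has no comment.
        let comment = Range(match.range(at: 1), in: text).map { String(text[$0]) }
        return Entry(key: String(text[keyRange]),
                     value: String(text[valueRange]),
                     comment: comment)
    }
}

// Made-up sample content for illustration.
let sample = """
/* Title */
"key 1" = "Value 1";
"key 2" = "Value 2";
"""

let entries = parseEntries(in: sample)
for entry in entries {
    print(entry.key, entry.value, entry.comment ?? "(no comment)")
}
```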
Christos Koninis