
I'm loading a text file whose encoding is unknown because it comes from other sources. The content arrives through NSDocument's read method on macOS, which passes it to my model's read method. The String initializer that takes Data requires an encoding, and if you pick the wrong one you may get nil. I've written a conditional cascade of candidate encodings (which seems to be what other people do), but there has to be a better way. Suggestions?

    override func read(from data: Data, ofType typeName: String) throws {
        model.read(from: data, ofType: typeName)
    }

In the model:

    func read(from data: Data, ofType typeName: String) {
        if let text = String(data: data, encoding: .utf8) {
            content = text
        } else if let text = String(data: data, encoding: .macOSRoman) {
            content = text
        } else if let text = String(data: data, encoding: .ascii) {
            content = text
        } else {
            content = "?????"
        }
    }
RobMac
  • If your text is coming from the web you can check this post https://stackoverflow.com/a/34687962/2303865 – Leo Dabus Jan 21 '20 at 01:31
  • Thanks @LeoDabus for the suggestion, unfortunately it does not come from the web. It's a regular text file on the file system, hence NSDocument. – RobMac Jan 21 '20 at 14:02
  • There is a static method on NSString to guess the encoding – Leo Dabus Jan 21 '20 at 14:03
  • @LeoDabus unless I'm missing something, NSString also requires the encoding to be specified. Do you have a link to the documentation on the specific factory or constructor method? – RobMac Jan 21 '20 at 14:17

1 Answer


You can extend Data with a stringEncoding property that asks NSString's static stringEncoding(for:encodingOptions:convertedString:usedLossyConversion:) method to guess the encoding:

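    import Foundation

    extension Data {
        /// The string encoding Foundation guesses for these bytes, or nil if none could be determined.
        var stringEncoding: String.Encoding? {
            // No converted string or lossy-conversion flag is needed here, so both out-parameters are nil.
            let rawValue = NSString.stringEncoding(for: self, encodingOptions: nil, convertedString: nil, usedLossyConversion: nil)
            // A raw value of 0 means detection failed.
            guard rawValue != 0 else { return nil }
            return String.Encoding(rawValue: rawValue)
        }
    }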

Then unwrap data.stringEncoding and pass it to the String initializer, which takes a non-optional encoding:

    if let encoding = data.stringEncoding,
       let string = String(data: data, encoding: encoding) {
        print(string)
    }
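
The same detection call can also hand back the string it decoded and report whether the conversion was lossy, which avoids decoding the data a second time. The following is only a sketch under a few assumptions: it targets Foundation on Apple platforms, and the `DecodedText` type and `decodedText()` method are hypothetical names introduced for illustration, not part of Foundation.

```swift
import Foundation

// Hypothetical helper type (not part of Foundation) bundling the detection result.
struct DecodedText {
    let string: String
    let encoding: String.Encoding
    let wasLossy: Bool
}

extension Data {
    // Sketch: let Foundation guess the encoding and return the string it converted.
    func decodedText() -> DecodedText? {
        var converted: NSString?
        var lossy: ObjCBool = false
        let rawValue = NSString.stringEncoding(for: self,
                                               encodingOptions: nil,
                                               convertedString: &converted,
                                               usedLossyConversion: &lossy)
        // A raw value of 0 means no encoding could be determined.
        guard rawValue != 0, let nsString = converted else { return nil }
        return DecodedText(string: nsString as String,
                           encoding: String.Encoding(rawValue: rawValue),
                           wasLossy: lossy.boolValue)
    }
}
```

In the model's read method this could replace the cascade with something like `content = data.decodedText()?.string ?? "?????"`. Checking `wasLossy` lets the caller decide whether to warn that some characters may have been substituted.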
Leo Dabus
  • I accepted this answer because it is the right way to do it according to the documentation, but it somehow keeps coming back with incorrect encodings, so the issue is probably a bug in Apple's implementation. – RobMac Feb 13 '20 at 15:16
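
When the guess comes back wrong, as the comment above describes, one thing that may be worth trying is restricting the detector to encodings you actually expect. The sketch below uses the `suggestedEncodingsKey`, `useOnlySuggestedEncodingsKey`, and `allowLossyKey` detection options from Foundation. The method name `stringEncoding(among:)` is a hypothetical helper, and whether this improves the results for any particular set of files is an assumption that would need testing.

```swift
import Foundation

extension Data {
    // Sketch: limit encoding detection to a caller-supplied candidate list.
    func stringEncoding(among candidates: [String.Encoding]) -> String.Encoding? {
        let options: [StringEncodingDetectionOptionsKey: Any] = [
            // Candidate encodings, passed as their raw NSStringEncoding values.
            .suggestedEncodingsKey: candidates.map { NSNumber(value: $0.rawValue) },
            // Do not consider encodings outside the candidate list.
            .useOnlySuggestedEncodingsKey: true,
            // Reject guesses that would require lossy conversion.
            .allowLossyKey: false
        ]
        let rawValue = NSString.stringEncoding(for: self,
                                               encodingOptions: options,
                                               convertedString: nil,
                                               usedLossyConversion: nil)
        // A raw value of 0 means no candidate was accepted.
        return rawValue == 0 ? nil : String.Encoding(rawValue: rawValue)
    }
}
```

A call might look like `data.stringEncoding(among: [.utf8, .macOSRoman, .windowsCP1252])`. The candidate list still has to come from what you know about where the files originate, so this narrows the guess rather than removing the guesswork.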