I'm encountering some weird edge cases when parsing JSON data from the Google search autocomplete API. This is my model for decoding the JSON data:
struct suggestOutputModel: Decodable {
    let query: String
    let suggestions: [String]?
    let thirdValue: [String]?
    let fourthValue: GoogleSuggestSubtypes?

    struct GoogleSuggestSubtypes: Decodable {
        let googlesuggestsubtypes: [[Int]]

        enum CodingKeys: String, CodingKey {
            case googlesuggestsubtypes = "google:suggestsubtypes"
        }
    }

    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        query = try container.decode(String.self)
        suggestions = try container.decode([String]?.self)
        thirdValue = try container.decodeIfPresent([String].self)
        fourthValue = try container.decodeIfPresent(GoogleSuggestSubtypes.self)
    }
}
Most of the time this works. But when I query the API with the single character ( and attempt to parse the response, I get the following error:
Swift.DecodingError.dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Unable to convert data to string around line 1, column 100." UserInfo={NSDebugDescription=Unable to convert data to string around line 1, column 100., NSJSONSerializationErrorIndex=100})))
This is what the JSON response I receive from the API looks like:
["(", ["(x2+y2-1)x2y3\u003d0", "(", "(g)i-dle", "(a-b)^2", "(a+b)^3", "(g)i-dle nxde lyrics", "(a+b)(a-b)", "( ͡° ͜ʖ ͡°)", "(working title) riverside menu", "(working title) burger bar menu"],
[], {
"google:suggestsubtypes": [
[512, 433, 131],
[512, 433],
[512, 433, 131],
[433],
[512],
[512],
[512],
[512],
[512],
[512]
]
}
]
There are some uncommon characters, so maybe that's the issue, though none of them seem incompatible with UTF-8? I also tried running file -I on the .txt file I received from the API when calling it in the browser, and it reports the encoding as UTF-8. In any case, the following test suggests that UTF-8 decoding is indeed where things go wrong:
let utf8Test: String = String(data: data, encoding: .utf8)! // Fatal error: Unexpectedly found nil while unwrapping an Optional value
let latin1Test: String = String(data: data, encoding: .isoLatin1)! // Succeeds, although some characters are represented in the \u format
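To see exactly where UTF-8 decoding breaks, a small diagnostic can report the offset of the first invalid byte sequence. This is only a sketch: firstInvalidUTF8Offset is a hypothetical helper name, not part of my model, and the sample bytes are made up to mimic a degree sign sent as the single Latin-1 byte 0xB0.

```swift
import Foundation

// Hypothetical helper: returns the byte offset of the first invalid UTF-8
// sequence in `data`, or nil when the data is entirely valid UTF-8.
func firstInvalidUTF8Offset(in data: Data) -> Int? {
    var decoder = UTF8()
    var bytes = data.makeIterator()
    var offset = 0
    while true {
        switch decoder.decode(&bytes) {
        case .scalarValue(let scalar):
            // A valid scalar consumed exactly as many bytes as its UTF-8 encoding.
            offset += String(scalar).utf8.count
        case .emptyInput:
            return nil
        case .error:
            return offset
        }
    }
}

// Made-up sample: "(°)" with the degree sign as one Latin-1 byte.
let latin1Bytes = Data([0x28, 0xB0, 0x29])
// The same text encoded as UTF-8 (the degree sign becomes 0xC2 0xB0).
let utf8Bytes = Data("(°)".utf8)

let badOffset = firstInvalidUTF8Offset(in: latin1Bytes)  // 1: 0xB0 cannot start a UTF-8 sequence
let goodOffset = firstInvalidUTF8Offset(in: utf8Bytes)   // nil: valid UTF-8
```

Run against the raw response data, this should land near the index reported in the error, which would help confirm which character is involved.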
So I then tried manually decoding the JSON array containing the uncommon characters (the suggestions) to a String using Latin-1, re-encoding it to UTF-8 data, and then decoding again to [String], like so:
init(from decoder: Decoder) throws {
    var container = try decoder.unkeyedContainer()
    query = try container.decode(String.self)
    do {
        suggestions = try container.decode([String]?.self)
    } catch { // If undecodable, try converting data from latin1 to utf8 and redecoding
        suggestions = {
            let latin1Data: Data = try! container.decode(Data.self)
            let string: String = String(data: latin1Data, encoding: .isoLatin1)!
            let utf8Data: Data = string.data(using: .utf8)!
            return try! JSONDecoder().decode([String]?.self, from: utf8Data)
        }()
    }
    thirdValue = try container.decodeIfPresent([String].self)
    fourthValue = try container.decodeIfPresent(GoogleSuggestSubtypes.self)
}
However, this throws the exact same error I received before, and I'm now out of ideas for how to solve it. I would also appreciate an explanation of why I'm encountering this problem: if the file I receive from the API is ostensibly encoded in UTF-8, why would decoding with UTF-8 fail while Latin-1 succeeds?
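For comparison, here is a sketch of the same Latin-1 to UTF-8 round trip applied to the raw response before any JSON decoding happens. It rests on two assumptions I have not confirmed: that the bytes handed to my app really are Latin-1 rather than UTF-8, and that JSONDecoder validates the whole payload before init(from:) ever runs (the empty codingPath in the error hints at this). transcodeLatin1ToUTF8 is a hypothetical helper name, and the payload is a made-up stand-in for the real response.

```swift
import Foundation

// Hypothetical helper: reinterpret the raw bytes as ISO Latin-1 text and
// re-encode that text as UTF-8. Every byte value maps to a Latin-1 character,
// which would explain why the Latin-1 test never fails.
func transcodeLatin1ToUTF8(_ data: Data) -> Data? {
    guard let text = String(data: data, encoding: .isoLatin1) else { return nil }
    return text.data(using: .utf8)
}

// Made-up stand-in payload: the JSON array ["(°)"] with the degree sign
// sent as the single Latin-1 byte 0xB0.
let raw = Data([0x5B, 0x22, 0x28, 0xB0, 0x29, 0x22, 0x5D])

// Decoding the raw bytes directly fails, because 0xB0 alone is not valid UTF-8.
let direct = try? JSONDecoder().decode([String].self, from: raw)

// Transcoding the whole payload first lets the decoder see valid UTF-8.
let transcoded = transcodeLatin1ToUTF8(raw).flatMap {
    try? JSONDecoder().decode([String].self, from: $0)
}
```

With this stand-in data, direct is nil and transcoded holds the single suggestion. The drawback is that a response that is already UTF-8 would be mangled by this step, so it only makes sense after checking that plain UTF-8 decoding fails, or after checking the charset the server declares.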