2

Introduction

Hey there! In my app I am making requests to the YouTube Data API. The API can respond with UTF-8 encoded strings (special characters included). However, I am unable to receive the data as UTF-8.

To parse the response data into an object, I am using Swift's Codable protocol.

This is what my request looks like:

enum VideoPart: String {
    case snippet = "snippet"
    case statistics = "statistics"
    case contentDetails = "contentDetails"
}

private static func fetchDetailsAfterSearch(forVideo videoId: String, parts: [VideoPart], onDone: @escaping (JSON) -> Void) {
    let videoParts = parts.map({ $0.rawValue })

    let apiUrl = URL(string: "https://www.googleapis.com/youtube/v3/videos")

    let headers: HTTPHeaders = ["X-Ios-Bundle-Identifier": Bundle.main.bundleIdentifier ?? ""]

    let parameters: Parameters = ["part": videoParts.joined(separator: ","), "id": videoId, "key": apiKey]

    Alamofire.request(apiUrl!, method: .get, parameters: parameters, encoding: URLEncoding.default, headers: headers).responseJSON { (response) in
        if let responseData = response.data {
            onDone(JSON(responseData))
        }
    }
}

static func searchVideos(forQuery query: String, limit: Int = 20, onDone: @escaping ([YTVideo]) -> Void) {

    let apiUrl = URL(string: "https://www.googleapis.com/youtube/v3/search")!

    let headers: HTTPHeaders = ["X-Ios-Bundle-Identifier": Bundle.main.bundleIdentifier ?? ""]

    let parameters: Parameters = ["q": query, "part": "snippet", "maxResults": limit, "relevanceLanguage": "en", "type": "video", "key": apiKey]

    let group = DispatchGroup()
    group.enter()

    var videos: [YTVideo] = [] // the parsed videos are stored here

    Alamofire.request(apiUrl, method: .get, parameters: parameters, encoding: URLEncoding.default, headers: headers).responseJSON { (response) in

        if let responseData = response.data { // is there a response data?
            let resultVideos = JSON(responseData)["items"].arrayValue

            resultVideos.forEach({ (v) in // loop through each video and fetch more exact data, based on the videoId
                let videoId = v["id"]["videoId"].stringValue
                group.enter()
                YTDataService.fetchDetailsAfterSearch(forVideo: videoId, parts: [VideoPart.statistics, VideoPart.contentDetails], onDone: {(details) in
                    // MARK: parse the data of the api to the YTVideo Object
                    let videoSnippet = v["snippet"]
                    let videoDetails = details["items"][0]

                    var finalJSON: JSON = JSON()

                    finalJSON = finalJSON.merged(other: videoSnippet)
                    finalJSON = finalJSON.merged(other: videoDetails)


                    if let video = try? YTVideo(data: finalJSON.rawData()) {
                        videos.append(video)
                    }
                    group.leave()
                })
            })
        }
        group.leave() // always balance the initial group.enter(), even when there is no response data
    }

    group.notify(queue: .main) {
        onDone(videos)
    }
}

Code explanation:

As the search endpoint only returns the snippet of each video, I have to make another API request per video to fetch more details. This request is made inside a loop over the search results. Each call returns a Data object that is parsed into a JSON object (by SwiftyJSON).
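As a side note on the per-video requests described above: the `id` parameter of the videos endpoint accepts a comma-separated list of video IDs, so the details for all search results could probably be fetched with a single request. The following is only a sketch of building such a URL with Foundation's `URLComponents`; the function name `makeVideoDetailsURL` and the placeholder key are assumptions for illustration and not part of the code above.

```swift
import Foundation

/// Builds one `videos` request URL covering several video IDs at once.
/// Hypothetical helper: the name and signature are made up for this sketch.
func makeVideoDetailsURL(videoIds: [String], parts: [String], apiKey: String) -> URL? {
    var components = URLComponents(string: "https://www.googleapis.com/youtube/v3/videos")
    components?.queryItems = [
        // Both lists are sent as comma-separated values.
        URLQueryItem(name: "part", value: parts.joined(separator: ",")),
        URLQueryItem(name: "id", value: videoIds.joined(separator: ",")),
        URLQueryItem(name: "key", value: apiKey)
    ]
    return components?.url
}

// Usage with placeholder values; "YOUR_API_KEY" is not a real key.
if let url = makeVideoDetailsURL(videoIds: ["nfWlot6h_JM"],
                                 parts: ["statistics", "contentDetails"],
                                 apiKey: "YOUR_API_KEY") {
    print(url.absoluteString)
}
```

With this approach the response's `items` array would contain one entry per requested ID, so the results would still have to be matched back to the search results by their `id`.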

Those two responses are then merged into one JSON object. After that, finalJSON is used to initialize a YTVideo object. As I already said, the type is Codable and decodes the JSON automatically; its structure can be found below.
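The merge-then-decode step can be sketched without SwiftyJSON, using only Foundation. Note that this is a simplification: it merges only the top-level keys (a shallow merge), whereas the code above relies on SwiftyJSON's `merged(other:)`. `MiniVideo` and `decodeMerged` are hypothetical names introduced for this sketch, and the two JSON snippets are trimmed-down stand-ins for the real responses.

```swift
import Foundation

// Hypothetical, trimmed-down stand-in for YTVideo with only two fields.
struct MiniVideo: Codable {
    struct Statistics: Codable { let viewCount: String }
    let title: String
    let statistics: Statistics
}

/// Shallow-merges two JSON objects (keys of `second` win) and decodes the result.
func decodeMerged<T: Decodable>(_ type: T.Type, first: Data, second: Data) throws -> T {
    guard var merged = try JSONSerialization.jsonObject(with: first) as? [String: Any],
          let other = try JSONSerialization.jsonObject(with: second) as? [String: Any] else {
        throw DecodingError.dataCorrupted(
            .init(codingPath: [], debugDescription: "Expected two JSON objects"))
    }
    merged.merge(other) { _, new in new }
    let data = try JSONSerialization.data(withJSONObject: merged)
    return try JSONDecoder().decode(type, from: data)
}

let snippet = Data(#"{"title": "Taylor Swift - Shake It Off"}"#.utf8)
let details = Data(#"{"statistics": {"viewCount": "2816892915"}}"#.utf8)

do {
    let video = try decodeMerged(MiniVideo.self, first: snippet, second: details)
    print(video.title) // prints "Taylor Swift - Shake It Off"
} catch {
    print("Decoding failed: \(error)")
}
```

Using `do`/`catch` instead of `try?` makes a decoding failure visible instead of silently dropping the video.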

The data that is sent back from the API:

{
  "statistics" : {
    "favoriteCount" : "0",
    "dislikeCount" : "942232",
    "likeCount" : "8621179",
    "commentCount" : "516305",
    "viewCount" : "2816892915"
  },
  "publishedAt" : "2014-08-18T21:18:00.000Z",
  "contentDetails" : {
    "caption" : "false",
    "licensedContent" : true,
    "definition" : "hd",
    "duration" : "PT4M2S",
    "dimension" : "2d",
    "projection" : "rectangular"
  },
  "channelId" : "UCANLZYMidaCbLQFWXBC95Jg",
  "kind" : "youtube#video",
  "id" : "nfWlot6h_JM",
  "liveBroadcastContent" : "none",
  "etag" : "\"8jEFfXBrqiSrcF6Ee7MQuz8XuAM\/ChcYFUcK77KQsdMIp5DyWCHvX9I\"",
  "title" : "Taylor Swift - Shake It Off",
  "channelTitle" : "TaylorSwiftVEVO",
  "description" : "Music video by Taylor Swift performing Shake It Off. (C) 2014 Big Machine Records, LLC. New single ME! (feat. Brendon Urie of Panic! At The Disco) available ...",
  "thumbnails" : {
    "high" : {
      "width" : 480,
      "url" : "https:\/\/i.ytimg.com\/vi\/nfWlot6h_JM\/hqdefault.jpg",
      "height" : 360
    },
    "medium" : {
      "url" : "https:\/\/i.ytimg.com\/vi\/nfWlot6h_JM\/mqdefault.jpg",
      "width" : 320,
      "height" : 180
    },
    "default" : {
      "url" : "https:\/\/i.ytimg.com\/vi\/nfWlot6h_JM\/default.jpg",
      "width" : 120,
      "height" : 90
    }
  }
}

This is my YTVideo struct:

// This file was generated from JSON Schema using quicktype, do not modify it directly.
// To parse the JSON, add this file to your project and do:
//
//   let yTVideo = try YTVideo(json)

import Foundation

// MARK: - YTVideo
struct YTVideo: Codable {
    let statistics: Statistics
    let publishedAt: String
    let contentDetails: ContentDetails
    let channelID, kind, id, liveBroadcastContent: String
    let etag, title, channelTitle, ytVideoDescription: String
    let thumbnails: Thumbnails

    enum CodingKeys: String, CodingKey {
        case statistics, publishedAt, contentDetails
        case channelID = "channelId"
        case kind, id, liveBroadcastContent, etag, title, channelTitle
        case ytVideoDescription = "description"
        case thumbnails
    }
}

// MARK: YTVideo convenience initializers and mutators

extension YTVideo {
    init(data: Data) throws {
        self = try newJSONDecoder().decode(YTVideo.self, from: data)
    }

    init(_ json: String, using encoding: String.Encoding = .utf8) throws {
        guard let data = json.data(using: encoding) else {
            throw NSError(domain: "JSONDecoding", code: 0, userInfo: nil)
        }
        try self.init(data: data)
    }

    init(fromURL url: URL) throws {
        try self.init(data: try Data(contentsOf: url))
    }

    func with(
        statistics: Statistics? = nil,
        publishedAt: String? = nil,
        contentDetails: ContentDetails? = nil,
        channelID: String? = nil,
        kind: String? = nil,
        id: String? = nil,
        liveBroadcastContent: String? = nil,
        etag: String? = nil,
        title: String? = nil,
        channelTitle: String? = nil,
        ytVideoDescription: String? = nil,
        thumbnails: Thumbnails? = nil
    ) -> YTVideo {
        return YTVideo(
            statistics: statistics ?? self.statistics,
            publishedAt: publishedAt ?? self.publishedAt,
            contentDetails: contentDetails ?? self.contentDetails,
            channelID: channelID ?? self.channelID,
            kind: kind ?? self.kind,
            id: id ?? self.id,
            liveBroadcastContent: liveBroadcastContent ?? self.liveBroadcastContent,
            etag: etag ?? self.etag,
            title: title ?? self.title,
            channelTitle: channelTitle ?? self.channelTitle,
            ytVideoDescription: ytVideoDescription ?? self.ytVideoDescription,
            thumbnails: thumbnails ?? self.thumbnails
        )
    }

    func jsonData() throws -> Data {
        return try newJSONEncoder().encode(self)
    }

    func jsonString(encoding: String.Encoding = .utf8) throws -> String? {
        return String(data: try self.jsonData(), encoding: encoding)
    }
}

// MARK: - ContentDetails
struct ContentDetails: Codable {
    let caption: String
    let licensedContent: Bool
    let definition, duration, dimension, projection: String
}

// MARK: ContentDetails convenience initializers and mutators

extension ContentDetails {
    init(data: Data) throws {
        self = try newJSONDecoder().decode(ContentDetails.self, from: data)
    }

    init(_ json: String, using encoding: String.Encoding = .utf8) throws {
        guard let data = json.data(using: encoding) else {
            throw NSError(domain: "JSONDecoding", code: 0, userInfo: nil)
        }
        try self.init(data: data)
    }

    init(fromURL url: URL) throws {
        try self.init(data: try Data(contentsOf: url))
    }

    func with(
        caption: String? = nil,
        licensedContent: Bool? = nil,
        definition: String? = nil,
        duration: String? = nil,
        dimension: String? = nil,
        projection: String? = nil
    ) -> ContentDetails {
        return ContentDetails(
            caption: caption ?? self.caption,
            licensedContent: licensedContent ?? self.licensedContent,
            definition: definition ?? self.definition,
            duration: duration ?? self.duration,
            dimension: dimension ?? self.dimension,
            projection: projection ?? self.projection
        )
    }

    func jsonData() throws -> Data {
        return try newJSONEncoder().encode(self)
    }

    func jsonString(encoding: String.Encoding = .utf8) throws -> String? {
        return String(data: try self.jsonData(), encoding: encoding)
    }
}

// MARK: - Statistics
struct Statistics: Codable {
    let favoriteCount, dislikeCount, likeCount, commentCount: String
    let viewCount: String
}

// MARK: Statistics convenience initializers and mutators

extension Statistics {
    init(data: Data) throws {
        self = try newJSONDecoder().decode(Statistics.self, from: data)
    }

    init(_ json: String, using encoding: String.Encoding = .utf8) throws {
        guard let data = json.data(using: encoding) else {
            throw NSError(domain: "JSONDecoding", code: 0, userInfo: nil)
        }
        try self.init(data: data)
    }

    init(fromURL url: URL) throws {
        try self.init(data: try Data(contentsOf: url))
    }

    func with(
        favoriteCount: String? = nil,
        dislikeCount: String? = nil,
        likeCount: String? = nil,
        commentCount: String? = nil,
        viewCount: String? = nil
    ) -> Statistics {
        return Statistics(
            favoriteCount: favoriteCount ?? self.favoriteCount,
            dislikeCount: dislikeCount ?? self.dislikeCount,
            likeCount: likeCount ?? self.likeCount,
            commentCount: commentCount ?? self.commentCount,
            viewCount: viewCount ?? self.viewCount
        )
    }

    func jsonData() throws -> Data {
        return try newJSONEncoder().encode(self)
    }

    func jsonString(encoding: String.Encoding = .utf8) throws -> String? {
        return String(data: try self.jsonData(), encoding: encoding)
    }
}

// MARK: - Thumbnails
struct Thumbnails: Codable {
    let high, medium, thumbnailsDefault: Default

    enum CodingKeys: String, CodingKey {
        case high, medium
        case thumbnailsDefault = "default"
    }
}

// MARK: Thumbnails convenience initializers and mutators

extension Thumbnails {
    init(data: Data) throws {
        self = try newJSONDecoder().decode(Thumbnails.self, from: data)
    }

    init(_ json: String, using encoding: String.Encoding = .utf8) throws {
        guard let data = json.data(using: encoding) else {
            throw NSError(domain: "JSONDecoding", code: 0, userInfo: nil)
        }
        try self.init(data: data)
    }

    init(fromURL url: URL) throws {
        try self.init(data: try Data(contentsOf: url))
    }

    func with(
        high: Default? = nil,
        medium: Default? = nil,
        thumbnailsDefault: Default? = nil
    ) -> Thumbnails {
        return Thumbnails(
            high: high ?? self.high,
            medium: medium ?? self.medium,
            thumbnailsDefault: thumbnailsDefault ?? self.thumbnailsDefault
        )
    }

    func jsonData() throws -> Data {
        return try newJSONEncoder().encode(self)
    }

    func jsonString(encoding: String.Encoding = .utf8) throws -> String? {
        return String(data: try self.jsonData(), encoding: encoding)
    }
}

// MARK: - Default
struct Default: Codable {
    let width: Int
    let url: String
    let height: Int
}

// MARK: Default convenience initializers and mutators

extension Default {
    init(data: Data) throws {
        self = try newJSONDecoder().decode(Default.self, from: data)
    }

    init(_ json: String, using encoding: String.Encoding = .utf8) throws {
        guard let data = json.data(using: encoding) else {
            throw NSError(domain: "JSONDecoding", code: 0, userInfo: nil)
        }
        try self.init(data: data)
    }

    init(fromURL url: URL) throws {
        try self.init(data: try Data(contentsOf: url))
    }

    func with(
        width: Int? = nil,
        url: String? = nil,
        height: Int? = nil
    ) -> Default {
        return Default(
            width: width ?? self.width,
            url: url ?? self.url,
            height: height ?? self.height
        )
    }

    func jsonData() throws -> Data {
        return try newJSONEncoder().encode(self)
    }

    func jsonString(encoding: String.Encoding = .utf8) throws -> String? {
        return String(data: try self.jsonData(), encoding: encoding)
    }
}

// MARK: - Helper functions for creating encoders and decoders

func newJSONDecoder() -> JSONDecoder {
    let decoder = JSONDecoder()
    if #available(iOS 10.0, OSX 10.12, tvOS 10.0, watchOS 3.0, *) {
        decoder.dateDecodingStrategy = .iso8601
    }
    return decoder
}

func newJSONEncoder() -> JSONEncoder {
    let encoder = JSONEncoder()
    if #available(iOS 10.0, OSX 10.12, tvOS 10.0, watchOS 3.0, *) {
        encoder.dateEncodingStrategy = .iso8601
    }
    return encoder
}

What I currently have:

Parsing works fine; however, the YouTube video title is not displayed as proper UTF-8 text (see image below).

The result I get

What I want

What changes do I have to make in order to display the data from the YouTube API as a valid UTF-8 encoded string? I tried several UTF-8 encodings, but none of them worked for me.

Any help would be appreciated!

linus_hologram
  • In the code you are clearly using `SwiftyJSON`, not `Codable`. Both APIs decode UTF8 encoded data properly by default – vadian Sep 12 '19 at 15:03
  • https://stackoverflow.com/questions/55385560/how-to-fix-youtube-api-results-title-that-are-returned-encoded + https://stackoverflow.com/questions/42288963/encode-string-to-html-string-swift-3 ? Not optimized but you could do: `if let responseData = response.data, let escapedReponseString = String(data: responseData, encoding: .utf8), let responseString = CheckLastLinkToInterpretThem, let dataToUseForSwiftJSON = responseString.encoding(utf8)` – Larme Sep 12 '19 at 15:04
  • @vadian - correct, I am using SwiftyJSON to parse the responses together - but then I initialize the YTVideo class with the rawData of the finalJson. – linus_hologram Sep 12 '19 at 15:05
  • @Larme would you mind posting an answer? – linus_hologram Sep 12 '19 at 15:08
  • It'd be helpful if you posted an example of what the raw JSON from the API looks like. Your code to parse it is only half the equation. – Craig Siemens Sep 12 '19 at 15:42
  • `&#39;` is not a UTF-8 encoding problem: this is HTML encoding. So look more in a direction of answers like this: https://stackoverflow.com/questions/25607247/how-do-i-decode-html-entities-in-swift – timbre timbre Sep 12 '19 at 16:04
  • @CraigSiemens I did. Please take a look at my updated post. Thanks for your help! – linus_hologram Sep 12 '19 at 16:57
  • @KirilS. so you are convinced that the data I receive from the api looks like this? In case you might want to see it, I added the json-data I get from the api as well as the pure code of my `YTVideo` class. – linus_hologram Sep 12 '19 at 16:59
  • Does the JSON you posted have the same issue? It doesn't match the screenshot in your question. – Craig Siemens Sep 12 '19 at 17:23
  • @CraigSiemens it is not the same video and not the pure data. The example I sent is the data as I parse it (merge it) together from my two api calls. The problem is that I can't tell how the native api data looks like, as the YouTube Data API example api from the documentation seems to parse that html-encoding-ish stuff away before they display it in the browser. Here is the link to the test-api: https://developers.google.com/youtube/v3/docs/search/list https://developers.google.com/youtube/v3/docs/videos/list – linus_hologram Sep 12 '19 at 17:26
  • @CraigSiemens I mean if you take a look at the thumbnail links you'll see that they have this \ before every normal slash - is this already this html encoding stuff you mentioned ? – linus_hologram Sep 12 '19 at 17:38

3 Answers

1

The API response did include HTML-encoded characters, as the result in the YouTube demo console shows:

https://developers.google.com/youtube/v3/docs/search/list?apix_params=%7B%22part%22%3A%22snippet%22%2C%22maxResults%22%3A20%2C%22q%22%3A%22Taylor%20Swift%22%2C%22relevanceLanguage%22%3A%22en%22%2C%22type%22%3A%22video%22%7D

Conclusion: the API documentation doesn't state whether the returned text is plain text or HTML-encoded. However, based on the demo console result, the title is HTML-encoded.

  • @ Angel.Alice: This is a known and documented issue of the API (see the threads mentioned in the comments above, particularly [this one](https://stackoverflow.com/q/55385560)). Users have to decode the HTML character references (aka HTML entities) by themselves using tools available from the surrounding programming environment. – stvar Sep 12 '19 at 18:16
1

Hope this helps:

import UIKit // the HTML document type for NSAttributedString comes from UIKit (AppKit on macOS)

extension String {
    /// Converts an HTML-encoded string from the JSON response (e.g. containing `&#39;`) to plain text.
    /// Returns an empty string if the conversion fails.
    func htmlToUtf8() -> String {
        guard let encodedData = self.data(using: .utf8) else { return String() }
        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey: Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]
        do {
            let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
            return attributedString.string
        } catch {
            // Conversion failed; fall back to an empty string.
            return String()
        }
    }
}

And then:

let jsonTitle = "ERIK - &#39;Em Kh\u{00F4}ng Sai, Ch\u{00FA}ng Ta Sai&#39; (Official Lyric Video)"
let videoTitle = jsonTitle.htmlToUtf8()
print(videoTitle) //"ERIK - 'Em Không Sai, Chúng Ta Sai' (Official Lyric Video)"

I'm from Vietnam, so we use UTF-8 a lot.

Tung Dang
0

This is not a UTF-8 or parsing problem. Your code is correctly parsing and displaying the string it is given. The problem appears to be that the string you're using is HTML-encoded. Now, I don't think you've shared enough code (and QuickType isn't loading for me) for us to know which properties you're using to get the HTML-encoded video title. It could be that there's a plain-text one, or you're expected to handle the decoding yourself – I can't tell from the documentation.

In short, if the HTML-encoded string is your only option, look at decoding HTML entities instead of Unicode-related problems.
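For illustration, decoding HTML character references can be done with a small pure-Swift helper. This is a minimal sketch rather than a complete implementation: it handles decimal and hexadecimal numeric references plus a handful of named ones, and leaves anything it does not recognise untouched. The function names and the short table of named references are assumptions made for this example.

```swift
import Foundation

/// Small, non-exhaustive table of named references (an assumption for this sketch).
let namedReferences: [String: Character] = [
    "amp": "&", "lt": "<", "gt": ">", "quot": "\"", "apos": "'"
]

/// Decodes the part between "&" and ";" (e.g. "#39", "#x27", "amp"), or returns nil.
func decodeReference(_ name: Substring) -> Character? {
    let digits: Substring
    let radix: Int
    if name.hasPrefix("#x") || name.hasPrefix("#X") {
        digits = name.dropFirst(2); radix = 16
    } else if name.hasPrefix("#") {
        digits = name.dropFirst(); radix = 10
    } else {
        return namedReferences[String(name)]
    }
    guard let value = UInt32(digits, radix: radix),
          let scalar = Unicode.Scalar(value) else { return nil }
    return Character(scalar)
}

/// Replaces known HTML character references in `text`; unknown ones are kept as-is.
func decodingHTMLCharacterReferences(in text: String) -> String {
    var result = ""
    var rest = Substring(text)
    while let ampersand = rest.firstIndex(of: "&") {
        result.append(contentsOf: rest[..<ampersand])
        rest = rest[ampersand...] // `rest` now starts with "&"
        let nameStart = rest.index(after: rest.startIndex)
        if let semicolon = rest[nameStart...].firstIndex(of: ";"),
           let decoded = decodeReference(rest[nameStart..<semicolon]) {
            result.append(decoded)
            rest = rest[rest.index(after: semicolon)...]
        } else {
            // Not a reference we understand: keep the "&" and continue after it.
            result.append("&")
            rest = rest[nameStart...]
        }
    }
    result.append(contentsOf: rest)
    return result
}

print(decodingHTMLCharacterReferences(in: "It&#39;s &quot;fine&quot; &amp; done"))
// prints: It's "fine" & done
```

Such a helper could be applied to `title` and `description` right after decoding, which keeps the Codable model itself unchanged.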

Drarok
  • this is a comment, not an answer – timbre timbre Sep 12 '19 at 16:26
  • The string is not HTML-encoded - only specific characters are replaced with these strange characters - there are no html tags anywhere in the response. The thing that surprises me is, that if you test e.g. the API on the official documentation page, it responds totally fine - I am not sure where the problem is to be honest. I just know that it is probably not related to html-encoding as there are no tags in the response. – linus_hologram Sep 12 '19 at 16:49
  • I also put the whole code of the `YTVideo`class in my post as well as the response I receive from the api. – linus_hologram Sep 12 '19 at 16:56
  • @ linus_hologram: These characters **are not** strange at all. They are valid [HTML character references](https://html.spec.whatwg.org/multipage/syntax.html#character-references). The only issue here is that the API does not decode these into UTF-8. This API issue was documented on this forum for some time (as already mentioned in the comments above). You have to decode all HTML character references (aka entities) by yourself. – stvar Sep 12 '19 at 18:05