
I am trying to parse HTML from a URL:

func fetch(url: URL, completion: @escaping ((Result) -> Void)) {
    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    let session = URLSession.init(configuration: URLSessionConfiguration.default)
    session.dataTask(with: request) { [weak self] data, _, error in
        guard let self = self else { return }
        
        if let error = error {
            completion(.failure(error))
            return
        }
        
        if let data = data, let html = String(data: data, encoding: .ascii) {
            completion(.success(self.metaTagsDictionary(for: html)))
            return
        } else {
            completion(.failure(ParseError.fail))
            return
        }
    }.resume()
}

I then print the result with:

dict.keys.forEach { print(dict[$0]) }

However, I seem to be getting a bunch of weird characters in the string, e.g.:

命,科科都能å•ï¼ä¾†è©¦è©¦ 2020 年商周報導的最新家教模å¼å§ã€‚")

Any idea what this is? Am I using the wrong encoding?

Lou Franco
Kex

1 Answer


You're decoding as .ascii, which is almost certainly not correct for this data. Most web pages are encoded in UTF-8 (.utf8), but there are other options; it depends on the site. I would start with UTF-8. If that returns nil, you will need to investigate the site and determine what encoding it uses, for example from the charset in the Content-Type response header or a <meta charset> tag in the page.
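
The advice above can be sketched roughly as follows. This is only a minimal sketch: decodeHTML is a hypothetical helper (it does not appear in the question), and the fallbacks parameter is an assumption standing in for whatever encodings you find after investigating the site.

```swift
import Foundation

// Hypothetical helper (not from the question): try UTF-8 first, then any
// caller-supplied fallback encodings, and return nil if none of them fit.
func decodeHTML(_ data: Data, fallbacks: [String.Encoding] = []) -> String? {
    let candidates: [String.Encoding] = [.utf8] + fallbacks
    for encoding in candidates {
        // String(data:encoding:) returns nil when the bytes are not valid
        // in the given encoding, so we move on to the next candidate.
        if let html = String(data: data, encoding: encoding) {
            return html
        }
    }
    return nil
}
```

In the question's fetch function, a call such as decodeHTML(data) would take the place of String(data: data, encoding: .ascii). If the server declares a charset, the textEncodingName property of URLResponse may be a useful hint when choosing a fallback.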

Rob Napier