77

I'm working on a simple RSS reader app as a beginner project in Xcode. It currently parses the feed, extracts each post's title, publication date, description and content, and displays them in a WebView.

I recently decided to show the description (or a truncated version of the content) in the TableView used to select a post. However, when doing so:

cell.textLabel?.text = item.title?.uppercaseString
cell.detailTextLabel?.text = item.itemDescription //.itemDescription is a String

It shows the raw HTML of the post.

I would like to know how to convert the HTML into plain text for just the TableView's detail UILabel.

Thanks!

Martin R
Zaid Syed

9 Answers

249

You can add this extension to convert your HTML code to a regular string:

edit/update:

Discussion: The HTML importer should not be called from a background thread (that is, the options dictionary includes documentType with a value of html). It will try to synchronize with the main thread, fail, and time out. Calling it from the main thread works (but can still time out if the HTML contains references to external resources, which should be avoided at all costs). The HTML import mechanism is meant for implementing something like markdown (that is, text styles, colors, and so on), not for general HTML import.

Xcode 11.4 • Swift 5.2

extension Data {
    var html2AttributedString: NSAttributedString? {
        do {
            return try NSAttributedString(data: self, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
        } catch {
            print("error:", error)
            return  nil
        }
    }
    var html2String: String { html2AttributedString?.string ?? "" }
}

extension StringProtocol {
    var html2AttributedString: NSAttributedString? {
        Data(utf8).html2AttributedString
    }
    var html2String: String {
        html2AttributedString?.string ?? ""
    }
}

cell.detailTextLabel?.text = item.itemDescription.html2String
Leo Dabus
  • I tried, but for many HTML strings this returns a string with some hidden characters at the front, which can't be printed with println. Weird. – Lim Thye Chean May 14 '15 at 08:13
  • 22
    this method is very processor heavy – inni Oct 08 '15 at 13:43
  • I'm trying to implement this extension, but I get the following error: "Cannot invoke initializer for type 'NSAttributedString' with an argument list of type '(data: NSData, options: NSDictionary, documentAttributes: _, error: _)'" Do you have any idea what I can do? – fredpi Oct 27 '15 at 20:53
  • @LeoDabus thanks for the answer, I was wondering, is it possible to change the font size? The font returned is quite small and barely readable. I tried adding `NSFontAttributeName` but the output remains the same. I've asked a question here -> http://stackoverflow.com/questions/36427442/nsfontattributename-not-applied-to-nsattributedstring – kye Apr 05 '16 at 14:01
  • 1
    +1 for Swift 3: by default, Xcode ported my code from Swift 2 by converting `NSUTF8StringEncoding` into `String.Encoding.utf8`, but it kept crashing. Thanks to this answer, I was able to fix it by appending `.rawValue` to the `Encoding` enum. – kabiroberai Aug 30 '16 at 11:55
  • Is there any way to apply that to all strings without it removing `\n` from them? – bemeyer Jan 11 '17 at 15:41
  • This is fantastic, thanks! I'm having an issue with images not fitting to UILabel width even when I add `style = "max-width: 100%"` to the img tags. Text behaves appropriately, wrapping at the label's edge. Any advice? - asked question here: https://stackoverflow.com/questions/44753628/uiscrollview-with-uilabel-text-fits-view-but-images-dont/44756163?noredirect=1#comment76536343_44756163 – froggomad Jun 28 '17 at 07:53
  • @LeoDabus. It doesn't work at all. Have you tested it when the HTML string has `&nbsp;` at the start or end? It just doesn't work – zulkarnain shah Sep 11 '17 at 05:56
  • @zulkarnainshah `"&"` returns "&" as expected and `"nbsp;"` doesn't mean anything so it returns the same thing "nbsp;". of course `" "` will return "&" + "nbsp;". If you would like it to return "& " you need to add the missing & `"& "` – Leo Dabus Sep 11 '17 at 06:23
  • @LeoDabus. My HTML for example is " Hello" which starts with a space and converts to the string "@amp;nbsp;Hello". How to get only "Hello" out of that ? – zulkarnain shah Sep 11 '17 at 06:33
  • @zulkarnainshah How can you get "@amp;nbsp;Hello" from " Hello"? it doesn't make any sense. If you interpret it as html it will drop the leading space. It does return "Hello" for me here from " Hello" – Leo Dabus Sep 11 '17 at 06:41
  • @LeoDabus. I get " Hello" NOT "@amp;nbsp;Hello" – zulkarnain shah Sep 11 '17 at 06:42
  • @zulkarnainshah I don't know what are you doing there but you are probably not using the code I posted. Try creating a new playground file and add only the code I posted. If you are testing it in a real project try cleaning it. – Leo Dabus Sep 11 '17 at 06:44
  • @LeoDabus. Did that all. Still get the same result – zulkarnain shah Sep 11 '17 at 06:47
  • Can you post a screen shot of your playground file containing only this extension, the original string and the result? – Leo Dabus Sep 11 '17 at 07:00
  • @LeoDabus this worked well. But when I use it inside `cellForRow`, my table view stutters a bit when scrolling. How to get rid of it? – iRiziya Sep 21 '17 at 11:16
  • 1
    doesn't compile on swift 4 – Hemant Singh Oct 12 '17 at 10:52
  • 1
    This works fine on iOS 10, but on iOS 11 it does some weird things with the HTML data, like ignoring the font weight of a custom font unless explicitly defined. – Gustavo_fringe Oct 27 '17 at 14:47
  • 1
    @LeoDabus I think it was some flakiness in Playgrounds. Closing Xcode and restarting resolved the error I encountered the first time. – Adrian Oct 29 '17 at 13:54
  • I'm pretty sure you should not execute this code from background thread. Check out Apple Docs: https://developer.apple.com/documentation/foundation/nsattributedstring/1524613-initwithdata – Timur Bernikovich Nov 17 '17 at 14:24
  • @TimurBernikowich I am not executing anything from the background thread. If you think this is important in your case make sure to just wrap it in a `DispatchQueue.main.async { // your code }` Btw, the OP is using it in a table view's cellForRowAt indexPath method, which obviously runs on the main thread. – Leo Dabus Nov 17 '17 at 14:29
  • I understand this, but someone can try to use it in background just like any other string func. I just left warning. – Timur Bernikovich Nov 17 '17 at 14:34
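
On the scrolling stutter and CPU cost raised in the comments: since the conversion has to run on the main thread, one option is to convert each description once and reuse the result rather than re-running the importer every time a cell is configured. Below is a minimal sketch of that idea. ConversionCache and value(for:) are hypothetical names that do not appear in the answer, and the cache is a plain dictionary with no eviction and no thread safety, so it assumes main-thread use only.

```swift
// Hypothetical helper (an assumption, not part of the answer above):
// memoizes an expensive String -> String conversion so each distinct
// input is converted only once.
final class ConversionCache {
    private var storage: [String: String] = [:]
    private let convert: (String) -> String

    // `convert` is whatever expensive conversion should be cached.
    init(convert: @escaping (String) -> String) {
        self.convert = convert
    }

    // Returns the cached result, running the conversion on the first request only.
    func value(for source: String) -> String {
        if let hit = storage[source] {
            return hit
        }
        let result = convert(source)
        storage[source] = result
        return result
    }
}
```

Assuming the html2String extension from the answer is in scope, this could be created once as `let descriptions = ConversionCache { $0.html2String }` and then used in cellForRowAt as `cell.detailTextLabel?.text = descriptions.value(for: item.itemDescription)`. Converting the description once when the feed is parsed would achieve the same effect without a cache.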
7

Swift 4, Xcode 9

extension String {
    
    var utfData: Data {
        return Data(utf8)
    }
    
    var attributedHtmlString: NSAttributedString? {
        
        do {
            return try NSAttributedString(data: utfData, options: [
              .documentType: NSAttributedString.DocumentType.html,
              .characterEncoding: String.Encoding.utf8.rawValue
            ], 
            documentAttributes: nil)
        } catch {
            print("Error:", error)
            return nil
        }
    }
}

extension UILabel {
   func setAttributedHtmlText(_ html: String) {
      if let attributedText = html.attributedHtmlString {
         self.attributedText = attributedText
      } 
   }
}
Suhit Patil
3

Here is my suggested answer, if you would rather put this in a function than in an extension.

// Swift 2 syntax
func decodeString(encodedString: String) -> NSAttributedString? {
    let encodedData = encodedString.dataUsingEncoding(NSUTF8StringEncoding)!
    do {
        return try NSAttributedString(data: encodedData, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding], documentAttributes: nil)
    } catch let error as NSError {
        print(error.localizedDescription)
        return nil
    }
}

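Then call the function and read the plain text from the string property of the returned NSAttributedString:

let attributedString = self.decodeString(encodedString)
let message = attributedString?.string // decodeString returns an optional, so message is a String?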
Danboz
2

Swift 4.0 extension

extension String {
    // Despite its name, this property returns the plain-text String, not an NSAttributedString.
    var html2AttributedString: String? {
        guard let data = data(using: .utf8) else { return nil }
        do {
            return try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil).string
        } catch let error as NSError {
            print(error.localizedDescription)
            return nil
        }
    }
}
Maulik Patel
1

Please test with this code for the detailTextLabel:

// Swift 1.x syntax
let attrStr = NSAttributedString(
        data: item.itemDescription.dataUsingEncoding(NSUnicodeStringEncoding, allowLossyConversion: true)!,
        options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType],
        documentAttributes: nil,
        error: nil)
// text expects a String, so use the attributed string's plain-text string property
cell.detailTextLabel?.text = attrStr?.string
Altimir Antonov
  • Hi @AltimirAntonov, thanks for the reply. `item.itemDescription` is a String - perhaps I should have clarified that earlier. Should I convert it to NSData? – Zaid Syed Jan 25 '15 at 01:48
1

Try this solution in Swift 3:

extension String{
    func convertHtml() -> NSAttributedString{
        guard let data = data(using: .utf8) else { return NSAttributedString() }
        do{
            return try NSAttributedString(data: data, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue], documentAttributes: nil)
        }catch{
            return NSAttributedString()
        }
    }
}

Usage:

self.lblValDesc.attributedText = str_postdescription.convertHtml()
Hardik Thakkar
0

I used Danboz's answer, changed only so that it returns a plain String (not a rich-text string):

static func htmlToText(encodedString:String) -> String?
{
    let encodedData = encodedString.dataUsingEncoding(NSUTF8StringEncoding)!
    do
    {
        return try NSAttributedString(data: encodedData, options: [NSDocumentTypeDocumentAttribute:NSHTMLTextDocumentType,NSCharacterEncodingDocumentAttribute:NSUTF8StringEncoding], documentAttributes: nil).string
    } catch let error as NSError {
        print(error.localizedDescription)
        return nil
    }
}

For me, it works like a charm. Thanks, Danboz.

Shaybc
0
let content = givenString // string containing HTML
let attrStr = try! NSAttributedString(
    data: content.data(using: String.Encoding.unicode, allowLossyConversion: true)!,
    options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType],
    documentAttributes: nil)
self.labelName.attributedText = attrStr
shahana mh
0

Swift 5.*

Here's a compact solution based on string extension:

import UIKit

extension String {
    var attributedHtmlString: NSAttributedString? {
        try? NSAttributedString(
            data: Data(utf8),
            options: [
                .documentType: NSAttributedString.DocumentType.html,
                .characterEncoding: String.Encoding.utf8.rawValue
            ],
            documentAttributes: nil
        )
    }
}

Usage:

let html = "hello <br><br/> <b>world</b>"
if let attributedText = html.attributedHtmlString {
    print(attributedText.string) // "hello \n\nworld\n"
}

You can of course keep the attributed string instead, depending on your needs.

Alessandro Francucci
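
For the table-view preview the question asks about, the converted plain text usually still contains line breaks (as in the "hello \n\nworld\n" output above) and may be longer than a detail label can show. A rough sketch of a clean-up step follows. previewText and its limit parameter are hypothetical names, not something from the answers, and the default of 80 characters is an arbitrary assumption.

```swift
// Hypothetical helper (an assumption, not part of any answer above):
// collapses whitespace runs in already-converted plain text and truncates
// the result for use as a short preview.
func previewText(_ plain: String, limit: Int = 80) -> String {
    // Split on any whitespace (including newlines); empty pieces are dropped.
    let collapsed = plain
        .split(whereSeparator: { $0.isWhitespace })
        .joined(separator: " ")
    // Guard against a negative limit, which prefix(_:) would not accept.
    let cap = max(limit, 0)
    guard collapsed.count > cap else { return collapsed }
    // Keep the first `cap` characters and mark the cut with an ellipsis.
    return String(collapsed.prefix(cap)) + "…"
}
```

With one of the extensions above in scope, this could be applied as `cell.detailTextLabel?.text = previewText(item.itemDescription.html2String)`. It operates on the plain string only, so any styling in the attributed string is not carried over.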