
I'm looking for a way to replace emoji characters with their description in a Swift string.

Example:

Input "This is my string 😄"

I'd like to replace the 😄 to get:

Output "This is my string {SMILING FACE WITH OPEN MOUTH AND SMILING EYES}"

So far I'm using this code, adapted from this answer by MartinR, but it only works on a single character.

let myCharacter : Character = "😄"
let cfstr = NSMutableString(string: String(myCharacter)) as CFMutableString
var range = CFRangeMake(0, CFStringGetLength(cfstr))
CFStringTransform(cfstr, &range, kCFStringTransformToUnicodeName, false)
var newStr = "\(cfstr)"

// removing "\N" from the result: \N{SMILING FACE WITH OPEN MOUTH AND SMILING EYES}
newStr = newStr.stringByReplacingOccurrencesOfString("\\N", withString:"")

print(newStr) // {SMILING FACE WITH OPEN MOUTH AND SMILING EYES}

How can I achieve this?

Cue

2 Answers


Don't use a Character in the first place; pass a String as input:

let cfstr = NSMutableString(string: "This 😄 is my string 😄") as CFMutableString

which outputs

This {SMILING FACE WITH OPEN MOUTH AND SMILING EYES} is my string {SMILING FACE WITH OPEN MOUTH AND SMILING EYES}

Put together:

func transformUnicode(input : String) -> String {
    let cfstr = NSMutableString(string: input) as CFMutableString
    var range = CFRangeMake(0, CFStringGetLength(cfstr))
    CFStringTransform(cfstr, &range, kCFStringTransformToUnicodeName, false)
    let newStr = "\(cfstr)"
    // strip the "\N" that precedes each {NAME}
    return newStr.stringByReplacingOccurrencesOfString("\\N", withString:"")
}

transformUnicode("This 😄 is my string 😄")
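
The function above uses the Swift 2 era names. In current Swift the same transform should be reachable through Foundation's `applyingTransform(_:reverse:)` with `StringTransform.toUnicodeName`. Here is a minimal sketch; the name `unicodeNames(in:)` is hypothetical, and falling back to the unchanged input when the transform returns nil is my assumption, not part of the answer:

```swift
import Foundation

// Hypothetical helper; same idea as transformUnicode(input:) above.
func unicodeNames(in input: String) -> String {
    // applyingTransform returns an optional; if it yields nil,
    // keep the input unchanged (an assumption made for this sketch).
    let transformed = input.applyingTransform(.toUnicodeName, reverse: false) ?? input
    // Strip the "\N" that precedes each "{NAME}".
    return transformed.replacingOccurrences(of: "\\N", with: "")
}

print(unicodeNames(in: "This 😄 is my string 😄"))
```

As with the original, this renames every non-ASCII character, not only emoji.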
luk2302
  • Thank you so much luk2302. Is there a way to limit the conversion to emoji, so that other characters such as double quotes are not processed? I would like to avoid, if possible, replacing for example “ with "LEFT DOUBLE QUOTATION MARK". – Cue Jun 11 '16 at 17:40
  • @Tel I do not think there is an easy way to do that. It transforms all non-ASCII characters. The system does not know what exactly a smiley is; it is just a character, as the name suggests. The fact that it *looks* like a smiley is just the human perception of the visual representation of the Unicode character. What you might be able to do is figure out which Unicode code points the smileys you want to replace have and convert only those characters. But that will require some work on your end. – luk2302 Jun 11 '16 at 17:46
  • @EricD honestly I did not even look further into the actual transformation performed; I was surprised something like that was even possible. Is that alternative already mentioned in the linked question where OP's code is from? You might want to add it there as well! – luk2302 Jun 11 '16 at 21:00

Here is a complete implementation.

It leaves non-emoji characters alone (e.g. it does not convert “ to {LEFT DOUBLE QUOTATION MARK}). To accomplish this, it uses an extension, based on this answer by Arnold, that returns whether a string contains an emoji.

The rest of the code is based on this answer by MartinR and on luk2302's answer to this question and its comments.

var str = "Hello World 😄 …" // our string (with an emoji and a horizontal ellipsis)

let newStr = str.characters.reduce("") { // loop through str individual characters
    var item = "\($1)" // string with the current char
    let isEmoji = item.containsEmoji // true or false
    if isEmoji {
        item = item.stringByApplyingTransform(String(kCFStringTransformToUnicodeName), reverse: false)!
    }
    return $0 + item
}.stringByReplacingOccurrencesOfString("\\N", withString:"") // strips "\N"


extension String {
    var containsEmoji: Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x1F600...0x1F64F, // Emoticons
            0x1F300...0x1F5FF, // Misc Symbols and Pictographs
            0x1F680...0x1F6FF, // Transport and Map
            0x2600...0x26FF,   // Misc symbols
            0x2700...0x27BF,   // Dingbats
            0xFE00...0xFE0F,   // Variation Selectors
            0x1F900...0x1F9FF:   // Supplemental Symbols and Pictographs
                return true
            default:
                continue
            }
        }
        return false
    }
}

print(newStr) // Hello World {SMILING FACE WITH OPEN MOUTH AND SMILING EYES} …

Please note that some emoji may fall outside the ranges in this code, so check that all the emoji you care about are converted when you implement it.
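
One way to avoid maintaining the ranges by hand, in Swift 5 or later, might be to ask each Unicode scalar for its own properties. This is a different technique from the range check above, not a drop-in equivalent. It is a sketch only: `describeEmoji(in:)` is a hypothetical name, and using `isEmojiPresentation` as the test is an assumption that deliberately skips digits, "#" and text-style symbols that only render as emoji with a variation selector.

```swift
// Hypothetical helper: replaces emoji scalars with "{UNICODE NAME}"
// and copies every other scalar through unchanged.
func describeEmoji(in input: String) -> String {
    var result = ""
    for scalar in input.unicodeScalars {
        let props = scalar.properties
        // isEmojiPresentation: the scalar is drawn as emoji by default.
        // name: the Unicode character name, nil for unnamed scalars.
        if props.isEmojiPresentation, let name = props.name {
            result += "{\(name)}"
        } else {
            result.unicodeScalars.append(scalar)
        }
    }
    return result
}

print(describeEmoji(in: "Hello World 😄 …"))
// Hello World {SMILING FACE WITH OPEN MOUTH AND SMILING EYES} …
```

Because it works scalar by scalar, multi-scalar emoji (flags, skin tone variants, ZWJ sequences) come out as several names in a row, so check the emoji you expect to see.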

Cue
  • I have refactored the code a tiny bit: http://pastebin.com/qxtWuYNd Primarily, the `reduce` is added, I removed the variables that were defined outside the closure but used inside it, and I moved the replace outside of the closure. – luk2302 Jun 12 '16 at 07:41