134

I need to find out whether a character in a string is an emoji.

For example, I have this character:

let string = ""
let character = Array(string)[0]

I need to find out if that character is an emoji.

ABakerSmith
Andrew
  • I am curious: why do you need that information? – Martin R Jun 10 '15 at 13:04
  • @EricD.: There are *many* Unicode characters which take more than one UTF-8 code point (e.g. "€" = E2 82 AC) or more than one UTF-16 code point (e.g. "" =D834 DD1E). – Martin R Jun 10 '15 at 13:30
  • 1
    Hope you will get an idea from this Obj-C version of the code: http://stackoverflow.com/questions/19886642/check-if-there-is-an-emoji-contained-in-a-string – Ashish Kakkad Jun 10 '15 at 13:40
  • Strings have their indexing which is a preferred way of using them. To get a particular character (or grapheme cluster rather) you could: `let character = string[string.index(after: string.startIndex)]` or `let secondCharacter = string[string.index(string.startIndex, offsetBy: 1)]` – Paul B Sep 12 '19 at 12:32
  • Something I learned that's important here: There is a difference between "Is it an emoji?", and "Will this character be presented as an image or as text?" I think when many people say "Is this an emoji?" they *actually* want to know if it'll be presented as an image. As an example, the digit characters (e.g. "5") ARE emoji! BUT! ... they are presented as text by default. There are other emoji whose characters are text by default on iOS, but will almost definitely have an image variation selector if the user has input them, so will usually be shown in image form, but not by default. – Graham Lea Jul 19 '23 at 04:12

18 Answers

324

What I stumbled upon is the difference between characters, unicode scalars and glyphs.

For example, the glyph ‍‍‍ consists of 7 unicode scalars:

  • Four emoji characters:
  • In between each emoji is a special character, which works like character glue; see the specs for more info

Another example, the glyph consists of 2 unicode scalars:

  • The regular emoji:
  • A skin tone modifier:

Last one, the glyph 1️⃣ contains three Unicode scalars:

  • The digit character: 1
  • The variation selector (U+FE0F)
  • The combining enclosing keycap (U+20E3)

So when rendering the characters, the resulting glyphs really matter.
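To make the difference between characters and scalars concrete, here is a small sketch that prints both counts for the three kinds of glyph described above. The sample strings are my own choice, written with explicit scalar escapes so their contents are unambiguous; it assumes Swift 5 grapheme breaking.

```swift
// Each literal spells out its scalars explicitly.
// man, ZWJ, woman, ZWJ, girl, ZWJ, boy
let familyGlyph = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}"
// thumbs up + medium skin tone modifier
let skinToneGlyph = "\u{1F44D}\u{1F3FD}"
// digit one + variation selector-16 + combining enclosing keycap
let keycapGlyph = "1\u{FE0F}\u{20E3}"

for sample in [familyGlyph, skinToneGlyph, keycapGlyph] {
    // Print every scalar as an uppercase hex code point.
    let scalars = sample.unicodeScalars.map { String($0.value, radix: 16, uppercase: true) }
    print("characters: \(sample.count), scalars: \(scalars)")
}
```

Each sample is a single Character, yet holds 7, 2 and 3 scalars respectively.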

Swift 5.0 and above makes this process much easier and gets rid of some guesswork we needed to do. Unicode.Scalar's new Properties type helps us determine what we're dealing with. However, those properties only make sense when checked together with the other scalars within the glyph. This is why we'll add some convenience properties to the Character type to help us out.

For more detail, I wrote an article explaining how this works.

For Swift 5.0, this leaves you with the following result:

extension Character {
    /// A simple emoji is one scalar and presented to the user as an Emoji
    var isSimpleEmoji: Bool {
        guard let firstScalar = unicodeScalars.first else { return false }
        return firstScalar.properties.isEmoji && firstScalar.value > 0x238C
    }

    /// Checks if the scalars will be merged into an emoji
    var isCombinedIntoEmoji: Bool { unicodeScalars.count > 1 && unicodeScalars.first?.properties.isEmoji ?? false }

    var isEmoji: Bool { isSimpleEmoji || isCombinedIntoEmoji }
}

extension String {
    var isSingleEmoji: Bool { count == 1 && containsEmoji }

    var containsEmoji: Bool { contains { $0.isEmoji } }

    var containsOnlyEmoji: Bool { !isEmpty && !contains { !$0.isEmoji } }

    var emojiString: String { emojis.map { String($0) }.reduce("", +) }

    var emojis: [Character] { filter { $0.isEmoji } }

    var emojiScalars: [UnicodeScalar] { filter { $0.isEmoji }.flatMap { $0.unicodeScalars } }
}

Which will give you the following results:

"A̛͚̖".containsEmoji // false
"3".containsEmoji // false
"A̛͚̖▶️".unicodeScalars // [65, 795, 858, 790, 9654, 65039]
"A̛͚̖▶️".emojiScalars // [9654, 65039]
"3️⃣".isSingleEmoji // true
"3️⃣".emojiScalars // [51, 65039, 8419]
"".isSingleEmoji // true
"‍♂️".isSingleEmoji // true
"".isSingleEmoji // true
"⏰".isSingleEmoji // true
"".isSingleEmoji // true
"‍‍‍".isSingleEmoji // true
"".isSingleEmoji // true
"".containsOnlyEmoji // true
"‍‍‍".containsOnlyEmoji // true
"Hello ‍‍‍".containsOnlyEmoji // false
"Hello ‍‍‍".containsEmoji // true
" Héllo ‍‍‍".emojiString // "‍‍‍"
"‍‍‍".count // 1

" Héllœ ‍‍‍".emojiScalars // [128107, 128104, 8205, 128105, 8205, 128103, 8205, 128103]
" Héllœ ‍‍‍".emojis // ["", "‍‍‍"]
" Héllœ ‍‍‍".emojis.count // 2

"‍‍‍‍‍".isSingleEmoji // false
"‍‍‍‍‍".containsOnlyEmoji // true

For older Swift versions, check out this gist containing my old code.

Kevin R
  • 8
    This is by far the best and most correct answer here. Thank you! One small note, your examples don't match the code (you renamed containsOnlyEmoki to containsEmoji in the snippet - I presume because it's more correct, in my testing it returned true for strings with mixed characters). – Tim Bull Sep 29 '16 at 22:00
  • Thanks for pointing that out. I forgot to add some code to the example. I added the `containsOnlyEmoji` function. This one does check if the string only consists of emoji's or zero width joiner. – Kevin R Sep 30 '16 at 11:54
  • I'm getting an error on `count` under `containsOnlhEmoji`. Not sure what that value was supposed to be? – Andrew Oct 01 '16 at 01:17
  • 3
    My bad, I changed around some code, guess I messed up. I updated the example – Kevin R Oct 01 '16 at 10:36
  • @KevinR Awesome. I've changed this to the best answer. Would there be a way to get an array of emoji strings in a string/emoji-only string? – Andrew Oct 14 '16 at 08:21
  • 2
    @Andrew: Sure, I added another method to the example to demonstrate this :). – Kevin R Oct 14 '16 at 15:38
  • 1
    @KevinR Thanks. I think the result I'd be after would be that a string such as "‍‍‍" would be separated into separate glyphs, resulting in ["", "‍‍‍", ""]. Do you know if that's possible? – Andrew Oct 15 '16 at 18:14
  • 2
    @Andrew this is where it gets really messy. I added an example how to do that. The problem is I have assume to know how CoreText will render the the glyphs by simply checking the characters. If anyone has suggestions for a cleaner method please let me know. – Kevin R Oct 16 '16 at 06:56
  • 1
    @KevinR This is great. 1 issue i've noticed though, is that `containsOnlyEmoji` doesn't seem to work with some emoji, for example the one called 'smiling face' - ☺️. – Andrew Oct 28 '16 at 10:59
  • 3
    @Andrew Thanks for pointing that out, I changed the way `containsOnlyEmoji` checks. I also updated the example to Swift 3.0. – Kevin R Oct 29 '16 at 08:41
  • @KevinR So now that works, but as a result of calling `emojis` I now get ["☺️", ""], ie a blank second item. – Andrew Oct 31 '16 at 11:17
  • @KevinR I've noticed a number of problems here, trying to come up with a solution for this. On calling `emojis` with the smiling face emoji, it returns an array containing that emoji, plus a seemingly empty space. Comparing a string with the regular smiling face emoji to the first item in that given result returns false, on comparison. The empty-looking string has a character count of 1. And trying to fetch the range of the smiling face emoji from the `emojis` result, in the original given string, returns nil. – Andrew Nov 05 '16 at 11:13
  • Where does the enum values come from? – netdigger Mar 23 '17 at 16:27
  • This code does not work for newer emojis and those with diversity options. If you are using Objective C, you can use`enumerateSubstringsInRange` passing in `NSStringEnumerationByComposedCharacterSequences` as the enumeration option. – RunLoop Mar 29 '17 at 08:48
  • Hi @RunLoop; could you propose an edit to the answer to clarify that? I think a lot of people would benefit from that :). – Kevin R Mar 29 '17 at 10:07
  • Hi @KevinR Thanks for your original, excellent answer - I still use it to detect whether the string does in fact contain an emoji before enumerating the substrings. You are more than welcome to incorporate my addition to your answer. :) – RunLoop Mar 29 '17 at 10:18
  • 1
    I had problems trying to filter out emoji from a string. Here's using Kevin's answer - thanks @KevinR for your answer. `extension String { func stripEmoji() -> String { return self.unicodeScalars.filter({ $0.isEmoji == false }).map({ String($0) }).reduce("", +) } }` – xaphod Jun 02 '17 at 02:07
  • I notice containsOnlyEmoji() doesn't work in detecting the number emoji's 0️⃣ through 9️⃣. Any ideas for a fix? – vikzilla Jun 08 '17 at 14:08
  • As @RunLoop stated, this does not work for newer emojis and those with diversity options. Has anyone found a Swift solution for this? I'm stumped. – justColbs Jun 08 '17 at 23:57
  • @vikzilla This is because 0️⃣ is really 3 characters, a 'normal' `0`, a 'variation selector' (see: https://unicode-table.com/en/search/?q=65039) and a bounding box (see: https://unicode-table.com/en/search/?q=8419) since the first character is not a emoji. We'd have to use the superseding character to determine it's characteristics. I'm trying to find some time to add this, but feel free to suggest an edit:). – Kevin R Jun 09 '17 at 07:07
  • @KevinR It still returns the male version for some variation emojis. For example `["", "️‍♀", "️"]` is returned when calling `.emojis` on `"️‍♀️"` – justColbs Jun 10 '17 at 15:16
  • @justColbs The mentioned icon doesn't work on my iOS device, so I guess it's rather new? If you run this on playgrounds: `let s = "️‍♀️".unicodeScalars.map({$0})` and inspect `s`, you see the involved characters and get a sense of the problem. You can copy each value of the array onto https://unicode-table.com/ search bar and see what it means. This one seems quite complicated ;). – Kevin R Jun 13 '17 at 06:28
  • @KevinR Yeah I performed the same test and noticed that. It's mostly the complex variation emojis that it has trouble with. – justColbs Jun 14 '17 at 12:56
  • I think these ranges are subject to change as the Unicode standard changes. Or at least I haven't seen anything suggesting they stay constant. – sudo Oct 26 '17 at 18:31
  • 1
    @sudo certainly, as the spec expands, more ranges are added to these lists, thats one of the reasons these answers have different lists. If you catch any missing ranges, feel free to contribute :) – Kevin R Oct 27 '17 at 07:30
  • 1
    Is it just me or has a lot of above been broken with swift 4? characters.count only gives 1 now with swift 4. – Warpzit Oct 30 '17 at 12:43
  • 1
    There are still few more emojis which aren't recognized by the extension: ⏰ P.S. I've checked on the "/System/Library/PrivateFrameworks/CoreEmoji.framework/Resources/en.lproj/FindReplace.strings" which seems to contain text description for most of emoji (all of them?) – skyylex Jan 01 '18 at 19:43
  • Broken in Swift 4 – Anton Shkurenko May 11 '18 at 11:37
  • @AntonShkurenko seems to work fine for me, what seems to be the problem? – Kevin R Jul 03 '18 at 13:40
  • Just a small note, this implementation is very slow at compiling. Checking with `-Xfrontend -debug-time-function-bodies` it's around 1800ms (so nearly two seconds) – Claus Jørgensen Feb 20 '19 at 11:02
  • How do you deal with "3" and "#" which both evaluate to `true` for `isEmoji` – Miniroo Aug 07 '19 at 23:00
  • 3
    I added also the comparison: `$0.properties.generalCategory == .otherSymbol` to make it work for more emojis, like ⏰, , etc – vicegax Oct 31 '19 at 12:45
  • 1
    Thanks for this code! I noticed that flags are not detected has emojis because they are a combination of emojis without join control so I changed `isSimpleEmoji` implementation to be `unicodeScalars.allSatisfy({ $0.properties.isEmojiPresentation })`. – Sparga Dec 10 '19 at 19:34
  • @Sparga Thanks! I added your check to `isCombinedIntoEmoji`, also because some emoji like '' would break otherwise. – Kevin R Dec 11 '19 at 06:55
  • @KevinR Thanks for the great answer! I found another case when `isCombinedIntoEmoji` falls short, that is country subdivision flags like ,. I added another condition for this case which is `... || unicodeScalars.first!.properties.isEmojiPresentation && unicodeScalars.dropFirst().allSatisfy { $0.properties.generalCategory == .format }` You can learn more on how these emojis are constructed here, see scotland flag example: https://blog.emojipedia.org/emoji-flags-explained/ – Yuriy Pavlyshak Jan 22 '20 at 11:28
  • @YuriyPavlyshak thanks! Sorry it took me a while to get around to it, but I updated the code! – Kevin R Mar 24 '20 at 14:41
  • NOTE that: 'isEmoji' is only available in iOS 10.2 or newer – boog Jun 12 '20 at 10:57
  • 1
    It looks that ⌚ is not marked as an emoji. I tested this emoji set: https://stackoverflow.com/a/60565823/1054550 – goodliving Jun 17 '21 at 17:52
  • the first unicode scalar of "⌚" is `0x231A` which is smaller than `0x238C` – Joey Aug 15 '22 at 05:25
63

The simplest, cleanest, and swiftiest way to accomplish this is to simply check the Unicode code points for each character in the string against known emoji and dingbats ranges, like so:

extension String {

    var containsEmoji: Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x1F600...0x1F64F, // Emoticons
                 0x1F300...0x1F5FF, // Misc Symbols and Pictographs
                 0x1F680...0x1F6FF, // Transport and Map
                 0x2600...0x26FF,   // Misc symbols
                 0x2700...0x27BF,   // Dingbats
                 0xFE00...0xFE0F,   // Variation Selectors
                 0x1F900...0x1F9FF, // Supplemental Symbols and Pictographs
                 0x1F1E6...0x1F1FF: // Flags
                return true
            default:
                continue
            }
        }
        return false
    }

}
Arnold
  • 11
    A code example like this is way better than suggesting to include a third party library dependency. Shardul's answer is unwise advice to follow—always write your own code. – thefaj Mar 31 '16 at 23:24
  • This is great, thank you for commenting what the cases pertain to – Shawn Throop Apr 29 '16 at 09:20
  • 2
    Like so much your code, I implemented it in an answer [here](http://stackoverflow.com/questions/37766611/how-to-replace-emoji-characters-with-their-descriptions-in-a-swift-string). A thing I noticed is that it miss some emoji, maybe because they are not part of the categories you listed, for example this one: Robot Face emoji – Cue Jun 13 '16 at 19:24
  • 1
    @Tel I guess it would be the range `0x1F900...0x1F9FF` (per Wikipedia). Not sure all of the range should be considered emoji. – Frizlab Aug 14 '16 at 13:53
  • Thank you. This is great. Is there an update range of new emojis? – jonchoi May 25 '22 at 21:02
24

Swift 5.0

… introduced a new way of checking exactly this!

You have to break your String into its Scalars. Each Scalar has a properties value (of type Unicode.Scalar.Properties) that exposes isEmoji!

You can even check whether the Scalar is an emoji modifier, and more. Check out Apple's documentation: https://developer.apple.com/documentation/swift/unicode/scalar/properties

You may want to consider checking for isEmojiPresentation instead of isEmoji, because Apple states the following for isEmoji:

This property is true for scalars that are rendered as emoji by default and also for scalars that have a non-default emoji rendering when followed by U+FE0F VARIATION SELECTOR-16. This includes some scalars that are not typically considered to be emoji.


This approach splits emoji into all their modifiers, but it is much simpler to handle. And since Swift now counts an emoji with modifiers (e.g.: ‍‍‍, ‍, ) as 1 Character, you can do all kinds of things.

var string = " test"

for scalar in string.unicodeScalars {
    let isEmoji = scalar.properties.isEmoji

    print("\(scalar.description) \(isEmoji)")
}

//  true
//   false
// t false
// e false
// s false
// t false
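To illustrate the caveat about isEmoji quoted above, here is a minimal sketch that compares isEmoji and isEmojiPresentation for three scalars. The sample scalars and their labels are my own picks for illustration.

```swift
// A digit, a text-default symbol, and an emoji-default symbol.
let samples: [(name: String, scalar: Unicode.Scalar)] = [
    ("digit five", "5"),
    ("heavy black heart", "\u{2764}"),
    ("grinning face", "\u{1F600}"),
]

for (name, scalar) in samples {
    let properties = scalar.properties
    print("\(name): isEmoji=\(properties.isEmoji) isEmojiPresentation=\(properties.isEmojiPresentation)")
}
```

All three report isEmoji as true, but only the grinning face reports isEmojiPresentation as true, so neither property alone tells you how a character will be rendered.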

NSHipster points out an interesting way to collect all emoji scalars:

import Foundation

var emoji = CharacterSet()

// 0x10FFFF is the highest valid Unicode code point
for codePoint in 0x0000...0x10FFFF {
    guard let scalarValue = Unicode.Scalar(codePoint) else {
        continue
    }

    // Implemented in Swift 5 (SE-0221)
    // https://github.com/apple/swift-evolution/blob/master/proposals/0221-character-properties.md
    if scalarValue.properties.isEmoji {
        emoji.insert(scalarValue)
    }
}
Alexander Khitev
alexkaessner
  • 1
    Great answer, thanks. It's worth mentioning that your min sdk must be 10.2 to use this part of Swift 5. Also in order to check if a string was only made up of emojis I had to check if it had one of these properties: `scalar.properties.isEmoji scalar.properties.isEmojiPresentation scalar.properties.isEmojiModifier scalar.properties.isEmojiModifierBase scalar.properties.isJoinControl scalar.properties.isVariationSelector` – A Springham Jul 16 '19 at 09:26
  • 10
    Beware, integers 0-9 are considered emojis. So `"6".unicodeScalars.first!.properties.isEmoji` will evaluate as `true` – Miniroo Aug 06 '19 at 20:01
  • 2
    There are other characters like `#` and `*` that will also return true for the `isEmoji` check. `isEmojiPresentation` seems to work better, at least it returns `false` for `0...9`, `#`, `*` and any other symbol I could try on an English-US keyboard. Anyone has more experience with it and knows if it can be trusted for input validation? – Jan Jan 03 '21 at 18:14
  • 3
    ❤️ has two scalars. First scalar's `isEmoji` is `true`, but `isEmojiPresentation` is `false`. Second scalar's will only return `true` for `isVariationSelector`. So doesn't look like a straight forward way to understand what's an emoji – zh. Apr 15 '21 at 10:46
  • Why does your code point loop top out at `0x1F0000`? The highest legal Unicode code point (scalar) value is `0x10FFFF`. So in the above loop the `guard` statement and its unsuccessful attempts to construct a Unicode.Scaler() is continuing the loop unnecessarily 917,505 times. Or perhaps you meant `break` rather than `continue`. What am I missing? – jsbox May 03 '22 at 19:21
12

With Swift 5 you can now inspect the Unicode properties of each scalar in your string. This gives us the convenient isEmoji property. The problem is that isEmoji returns true for any scalar that can be turned into an emoji by appending a modifier, such as the digits 0-9.

We can look at isEmoji and also check for the presence of an emoji modifier to determine whether the ambiguous characters will display as an emoji.

This solution should be much more future proof than the regex solutions offered here.

extension String {
    func containsEmoji() -> Bool {
        contains { $0.isEmoji }
    }

    func containsOnlyEmojis() -> Bool {
        return count > 0 && !contains { !$0.isEmoji }
    }
}

extension Character {
    // An emoji can either be a 2 byte unicode character or a normal UTF8 character with an emoji modifier
    // appended as is the case with 3️⃣. 0x203C is the first instance of UTF16 emoji that requires no modifier.
    // `isEmoji` will evaluate to true for any character that can be turned into an emoji by adding a modifier
    // such as the digit "3". To avoid this we confirm that any character below 0x203C has an emoji modifier attached
    var isEmoji: Bool {
        guard let scalar = unicodeScalars.first else { return false }
        return scalar.properties.isEmoji && (scalar.value >= 0x203C || unicodeScalars.count > 1)
    }
}

Giving us

"hey".containsEmoji() //false

"Hello World ".containsEmoji() //true
"Hello World ".containsOnlyEmojis() //false

"3".containsEmoji() //false
"3️⃣".containsEmoji() //true
Marián Černý
Miniroo
  • 1
    And what's more is `Character("3️⃣").isEmoji // true` while `Character("3").isEmoji // false` – Paul B Sep 12 '19 at 13:01
  • I think the first UTF16 emoji without modifier is 0x203C (double exclamation mark) and not 0x238C. I also think you should be comparing with >= and not >. – Marián Černý Nov 15 '22 at 20:50
8
extension String {
    func containsEmoji() -> Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x3030, 0x00AE, 0x00A9,// Special Characters
            0x1D000...0x1F77F,          // Emoticons
            0x2100...0x27BF,            // Misc symbols and Dingbats
            0xFE00...0xFE0F,            // Variation Selectors
            0x1F900...0x1F9FF:          // Supplemental Symbols and Pictographs
                return true
            default:
                continue
            }
        }
        return false
    }
}

This is my fix, with updated ranges.

7

Swift 5 solution using Scalars that works on text, smiley faces , heart emoji ❤️❤️‍ and numbers 0️⃣ 1 2 3 etc

Swift 5 Scalars have isEmoji and isEmojiPresentation properties that will help to find emoji in particular String.

isEmoji - Boolean value indicating whether the scalar has an emoji presentation, whether or not it is the default.

isEmojiPresentation - A Boolean value indicating whether the scalar is one that should be rendered with an emoji presentation, rather than a text presentation, by default.

As you can see from these definitions, we cannot just use isEmoji or isEmojiPresentation on the scalars of the string: neither tells us whether a scalar is really an emoji.

Luckily Apple gave us a clue:

testing isEmoji alone on a single scalar is insufficient to determine if a unit of text is rendered as an emoji; a correct test requires inspecting multiple scalars in a Character. In addition to checking whether the base scalar has isEmoji == true, you must also check its default presentation (see isEmojiPresentation) and determine whether it is followed by a variation selector that would modify the presentation.

So finally here is my implementation that works on numbers, smiley faces , text and ❤️ symbols:

import Foundation

extension String {

    func containsEmoji() -> Bool {
        
        for character in self {
            var shouldCheckNextScalar = false
            for scalar in character.unicodeScalars {
                if shouldCheckNextScalar {
                    if scalar == "\u{FE0F}" { // scalar that indicates that character should be displayed as emoji
                        return true
                    }
                    shouldCheckNextScalar = false
                }
                
                if scalar.properties.isEmoji {
                    if scalar.properties.isEmojiPresentation {
                        return true
                    }
                    shouldCheckNextScalar = true
                }
            }
        }
        
        return false
    }
    
}

Tests:

"hello ❤️".containsEmoji()   // true
"1234567890".containsEmoji() // false
"numero 0️⃣".containsEmoji()  // true
"abcde".containsEmoji()      // false
"panda ".containsEmoji()   // true
Stacy Smith
  • I think this solution is close to ideal, but shouldn't you also be checking that the text variation character (`\u{FE0E}`) *isn't* in the scalars? (Which would also mean that your first test case would be `false`, because that heart I see is the non-emoji version of ❤️ ?) – Graham Lea Jul 19 '23 at 04:35
4

Swift 3 Note:

It appears the cnui_containsEmojiCharacters method has either been removed or moved to a different dynamic library. _containsEmoji should still work though.

let str: NSString = "hello"

@objc protocol NSStringPrivate {
    func _containsEmoji() -> ObjCBool
}

let strPrivate = unsafeBitCast(str, to: NSStringPrivate.self)
strPrivate._containsEmoji() // true
str.value(forKey: "_containsEmoji") // 1


let swiftStr = "hello"
(swiftStr as AnyObject).value(forKey: "_containsEmoji") // 1

Swift 2.x:

I recently discovered a private API on NSString which exposes functionality for detecting if a string contains an Emoji character:

let str: NSString = "hello"

With an objc protocol and unsafeBitCast:

@objc protocol NSStringPrivate {
    func cnui_containsEmojiCharacters() -> ObjCBool
    func _containsEmoji() -> ObjCBool
}

let strPrivate = unsafeBitCast(str, NSStringPrivate.self)
strPrivate.cnui_containsEmojiCharacters() // true
strPrivate._containsEmoji() // true

With valueForKey:

str.valueForKey("cnui_containsEmojiCharacters") // 1
str.valueForKey("_containsEmoji") // 1

With a pure Swift string, you must cast the string as AnyObject before using valueForKey:

let str = "hello"

(str as AnyObject).valueForKey("cnui_containsEmojiCharacters") // 1
(str as AnyObject).valueForKey("_containsEmoji") // 1

Methods found in the NSString header file.

JAL
4

There is already a nice solution for this task. However, checking the Unicode.Scalar.Properties of Unicode scalars works well for a single Character but is not flexible enough for Strings.

We can use regular expressions instead, which is a more universal approach. The solution comes first, followed by a detailed description of how it works.

The Solution

In Swift you can check, whether a String is a single Emoji character, using an extension with such a computed property:

extension String {

    var isSingleEmoji : Bool {
        if self.count == 1 {
            let emojiGlyphPattern = "\\p{RI}{2}|(\\p{Emoji}(\\p{EMod}|\\x{FE0F}\\x{20E3}?|[\\x{E0020}-\\x{E007E}]+\\x{E007F})|[\\p{Emoji}&&\\p{Other_symbol}])(\\x{200D}(\\p{Emoji}(\\p{EMod}|\\x{FE0F}\\x{20E3}?|[\\x{E0020}-\\x{E007E}]+\\x{E007F})|[\\p{Emoji}&&\\p{Other_symbol}]))*"

            let fullRange = NSRange(location: 0, length: self.utf16.count)
            if let regex = try? NSRegularExpression(pattern: emojiGlyphPattern, options: .caseInsensitive) {
                let regMatches = regex.matches(in: self, options: NSRegularExpression.MatchingOptions(), range: fullRange)
                if regMatches.count > 0 {
                    // if any range found — it means, that that single character is emoji
                    return true
                }
            }
        }
        return false
    }

}

How it works (in details)

A single Emoji (a glyph) can be reproduced by a number of different symbols, sequences and their combinations. Unicode specification defines several possible Emoji character representations.

Single-Character Emoji

An Emoji character reproduced by a single Unicode Scalar.

Unicode defines Emoji Character as:

emoji_character := \p{Emoji}

But it doesn’t necessarily mean that such a character will be drawn as an Emoji. An ordinary numeric symbol “1” has Emoji property being true, though it still might be drawn as text. And there is a list of such symbols: #, ©, 4, etc.

One might think that we can check an additional property, “Emoji_Presentation”. But it doesn’t work like that: there are emoji, like or , whose Emoji_Presentation property is false.

To make sure, that the character is drawn as Emoji by default, we should check its category: it should be “Other_symbol”.

So, in fact regular expression for Single-Character Emoji should be defined as:

emoji_character := \p{Emoji}&&\p{Other_symbol}

Emoji Presentation Sequence

A character that can normally be drawn either as text or as an emoji. Its appearance depends on a special following symbol, a presentation selector, which indicates its presentation type. \x{FE0E} defines text representation. \x{FE0F} defines emoji representation.

The list of such symbols can be found [here](https://unicode.org/Public/emoji/12.1/emoji-variation-sequences.txt).

Unicode defines presentation sequence like this:

emoji_presentation_sequence := emoji_character emoji_presentation_selector

Regular expression sequence for it:

emoji_presentation_sequence := \p{Emoji} \x{FE0F}
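As a cross-check without regular expressions, the same rule can be sketched with Swift 5 scalar properties. The property name isEmojiPresentationSequence is a hypothetical helper of mine, not an existing API, and it only covers the two-scalar form defined above.

```swift
extension Character {
    /// Hypothetical helper: true when the character is exactly one emoji-capable
    /// scalar followed by U+FE0F (VARIATION SELECTOR-16).
    var isEmojiPresentationSequence: Bool {
        let scalars = Array(unicodeScalars)
        return scalars.count == 2
            && scalars[0].properties.isEmoji
            && scalars[1].value == 0xFE0F
    }
}

let emojiStyleHeart = "\u{2764}\u{FE0F}" // heavy black heart + emoji presentation selector
let textStyleHeart = "\u{2764}\u{FE0E}"  // heavy black heart + text presentation selector

print(emojiStyleHeart.first!.isEmojiPresentationSequence)
print(textStyleHeart.first!.isEmojiPresentationSequence)
```

The first heart qualifies and the second does not, because only U+FE0F requests emoji presentation.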

Emoji Keycap Sequence

The sequence closely resembles a Presentation sequence, but it has an additional scalar at the end: \x{20E3}. The set of possible base scalars is rather narrow: 0-9, # and *, and that’s all. Examples: 1️⃣, 8️⃣, *️⃣.

Unicode defines keycap sequence like this:

emoji_keycap_sequence := [0-9#*] \x{FE0F 20E3}

Regular expression for it:

emoji_keycap_sequence := \p{Emoji} \x{FE0F} \x{20E3}
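A keycap sequence can also be recognised directly from the scalars, following the Unicode definition above (base, U+FE0F, U+20E3). This is only a sketch; isKeycapSequence is a hypothetical name introduced for illustration.

```swift
extension Character {
    /// Hypothetical helper: true for [0-9#*] followed by U+FE0F and U+20E3.
    var isKeycapSequence: Bool {
        let scalars = Array(unicodeScalars)
        guard scalars.count == 3 else { return false }
        let keycapBases = "0123456789#*".unicodeScalars
        return keycapBases.contains(scalars[0])
            && scalars[1].value == 0xFE0F   // variation selector-16
            && scalars[2].value == 0x20E3   // combining enclosing keycap
    }
}

let keycapThree = "3\u{FE0F}\u{20E3}"
let plainThree = "3"

print(keycapThree.first!.isKeycapSequence)
print(plainThree.first!.isKeycapSequence)
```

Checking the base against the literal set [0-9#*] is stricter than the \p{Emoji} used in the regular expression, which accepts any emoji-capable scalar in that position.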

Emoji Modifier Sequence

Some emoji can have a modified appearance, such as a skin tone. For example, the emoji can take different forms: . Such an emoji, called an “Emoji_Modifier_Base” in this case, is altered by a subsequent “Emoji_Modifier”.

In general such sequence looks like this:

emoji_modifier_sequence := emoji_modifier_base emoji_modifier

To detect it we can search for a regular expression sequence:

emoji_modifier_sequence := \p{Emoji} \p{EMod}
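For comparison, Swift 5 exposes both halves of this definition as scalar properties (isEmojiModifierBase and isEmojiModifier). The sketch below uses them instead of a regular expression; isEmojiModifierSequence is a hypothetical helper name, and the sample scalars are my own choice.

```swift
extension Character {
    /// Hypothetical helper: true for an Emoji_Modifier_Base scalar
    /// immediately followed by an Emoji_Modifier scalar.
    var isEmojiModifierSequence: Bool {
        let scalars = Array(unicodeScalars)
        return scalars.count == 2
            && scalars[0].properties.isEmojiModifierBase
            && scalars[1].properties.isEmojiModifier
    }
}

let tonedThumbsUp = "\u{1F44D}\u{1F3FD}" // thumbs up + medium skin tone
let plainThumbsUp = "\u{1F44D}"          // thumbs up on its own

print(tonedThumbsUp.first!.isEmojiModifierSequence)
print(plainThumbsUp.first!.isEmojiModifierSequence)
```

Unlike the \p{Emoji} \p{EMod} pattern, this requires the first scalar to be a modifier base, which is closer to the Unicode definition quoted above.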

Emoji Flag Sequence

Flags are Emojis with their particular structure. Each flag is represented with two “Regional_Indicator” symbols.

Unicode defines them like:

emoji_flag_sequence := regional_indicator regional_indicator

For example, the flag of Ukraine is in fact represented by two scalars: \u{0001F1FA} \u{0001F1E6}

Regular expression for it:

emoji_flag_sequence := \p{RI}{2}
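The flag rule is simple enough to sketch without a regular expression: both scalars must fall in the Regional Indicator range U+1F1E6 to U+1F1FF. The names isRegionalIndicatorSymbol and isFlagSequence are hypothetical helpers, not existing APIs.

```swift
extension Unicode.Scalar {
    /// Hypothetical helper: true for REGIONAL INDICATOR SYMBOL LETTER A through Z.
    var isRegionalIndicatorSymbol: Bool { (0x1F1E6...0x1F1FF).contains(value) }
}

extension Character {
    /// Hypothetical helper: true when the character is a pair of regional indicators.
    var isFlagSequence: Bool {
        unicodeScalars.count == 2 && unicodeScalars.allSatisfy { $0.isRegionalIndicatorSymbol }
    }
}

let ukraineFlag = "\u{1F1FA}\u{1F1E6}" // regional indicators U + A
let plainLetters = "UA"

print(ukraineFlag.first!.isFlagSequence)
print(plainLetters.first!.isFlagSequence)
```

Like \p{RI}{2}, this accepts any pair of regional indicators, whether or not the pair corresponds to a real region code.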

Emoji Tag Sequence (ETS)

A sequence which uses a so-called tag_base, which is followed by a custom tag specification composed from range of symbols \x{E0020}-\x{E007E} and concluded by tag_end mark \x{E007F}.

Unicode defines it like this:

emoji_tag_sequence := tag_base tag_spec tag_end
tag_base           := emoji_character
                    | emoji_modifier_sequence
                    | emoji_presentation_sequence
tag_spec           := [\x{E0020}-\x{E007E}]+
tag_end            := \x{E007F}

Strange thing is that Unicode allows tag to be based on emoji_modifier_sequence or emoji_presentation_sequence in ED-14a. But at the same time in regular expressions provided at the same documentation they seem to check the sequence based on a single Emoji character only.

In Unicode 12.1 Emoji list there are only three such Emojis defined. All of them are flags of the UK countries: England , Scotland and Wales . And all of them are based on a single Emoji character. So, we’d better check for such a sequence only.

Regular expression:

\p{Emoji} [\x{E0020}-\x{E007E}]+ \x{E007F}

Emoji Zero-Width Joiner Sequence (ZWJ sequence)

A zero-width joiner is a scalar \x{200D}. With its help several characters, which are already Emojis by themselves, can be combined into new ones.

For example, a “family with father, son and daughter” Emoji ‍‍ is reproduced by a combination of father , daughter and son Emojis glued together with ZWJ symbols.

It is allowed to stick together elements, which are Single Emoji characters, Presentation and Modifier sequences.

Regular expression for such sequence in general looks like this:

emoji_zwj_sequence := emoji_zwj_element (\x{200d} emoji_zwj_element )+
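To see the structure of a ZWJ sequence, one can split the scalars on U+200D and inspect the elements. A minimal sketch, assuming a man, girl, boy family as the sample; the variable names are mine.

```swift
let zeroWidthJoiner: Unicode.Scalar = "\u{200D}"

// man, ZWJ, girl, ZWJ, boy
let zwjFamily = "\u{1F468}\u{200D}\u{1F467}\u{200D}\u{1F466}"

// Split the scalar view on the joiner to recover the individual elements.
let zwjElements = zwjFamily.unicodeScalars
    .split(separator: zeroWidthJoiner)
    .map { Array($0) }

print("characters: \(zwjFamily.count), elements: \(zwjElements.count)")
for element in zwjElements {
    print(element.map { String($0.value, radix: 16, uppercase: true) })
}
```

The string is one Character, while the split yields three elements, each a single emoji scalar.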

Regular Expression For All Of Them

All of the mentioned above Emoji representations can be described by a single regular expression:

\p{RI}{2}
| ( \p{Emoji} 
    ( \p{EMod} 
    | \x{FE0F}\x{20E3}? 
    | [\x{E0020}-\x{E007E}]+\x{E007F} 
    ) 
  | 
[\p{Emoji}&&\p{Other_symbol}] 
  )
  ( \x{200D}
    ( \p{Emoji} 
      ( \p{EMod} 
      | \x{FE0F}\x{20E3}? 
      | [\x{E0020}-\x{E007E}]+\x{E007F} 
      ) 
    | [\p{Emoji}&&\p{Other_symbol}] 
    ) 
  )*
Dmytro Babych
3

You can use this code example or this pod.

To use it in Swift, import the category into the YourProject_Bridging_Header

#import "NSString+EMOEmoji.h"

Then you can check the range for every emoji in your String:

let example: NSString = "string‍‍‍withemojis✊" //string with emojis

let containsEmoji: Bool = example.emo_containsEmoji()

print(containsEmoji)

// Output: true

I created a small example project with the code above.

Gabriel.Massana
3

Future Proof: Manually check the character's pixels; the other solutions will break (and have broken) as new emojis are added.

Note: This is Objective-C (can be converted to Swift)

Over the years these emoji-detecting solutions have kept breaking as Apple adds new emojis with new construction methods (like skin-toned emojis, built by combining a character with an additional character).

I finally broke down and just wrote the following method which works for all current emojis and should work for all future emojis.

The solution creates a UILabel with the character and a black background. CG then takes a snapshot of the label and I scan all pixels in the snapshot for any non solid-black pixels. The reason I add the black background is to avoid issues of false-coloring due to Subpixel Rendering

The solution runs VERY fast on my device, I can check hundreds of characters a second, but it should be noted that this is a CoreGraphics solution and should not be used heavily like you could with a regular text method. Graphics processing is data heavy so checking thousands of characters at once could result in noticeable lag.

-(BOOL)isEmoji:(NSString *)character {
    
    UILabel *characterRender = [[UILabel alloc] initWithFrame:CGRectMake(0, 0, 1, 1)];
    characterRender.text = character;
    characterRender.font = [UIFont fontWithName:@"AppleColorEmoji" size:12.0f];//Note: Size 12 font is likely not crucial for this and the detector will probably still work at an even smaller font size, so if you needed to speed this checker up for serious performance you may test lowering this to a font size like 6.0
    characterRender.backgroundColor = [UIColor blackColor];//needed to remove subpixel rendering colors
    [characterRender sizeToFit];
    
    CGRect rect = [characterRender bounds];
    UIGraphicsBeginImageContextWithOptions(rect.size,YES,0.0f);
    CGContextRef contextSnap = UIGraphicsGetCurrentContext();
    [characterRender.layer renderInContext:contextSnap];
    UIImage *capturedImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    
    CGImageRef imageRef = [capturedImage CGImage];
    NSUInteger width = CGImageGetWidth(imageRef);
    NSUInteger height = CGImageGetHeight(imageRef);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    unsigned char *rawData = (unsigned char*) calloc(height * width * 4, sizeof(unsigned char));
    NSUInteger bytesPerPixel = 4;//Note: Alpha Channel not really needed, if you need to speed this up for serious performance you can refactor this pixel scanner to just RGB
    NSUInteger bytesPerRow = bytesPerPixel * width;
    NSUInteger bitsPerComponent = 8;
    CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                                 bitsPerComponent, bytesPerRow, colorSpace,
                                                 kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(colorSpace);
    
    CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
    CGContextRelease(context);
    
    BOOL colorPixelFound = NO;
    
    int x = 0;
    int y = 0;
    while (y < height && !colorPixelFound) {
        while (x < width && !colorPixelFound) {
            
            NSUInteger byteIndex = (bytesPerRow * y) + x * bytesPerPixel;
            
            CGFloat red = (CGFloat)rawData[byteIndex];
            CGFloat green = (CGFloat)rawData[byteIndex+1];
            CGFloat blue = (CGFloat)rawData[byteIndex+2];
            
            CGFloat h, s, b, a;
            UIColor *c = [UIColor colorWithRed:red green:green blue:blue alpha:1.0f];
            [c getHue:&h saturation:&s brightness:&b alpha:&a];//Note: I wrote this method years ago, can't remember why I check HSB instead of just checking r,g,b==0; Upon further review this step might not be needed, but I haven't tested to confirm yet. 
            
            b /= 255.0f;
            
            if (b > 0) {
                colorPixelFound = YES;
            }
            
            x++;
        }
        x=0;
        y++;
    }
    
    free(rawData); // release the pixel buffer allocated with calloc above
    
    return colorPixelFound;
    
}
Community
Albert Renshaw
  • 6
    I like your thinking! ;) - Out of the box! – Ramon Aug 15 '18 at 11:56
  • Why are you doing this to us? #apple #unicodestandard – d4Rk Feb 26 '19 at 15:48
  • I haven't looked at this in a while but I wonder if I have to convert to UIColor then to hsb; it seems I can just check that r,g,b all == 0? If someone tries let me know – Albert Renshaw Feb 26 '19 at 18:24
  • i like this solution, but won't it break with a character like ℹ ? – Juan Carlos Ospina Gonzalez May 16 '19 at 14:25
  • 1
    @JuanCarlosOspinaGonzalez Nope, in emoji that renders as a blue box with a white i. It does bring up a good point though that the UILabel should force the font to be `AppleColorEmoji`, adding that in now as a fail safe, although I think Apple will default it for those anyways – Albert Renshaw May 16 '19 at 20:09
2

For Swift 3.0.2, the following answer is the simplest one:

class func stringContainsEmoji (string : NSString) -> Bool
{
    var returnValue: Bool = false

    string.enumerateSubstrings(in: NSMakeRange(0, (string as NSString).length), options: NSString.EnumerationOptions.byComposedCharacterSequences) { (substring, substringRange, enclosingRange, stop) -> () in

        let objCString:NSString = NSString(string:substring!)
        let hs: unichar = objCString.character(at: 0)
        if 0xd800 <= hs && hs <= 0xdbff
        {
            if objCString.length > 1
            {
                let ls: unichar = objCString.character(at: 1)
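                // Widen to Int before multiplying: the UInt16 product overflows (and traps) for high surrogates >= 0xD840
                let step1: Int = (Int(hs) - 0xd800) * 0x400
                let step2: Int = Int(ls) - 0xdc00
                let uc: Int = step1 + step2 + 0x10000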

                if 0x1d000 <= uc && uc <= 0x1f77f
                {
                    returnValue = true
                }
            }
        }
        else if objCString.length > 1
        {
            let ls: unichar = objCString.character(at: 1)
            if ls == 0x20e3
            {
                returnValue = true
            }
        }
        else
        {
            if 0x2100 <= hs && hs <= 0x27ff
            {
                returnValue = true
            }
            else if 0x2b05 <= hs && hs <= 0x2b07
            {
                returnValue = true
            }
            else if 0x2934 <= hs && hs <= 0x2935
            {
                returnValue = true
            }
            else if 0x3297 <= hs && hs <= 0x3299
            {
                returnValue = true
            }
            else if hs == 0xa9 || hs == 0xae || hs == 0x303d || hs == 0x3030 || hs == 0x2b55 || hs == 0x2b1c || hs == 0x2b1b || hs == 0x2b50
            {
                returnValue = true
            }
        }
    }

    return returnValue;
}
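
The arithmetic in the surrogate branch above is the standard UTF-16 decoding formula. As a standalone sketch (the function name `codePoint(high:low:)` is hypothetical and not part of the answer), it can be checked in isolation like this:

```swift
// Combine a UTF-16 surrogate pair into a Unicode code point.
// Returns nil when the two units do not form a valid pair.
func codePoint(high: UInt16, low: UInt16) -> UInt32? {
    guard (0xD800...0xDBFF).contains(high), (0xDC00...0xDFFF).contains(low) else {
        return nil
    }
    // Widen to UInt32 before multiplying so the arithmetic cannot overflow UInt16.
    return (UInt32(high) - 0xD800) * 0x400 + (UInt32(low) - 0xDC00) + 0x10000
}

// U+1F600 (grinning face) is stored in UTF-16 as the pair D83D DE00.
let units = Array("\u{1F600}".utf16)
if let value = codePoint(high: units[0], low: units[1]) {
    print(String(value, radix: 16, uppercase: true)) // 1F600
}
```

Worked by hand: (0xD83D - 0xD800) * 0x400 = 0xF400, plus (0xDE00 - 0xDC00) = 0x200, plus 0x10000 gives 0x1F600, which falls inside the 0x1d000...0x1f77f range tested in the answer.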
Ankit Goyal
2

This is essentially the same as the earlier answers, but with an updated set of emoji scalar ranges.

extension String {
    func isContainEmoji() -> Bool {
        let isContain = unicodeScalars.first(where: { $0.isEmoji }) != nil
        return isContain
    }
}


extension UnicodeScalar {

    var isEmoji: Bool {
        switch value {
        case 0x1F600...0x1F64F,
             0x1F300...0x1F5FF,
             0x1F680...0x1F6FF,
             0x1F1E6...0x1F1FF,
             0x2600...0x26FF,
             0x2700...0x27BF,
             0xFE00...0xFE0F,
             0x1F900...0x1F9FF,
             65024...65039,
             8400...8447,
             9100...9300,
             127000...127600:
            return true
        default:
            return false
        }
    }

}
1

You can use NSString-RemoveEmoji like this:

if string.isIncludingEmoji {

}
jood
Shardul
1

@StacySmith's answer worked great for me, just wanted to share my own version of it, since all the cool kids are doing it:

extension String.Element {
    var isEmoji: Bool {
        var shouldCheckNextScalar = false
        return unicodeScalars.contains { scalar in
            if shouldCheckNextScalar {
                if scalar == "\u{FE0F}" { // scalar that indicates that character should be displayed as emoji
                    return true
                }
                shouldCheckNextScalar = false
            }

            if scalar.properties.isEmoji {
                if scalar.properties.isEmojiPresentation {
                    return true
                }
                shouldCheckNextScalar = true
            }

            return false
        }
    }
}

extension String {
    var emojiCount: Int {
        reduce(0) { partialResult, character in
            partialResult + (character.isEmoji ? 1 : 0)
        }
    }
}

let test = " hello world ‍‍  ‍♀️ ❤️ 12345"
let count = test.emojiCount // 5
Ruben Martinez Jr.
0

Use the following extensions,

extension Character {
    var isSimpleEmoji: Bool {
        guard let firstScalar = unicodeScalars.first else {
            return false
        }
        return firstScalar.properties.isEmoji && firstScalar.value > 0x238C
    }  

    var isCombinedIntoEmoji: Bool {
        unicodeScalars.count > 1 && unicodeScalars.first?.properties.isEmoji ?? false
    }

    var isEmoji: Bool { isSimpleEmoji || isCombinedIntoEmoji }
}

extension String {
    var containsEmoji: Bool {
        contains(where: { $0.isEmoji })
    }
}

How to use

let str = ""
print(str.containsEmoji) // true

Original answer by reference.

Sreekuttan
  • 1
    from [the docs](https://developer.apple.com/documentation/swift/unicode/scalar/properties/3081577-isemoji): "testing isEmoji alone on a single scalar is insufficient to determine if a unit of text is rendered as an emoji; a correct test requires inspecting multiple scalars in a Character. In addition to checking whether the base scalar has isEmoji == true, you must also check its default presentation (see isEmojiPresentation) and determine whether it is followed by a variation selector that would modify the presentation." – humblehacker Dec 04 '21 at 19:26
  • I have made the changes accordingly @humblehacker – Sreekuttan Dec 05 '21 at 13:07
0
extension String {
    // Returns true if the string contains any non-ASCII character, such as "Á" or "1️⃣"
    var hasRestrictedCharacter: Bool {
        contains { !$0.isASCII }
    }
}

let testChars = " d1/Á‍‍1️⃣"

for char in testChars {
    let value = "\(char)".hasRestrictedCharacter
    print("\(char) : \(value)")
}

//  : false
//d : false
//1 : false
/// : false
//Á : true
// : true
//‍‍ : true
// : true
//1️⃣ : true
-1

I had the same problem and ended up writing String and Character extensions.

The code is too long to post, as it lists all emojis (from the official Unicode list v5.0) in a CharacterSet. You can find it here:

https://github.com/piterwilson/StringEmoji

Constants

let emojiCharacterSet: CharacterSet

Character set containing all known emoji (as described in official Unicode List 5.0 http://unicode.org/emoji/charts-5.0/emoji-list.html)

String

var isEmoji: Bool { get }

Whether or not the String instance represents a known single Emoji character

print("".isEmoji) // false
print("".isEmoji) // true
print("".isEmoji) // false (String is not a single Emoji)
var containsEmoji: Bool { get }

Whether or not the String instance contains a known Emoji character

print("".containsEmoji) // false
print("".containsEmoji) // true
print("".containsEmoji) // true
var unicodeName: String { get }

Applies a kCFStringTransformToUnicodeName - CFStringTransform on a copy of the String

print("á".unicodeName) // \N{LATIN SMALL LETTER A WITH ACUTE}
print("".unicodeName) // "\N{FACE WITH STUCK-OUT TONGUE AND WINKING EYE}"
var niceUnicodeName: String { get }

Returns the result of a kCFStringTransformToUnicodeName - CFStringTransform with \N{ prefixes and } suffixes removed

print("á".niceUnicodeName) // LATIN SMALL LETTER A WITH ACUTE
print("".niceUnicodeName) // FACE WITH STUCK-OUT TONGUE AND WINKING EYE

Character

var isEmoji: Bool { get }

Whether or not the Character instance represents a known Emoji character

print("".isEmoji) // false
print("".isEmoji) // true
-1

Native one line code

"❤️".unicodeScalars.contains { $0.properties.isEmoji } // true

Works from Swift 5.0
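
One caveat worth checking before relying on this: `isEmoji` is also true for scalars such as digits, which are normally displayed as text. Below is a small sketch of a stricter test, assuming the rule quoted from Apple's documentation in the comments above (default emoji presentation, or an explicit U+FE0F variation selector); `isRenderedAsEmoji` is a hypothetical name, not a standard library property.

```swift
// The one-liner accepts plain digits, because "5" has the Emoji property.
let digitIsEmoji = "5".unicodeScalars.contains { $0.properties.isEmoji }

extension Character {
    // Hypothetical stricter check: the first scalar is an emoji that is shown
    // as a picture by default, or a later scalar in the cluster is U+FE0F.
    var isRenderedAsEmoji: Bool {
        guard let first = unicodeScalars.first, first.properties.isEmoji else {
            return false
        }
        return first.properties.isEmojiPresentation
            || unicodeScalars.dropFirst().contains { $0.value == 0xFE0F }
    }
}

print(digitIsEmoji)                      // true
print(Character("5").isRenderedAsEmoji)  // false
```

A heart written as U+2764 followed by U+FE0F passes the stricter check through the variation selector branch, while a bare "5" does not.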

Mojtaba Hosseini