I need to check whether a given string consists only of Unicode emoji. What would be a possible solution?
-
Do you mean emojis, like what Google and Facebook use, or emoticons (plain text smile faces)? – onebree Jul 17 '15 at 19:36
-
@HunterStevens Hey! By emojis I mean Unicode ones. – Dan Jul 17 '15 at 19:41
-
Please edit the question and title to clarify that. The awesome answer below assumes you mean emoticons. – onebree Jul 17 '15 at 19:43
-
@HunterStevens, thanks! I've updated the question. – Dan Jul 17 '15 at 19:52
-
This question covers the same topic: https://stackoverflow.com/questions/24672834/how-do-i-remove-emoji-from-string – Besi Sep 05 '17 at 21:22
1 Answer
Matching a comprehensive set of emoji is difficult. You can use a set of regular expressions to look for 'emoji-like' things, you can have your code wrap every emoji in special start/end marker characters, or you can make a list of all possible emoji and match against it. You likely want to use some kind of regular expression.
If you want to use a regular expression, you can use Active Record's format validator:
class Emoji < ActiveRecord::Base
  # \A and \z anchor the pattern so the whole value must be a single emoticon
  validates :something, format: { with: /\A[:;][\)3\/|\(]\z/,
                                  message: "only allows emojis" }
end
I might have escaped a few things there that Ruby doesn't require escaping, but you get the idea. That regular expression will match one of : or ; followed by one of ), 3, /, | or (, which fit together to make a face. However, lots of emoji are more complicated. This next example compares the string with a list of valid emoji that you have in listOfEmoji. It uses the inclusion validator.
class Emoji < ActiveRecord::Base
  validates :something, inclusion: { in: listOfEmoji,
                                     message: "%{value} is not in the list of valid emoji" }
end
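For Unicode emoji, the same list idea can be tried outside Rails first. This is only a sketch in plain Ruby: LIST_OF_EMOJI and only_listed_emoji? are made-up names, and the four entries are placeholders for whatever list you actually build. It splits the string into grapheme clusters (available since Ruby 2.5) so that an emoji made of several code points is compared as one unit.

```ruby
require "set"

# Hypothetical stand-in for listOfEmoji; a real list would be much longer.
LIST_OF_EMOJI = Set.new([
  "\u{1F600}", # grinning face
  "\u{1F601}", # beaming face with smiling eyes
  "\u{1F602}", # face with tears of joy
  "\u{1F44D}"  # thumbs up
]).freeze

# True only when the string is non-empty and every grapheme cluster
# (user-perceived character) appears in the list.
def only_listed_emoji?(string)
  clusters = string.grapheme_clusters
  !clusters.empty? && clusters.all? { |cluster| LIST_OF_EMOJI.include?(cluster) }
end

puts only_listed_emoji?("\u{1F600}\u{1F44D}")
puts only_listed_emoji?("hi \u{1F600}")
```

A check like this could then be called from a custom validation method on the model instead of the inclusion validator, which only compares the whole value against the list.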
Finally, you could wrap anything that your code identified earlier as an emoji in start and end markers. The markers have to be something that can't appear in an actual emoji, and you then match them with a regular expression. For example, if you replace :) with emoji:)endemoji, you could validate it with a regular expression like this:
class Emoji < ActiveRecord::Base
  validates :something, format: { with: /emoji.+endemoji/,
                                  message: "only allows emojis" }
end
.+ is the regular expression for any character, one or more times, so the pattern matches the special marker that your code wrapped around whatever it knew to be an emoji, and you can use that marker later to identify the emoji. There are better markers to use than emoji and endemoji, though. My favorite is ASCII 7, the character for a typewriter bell!
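Here is a rough sketch of the marker idea with the bell character, in plain Ruby. The names mark_emoji and only_marked_emoji? are invented for illustration, and the anchored pattern is my assumption about what "only emoji" should mean: one or more marked chunks and nothing else.

```ruby
# ASCII 7 (bell) as the start/end marker; it should not occur in normal input.
BELL = "\a"

# Hypothetical helper: wrap every occurrence of a known emoji in bell characters.
def mark_emoji(text, emoji)
  text.gsub(emoji) { |match| "#{BELL}#{match}#{BELL}" }
end

# One or more bell-delimited chunks, with nothing before, between, or after them.
ONLY_MARKED = /\A(?:\x07[^\x07]+\x07)+\z/

def only_marked_emoji?(text)
  ONLY_MARKED.match?(text)
end

puts only_marked_emoji?(mark_emoji(":):)", ":)"))
puts only_marked_emoji?(mark_emoji("hi :)", ":)"))
```

The negated character class keeps one chunk from swallowing the next marker, which a plain .+ would do.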
All of these are possibilities and the best answer depends a lot on how you want to build your code and what you're trying to do.
-
I would suggest also covering a search for Unicode characters, in case the OP means picture-style emoji, not just plain-text smileys. – onebree Jul 17 '15 at 19:37
-
please see [this comment](http://stackoverflow.com/questions/31483078/how-to-create-a-validation-in-rails-that-checks-if-a-string-contains-only-emoji?noredirect=1#comment50932153_31483078) – onebree Jul 17 '15 at 19:42
-
Hey, @JohnKulp, thanks for this comprehensive answer! I'm looking for a pattern that will match Unicode emojis, do you have any ideas about that? – Dan Jul 17 '15 at 19:48
-
@dan, for that you can use a very similar method. Instead of matching a specific word or list, you find the Unicode value of each character and match it against the ranges of emoji in Unicode. This site has a full list of them: http://unicode.org/emoji/charts/full-emoji-list.html – John Kulp Jul 17 '15 at 20:12
-
@dan, the specific syntax that you can use for that is: (string.ord >= 0x1F600 && string.ord <= 0x1F606) and then skip 7, and so on. However, thinking about it again, I wouldn't do this because it could take forever to code. Instead, I'd build a script to comb the page that I linked with the Unicode emoji and make a JSON file with all of the Unicode values that you care about. Afterwards you can load the file with Rails and validate against it like in my second example. – John Kulp Jul 17 '15 at 20:18
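For anyone who wants to try the range comparison anyway, here is a minimal sketch in plain Ruby. The four ranges are whole Unicode blocks that contain many emoji; they are an illustrative assumption, not a complete or exact emoji list, and the method names are made up. Sequences that use a zero-width joiner (U+200D) or a variation selector (U+FE0F) would be rejected by this version.

```ruby
# Illustrative, incomplete set of Unicode blocks that hold many emoji.
# Not every code point in these blocks is an emoji, and many emoji live elsewhere.
EMOJI_RANGES = [
  0x1F600..0x1F64F, # Emoticons
  0x1F300..0x1F5FF, # Miscellaneous Symbols and Pictographs
  0x1F680..0x1F6FF, # Transport and Map Symbols
  0x2600..0x26FF    # Miscellaneous Symbols
].freeze

# True when a single code point falls inside one of the ranges above.
def emoji_codepoint?(codepoint)
  EMOJI_RANGES.any? { |range| range.cover?(codepoint) }
end

# True only when the string is non-empty and every code point is in a range.
def only_emoji?(string)
  !string.empty? && string.each_codepoint.all? { |cp| emoji_codepoint?(cp) }
end

puts only_emoji?("\u{1F600}\u{1F680}")
puts only_emoji?("a\u{1F600}")
```

Loading the ranges or values from a generated file, as suggested above, would only change how EMOJI_RANGES is filled.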