The string in question is something like: Tomask Kassahun, followed by a trailing emoji.
How can I strip out the last emoticon/emoji (whatever it's called), so I just get Tomask Kassahun? Of course, it could also be any other emoticon like a rocket ship.
As of Ruby 3.2.0, Ruby supports a documented \p{Emoji} character property specifically for Unicode emojis. This support was introduced in Onigmo 6.2.0 but was undocumented in Ruby core as recently as Ruby 3.1.2. However, this property has behavior that, while spec-conforming, will unexpectedly remove non-emoji characters, such as digits, from a string. Thus, it is preferable to use the character property \p{Emoji_Presentation} (shorthand \p{EPres}), which is unfortunately undocumented as of this writing. If your Ruby version and/or engine supports it, you can remove just the emojis as in the following examples.
"Tomask Kassahun ".gsub(/\p{Emoji_Presentation}/, '').strip
#=> "Tomask Kassahun"
"Tomask (mɑ̃ʒe) Kassahun ".gsub(/\p{Emoji_Presentation}/, '').strip
#=> "Tomask (mɑ̃ʒe) Kassahun"
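To make the difference between the two properties concrete, here is a minimal sketch. The helper names are made up for illustration, the sample string is my own, and "\u{1F680}" is the rocket emoji written as an escape. It assumes a Ruby whose regex engine recognizes both properties.

```ruby
# Hypothetical helpers for illustration; they are not part of any library.

# Removes everything with the Unicode Emoji property, which includes digits.
def strip_emoji_property(str)
  str.gsub(/\p{Emoji}/, '').strip
end

# Removes only characters that default to emoji presentation.
def strip_emoji_presentation(str)
  str.gsub(/\p{Emoji_Presentation}/, '').strip
end

sample = "Room 42 \u{1F680}" # text, digits, then a rocket emoji

strip_emoji_property(sample)     #=> "Room"
strip_emoji_presentation(sample) #=> "Room 42"
```

The digits disappear in the first call because 0-9 (along with # and *) carry the Emoji property in the Unicode data, since they can form keycap sequences.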
If you are on an older Ruby or one that doesn't support the emoji character property, there are other properties that can also work well. I've described them below.
One possible approach is to strip out Unicode characters in the "Symbol: Other" category, using \p{So} from Ruby's character properties. For example:
"Tomask Kassahun ".gsub(/\p{So}/, '').strip
#=> "Tomask Kassahun"
This even works with strings containing accented characters. For example, borrowing some non-emoji accented characters from another post as a test case:
"Tomask (mɑ̃ʒe) Kassahun ".gsub(/\p{So}/, '').strip
#=> "Tomask (mɑ̃ʒe) Kassahun"
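Since the question only asks about the last emoji, a variation is to anchor the pattern to the end of the string so that symbols elsewhere are left alone. This is only a sketch: the helper name is hypothetical, and it assumes the trailing emoji is a plain "Symbol: Other" character (emoji sequences that include variation selectors or zero-width joiners would need a broader pattern).

```ruby
# Hypothetical helper: removes a trailing run of "Symbol: Other" characters
# and whitespace, leaving the rest of the string untouched.
def strip_trailing_symbols(str)
  str.sub(/[\p{So}\s]+\z/, '')
end

strip_trailing_symbols("Tomask Kassahun \u{1F680}")          #=> "Tomask Kassahun"
strip_trailing_symbols("\u{00A9} Tomask Kassahun \u{1F680}") #=> "© Tomask Kassahun"
```

The second call shows why anchoring can matter: the copyright sign is also in the So category, so an unanchored gsub(/\p{So}/, '') would remove it as well.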
I think this is a good case for a regular expression. I'm not a regex expert, but I think the following expression could be a good starting point.
str = "Tomask Kassahun "
Extract a substring with an element reference (String#[]). If a Regexp is supplied, the matching portion of the string is returned.
str[/^[a-zA-Z]+\s{1}[a-zA-Z]+/] #=> "Tomask Kassahun"
String#match returns a MatchData object (or nil if there is no match), not an array
str.match(/^[a-zA-Z]+\s{1}[a-zA-Z]+/) #=> #<MatchData "Tomask Kassahun">
You can index the MatchData to get the matched text
str.match(/^[a-zA-Z]+\s{1}[a-zA-Z]+/)[0] #=> "Tomask Kassahun"
Check https://ruby-doc.org/core-2.7.2/String.html#method-i-5B-5D
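One limitation of the [a-zA-Z] pattern is that it returns nil for names with accented letters. A possible variation, offered as a sketch and not as a drop-in fix, is to match Unicode letters with \p{L} and allow any number of space-separated words. The helper name is hypothetical, and the pattern still does not handle punctuation or combining marks inside the name.

```ruby
# Hypothetical helper: returns the leading run of whitespace-separated words,
# where a "word" is one or more Unicode letters (\p{L}), or nil if the
# string does not start with a letter.
def leading_name(str)
  str[/\A\p{L}+(?:\s\p{L}+)*/]
end

leading_name("Tomask Kassahun \u{1F680}")        #=> "Tomask Kassahun"
leading_name("Jos\u00E9 \u00C1lvarez \u{1F680}") #=> "José Álvarez"
leading_name("\u{1F680} Tomask")                 #=> nil
```

Because this keeps what looks like a name and discards the rest, it suits input with a predictable shape; for arbitrary text, removing the emoji directly is usually the safer direction.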