I need to somehow remove all characters except emoji from a string in Python.
I really needed an answer to this, so I wrote one myself. Hope someone else finds it useful. This question is a self-answered Q&A and needs no further context, but you're free to add your own answers.
An alternative approach that should support complex graphemes:
a) A solution using the emoji and regex modules:
import emoji
import regex as re
text = "©️ 1️⃣ Hello, world! from in ☝"
graphemes = re.findall(r'\X', text)
result = "".join([grapheme for grapheme in graphemes if emoji.is_emoji(grapheme)])
print(result)
# ©️1️⃣☝
b) Using just the regex module:
import regex as re
text = "©️ 1️⃣ Hello, world! from in ☝"
graphemes = re.findall(r'\X', text)
result = "".join([grapheme for grapheme in graphemes if re.match(r'^\p{Emoji}(\uFE0F\u20E3?|[\p{Emoji}\u200D])*$', grapheme)])
print(result)
# ©️1️⃣☝
Solution 1: You can use the emoji package to extract emojis, as shown below.
import emoji
text = "©️ 1️⃣ Hello, world! from in ☝"
print("".join(_['emoji'] for _ in emoji.emoji_list(text)))
Output
©️1️⃣☝
Solution 2: You can use the demoji package to extract emojis, as shown below. (This solution doesn't preserve the order of the emojis in the given text; use findall_list if you need demoji to keep the order.)
import demoji
text = "©️ 1️⃣ Hello, world! from in ☝"
# findall returns a dict keyed by emoji; join the keys into one string
print("".join(demoji.findall(text)))
Output
©️☝1️⃣
Here's a working solution, although it's not ideal.
It checks the string for English text (this can be changed) and some other characters, and removes them.
There's also a check for whether the string contains only text. You can add other characters to r'[a-zA-Z0-9()_/., ]*$' to exclude them as well.
import re
input_string = " test ♣️ ⚾️ test ⤵️ ⛎"
m = re.compile(r'[a-zA-Z0-9()_/., ]*$')
if m.match(input_string):
    print("Uh oh, contains text only")
else:
    blankPrompt = []
    for i in input_string:
        print(i)
        doAppend = True
        if re.search(r'[^a-zA-Z0-9()_/., ]', i):
            print("safe")
        else:
            print("unsafe")
            doAppend = False
        if doAppend:
            blankPrompt.append(i)
    beautifiedPrompt = ''.join(str(p) for p in blankPrompt)
    print(beautifiedPrompt)
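The character-by-character filter above can probably be condensed into a single regular-expression substitution. The following is only a sketch using the same exclusion class; the names TEXT_CHARS and strip_text_chars are made up for illustration, and the sample string is written with escapes so the code points are explicit.

```python
import re

# Same exclusion class as in the answer above: characters treated as "text".
TEXT_CHARS = re.compile(r"[a-zA-Z0-9()_/., ]")

def strip_text_chars(s):
    """Delete every character matched by TEXT_CHARS and keep the rest."""
    return TEXT_CHARS.sub("", s)

# " test ♣️ ⚾️ test ⤵️ ⛎" spelled out with explicit escapes
sample = " test \u2663\uFE0F \u26BE\uFE0F test \u2935\uFE0F \u26CE"
result = strip_text_chars(sample)
print(result)
```

Like the loop version, this keeps anything that is not in the exclusion class, so punctuation or letters outside that class would survive as well.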
Install the emoji package with pip; then this code should work:
import emoji
text = " Hello, world! "
for char in text:
    # Drop every character that is not an emoji, so only emoji remain
    if not emoji.is_emoji(char):
        text = text.replace(char, '')
print(text)
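One caveat with a per-character check like the loop above: some emoji are made of several code points, so looking at one character at a time can split them apart. The sketch below uses only the standard library to list the code points inside the keycap emoji 1️⃣; the helper name describe_code_points is made up for illustration.

```python
import unicodedata

# The keycap "1" emoji written with explicit escapes:
# digit one, variation selector-16, combining enclosing keycap.
keycap_one = "1\uFE0F\u20E3"

def describe_code_points(text):
    """Return (code point, Unicode name) pairs for each code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

parts = describe_code_points(keycap_one)
for code_point, name in parts:
    print(code_point, name)
```

Because the first code point is an ordinary digit, a filter that inspects single characters has no way to tell it belongs to an emoji, which is why the grapheme-based answers split the text into clusters first.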
Make a list of characters, then loop over the string and replace any character that's in the list with an empty string:
import string
removeList = list(string.ascii_lowercase) + list(string.ascii_uppercase)
someStr = "wifjwifjifj"
for v in someStr:
    if v in removeList:
        # Remove every occurrence of this character
        someStr = someStr.replace(v, '')
print(someStr)
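The same removal-list idea can likely be done in one pass with str.translate, which avoids calling replace once per character. This is a minimal sketch; DELETE_LETTERS and remove_ascii_letters are hypothetical names, and the sample string is only an illustration.

```python
import string

# Translation table that maps every ASCII letter to None, i.e. deletes it.
DELETE_LETTERS = str.maketrans("", "", string.ascii_letters)

def remove_ascii_letters(text):
    """Return text with all ASCII letters removed; other characters are kept."""
    return text.translate(DELETE_LETTERS)

# "\u263A" is WHITE SMILING FACE, written as an escape to keep it explicit.
result = remove_ascii_letters("wifj \u263A wifj 123")
print(result)
```

Note that only the listed letters are removed; spaces, digits and punctuation stay unless they are added to the table as well.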
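import emoji
text_with_emoji = " Hello, world! "
# Start from the original string; `text` was previously used before assignment
text = text_with_emoji
for char in text_with_emoji:
    # Drop every character that is not an emoji, so only emoji remain
    if not emoji.is_emoji(char):
        text = text.replace(char, '')
print(text)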