!pip install emot
import re
from emot.emo_unicode import EMOTICONS_EMO
def convert_emoticons(text):
    for emot in EMOTICONS_EMO:
        text = re.sub(u'\('+emot+'\)', "_".join(EMOTICONS_EMO[emot].replace(",","").split()), text)
    return text

text = "Hello :-) :-)"
convert_emoticons(text)

I'm trying to run the above code in Google Colab, but it gives the following error: unbalanced parenthesis at position 4

My understanding from the re module documentation is that '\(' + any_expression + '\)' is the correct usage, but I still get the error. So I have tried replacing '\(' + emot + '\)' with:

  1. '(' + emot + ')', it gives the same error
  2. '[' + emot + ']', it gives the following output: Hello Happy_face_or_smiley-Happy_face_or_smiley Happy_face_or_smiley-Happy_face_or_smiley

The correct output should be Hello Happy_face_smiley Happy_face_smiley for text = "Hello :-) :-)"

Can someone help me fix the problem?

M J
  • Note that if you're writing in Python 3, the `u` prefix [is now redundant](https://stackoverflow.com/a/2464968/11659881). – Kraigolas Feb 01 '22 at 04:18
  • What is the point of the parentheses? – Tim Biegeleisen Feb 01 '22 at 04:19
  • Also, welcome to StackOverflow! Please see how to create a [mre] and then [edit] the question to include an example. It is not necessary to loop through "EMOTICONS_EMO", just pick an instance of `emot` that produces the error and use that explicitly so we can verify your error. – Kraigolas Feb 01 '22 at 04:20
  • Perhaps you need `text = "Hello (-: :-)"` :-) – Cary Swoveland Feb 01 '22 at 05:00
  • First off, it's better to use "raw" strings when defining REs `r"..."`. Those don't interpret escapes specially. – Keith Feb 01 '22 at 05:19

1 Answer


This is tricky with regex because you first need to escape the regex metacharacters contained in the emoticons, such as the parentheses in :) and :(. The unescaped ) inside the emoticon is why you get the unbalanced parenthesis error. So you'd need to do something like this first:

>>> print(re.sub(r'([()...])', r'%s\1' % '\\\\', ':)'))
:\)
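
If you do want to stay with regex, the standard library's `re.escape` handles every metacharacter for you, so the escaping doesn't have to be done by hand. A minimal sketch, assuming a small stand-in dictionary (`SAMPLE_EMOTICONS` and `convert_emoticons_re` are hypothetical names, not part of `emot`):

```python
import re

# Hypothetical stand-in for EMOTICONS_EMO; the real mapping is much larger.
SAMPLE_EMOTICONS = {":-)": "Happy face smiley", ":-(": "Frown sad"}

def pattern_compiles(pattern):
    """Return True if `pattern` is a valid regex, False if re rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

def convert_emoticons_re(text, mapping=SAMPLE_EMOTICONS):
    for emot, meaning in mapping.items():
        # re.escape makes the emoticon match literally, parentheses included.
        pattern = re.escape(emot)
        text = re.sub(pattern, "_".join(meaning.replace(",", "").split()), text)
    return text

# The hand-built pattern from the question is rejected by the re module...
print(pattern_compiles(r'\(' + ':-)' + r'\)'))
# ...while the escaped emoticon compiles and substitutes cleanly.
print(convert_emoticons_re("Hello :-) :-)"))
```

Note that the outer `\(` and `\)` are dropped here: they would only match if the emoticon were itself wrapped in literal parentheses in the text.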

But I'd suggest a straight string replacement, since you already have a mapping that you're iterating through. So we'd have:

from emot.emo_unicode import EMOTICONS_EMO
def convert_emoticons(text):
    for emot in EMOTICONS_EMO:
        text = text.replace(emot, EMOTICONS_EMO[emot].replace(" ","_"))
    return text


text = "Hello :-) :-)"
convert_emoticons(text)
# 'Hello Happy_face_smiley Happy_face_smiley'
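
One caveat with plain `str.replace`: if one emoticon is a prefix of another, the shorter one can be replaced first and leave a stray character behind. A possible workaround is to try the longest emoticons first. This is only a sketch under that assumption, using a hypothetical two-entry mapping (`SAMPLE` and `convert_longest_first` are illustrative names) rather than the real `EMOTICONS_EMO`:

```python
# Hypothetical stand-in mapping where one key is a prefix of the other.
SAMPLE = {":-)": "Happy face smiley", ":-))": "Very happy"}

def convert_longest_first(text, mapping=SAMPLE):
    # Sort keys by length, longest first, so ":-))" is handled before ":-)".
    for emot in sorted(mapping, key=len, reverse=True):
        text = text.replace(emot, mapping[emot].replace(" ", "_"))
    return text

print(convert_longest_first("Great :-)) ok :-)"))
```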
David542