1

I need to remove all Unicode emojis from a QString, so I tried to write a regex:

    QRegularExpression uTF8Emojis("([\\xD83D][\\xDE00-\\xDFFF])+");

but that does not detect anything...

user1403333

2 Answers

2

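Since Qt 5's QRegularExpression is PCRE-powered, you can write whole code points with the \x{...} notation; there is no need to spell these emojis out as surrogate pairs. (Without braces, PCRE reads at most two hex digits after \x, so [\\xD83D] in the question is a class containing U+00D8, "3" and "D", not the high surrogate.) Use:

    "[\\x{1F600}-\\x{1F7FF}]+"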

You may use this online converter: paste \uD83D\uDE00-\uD83D\uDFFF into the JavaScript field, and click Convert to get the right codes in the U+hex field.
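
If no converter is at hand, the same numbers can be worked out with the standard UTF-16 surrogate formula. This is only a sketch: `decodeSurrogatePair` is a hypothetical helper name, and it assumes the two arguments really are a valid high/low surrogate pair (no validation is done).

```cpp
// Hypothetical helper: combine a UTF-16 surrogate pair into one code point.
// Assumes high is in 0xD800..0xDBFF and low is in 0xDC00..0xDFFF.
constexpr char32_t decodeSurrogatePair(char16_t high, char16_t low)
{
    return 0x10000
         + ((static_cast<char32_t>(high) - 0xD800) << 10)
         + (static_cast<char32_t>(low) - 0xDC00);
}

// The two ends of the range from the question:
static_assert(decodeSurrogatePair(0xD83D, 0xDE00) == 0x1F600, "range start");
static_assert(decodeSurrogatePair(0xD83D, 0xDFFF) == 0x1F7FF, "range end");
```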

Wiktor Stribiżew
  • 1
    Some of the astral characters are not covered by that range; I would recommend 10000-10FFFF, according to https://stackoverflow.com/questions/24672834/how-do-i-remove-emoji-from-string/24673322#24673322 – e.jahandar Jun 20 '17 at 05:56
  • 1
    @e.jahandar: Yes, if there can be no astral chars in the input that one wants to keep. Emojis are numerous, and there are more ranges to cover. I just converted the OP's regex to a form usable in Qt. – Wiktor Stribiżew Jun 20 '17 at 06:30
0

I use this pattern in my QRegularExpression: https://github.com/mathiasbynens/rgi-emoji-regex-pattern/blob/main/dist/emoji-14.0/java.txt. It works just fine, and the author regularly updates the regex with recent emojis.

  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - [From Review](/review/late-answers/31915585) – lemon Jun 03 '22 at 22:53