
I have a complex algorithm in which one module is written in Python and another in JavaScript.

At some point both modules calculate the length of a string.

The problem is that when the text contains emoji, Python and JS give different results.

Example:

  • '' Python len: 1 ; JS len: 2
  • '' Python len: 2 ; JS len: 4

The JS length is more convenient in my case. Is there a way to make Python 3 calculate an emoji's length the way JS does?

Jean DuPont
    JavaScript strings are encoded bytes, so you need to encode the Python string to the same encoding as used by JavaScript. _Probably_ 'utf-16-le', but see for example https://stackoverflow.com/questions/2219526/how-many-bytes-in-a-javascript-string – snakecharmerb Oct 17 '21 at 08:00

1 Answer


Here's a Python function that calculates the length of a string that may contain emoji. It returns the same result as the JS length property:

def jslen(string):
    # JS length counts UTF-16 code units; each unit is 2 bytes in UTF-16-LE.
    return len(string.encode('utf-16-le')) // 2
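As a quick check, here is a minimal sketch that compares Python's built-in len with the function above. The sample strings (U+1F600 and U+1F680) are my own assumption, since any character outside the Basic Multilingual Plane should behave the same way. The jslen_by_code_point helper is a hypothetical alternative, not part of the original answer: it counts two units for every code point above U+FFFF, which is how UTF-16 represents such characters (as a surrogate pair).

```python
def jslen(string):
    # Each UTF-16 code unit is 2 bytes in UTF-16-LE (no BOM is added).
    return len(string.encode('utf-16-le')) // 2


def jslen_by_code_point(string):
    # Hypothetical alternative: code points above U+FFFF take a surrogate
    # pair (2 code units) in UTF-16; everything else takes 1 code unit.
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in string)


# Assumed sample inputs: plain ASCII, one emoji, two emoji, an accented letter.
samples = ['abc', '\U0001F600', '\U0001F600\U0001F680', '\u00e9']

for s in samples:
    print(len(s), jslen(s), jslen_by_code_point(s))
# 3 3 3
# 1 2 2
# 2 4 4
# 1 1 1
```

Both helpers agree on these inputs. Note that emoji built from several code points (for example with skin-tone modifiers or zero-width joiners) are longer than 2 in JS as well, and these functions should reflect that because they count code units rather than visible characters.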

Thank you @snakecharmerb for your hint!

Jean DuPont