Convert string with accent into numbers (RSA encryption)

Question

My math teacher asked us to program the RSA encryption/decryption process in Python. So I've written the following functions:

lettre_chiffre(T), which converts each character in the string into a number with the ord() function
chiffre_lettre(T), which does the opposite with chr()

These functions produce 4-digit blocks, and I need to encrypt with RSA using 5-digit blocks to prevent frequency analysis. The problem is that the ord function doesn't work well with French accents ("é", "à"...). I was therefore interested in using the bytearray method, but I have no idea how to use it.

How can I make this program work with accents? Encoding and decoding to bytes with bytearray is not working with "é" and "à", for example.

python 

def lettre_chiffre(T):
    # Encode each character as its code point, zero-padded to 4 digits.
    Message_chiffre = ''
    for lettre in T:
        if ord(lettre) < 10000:
            Message_chiffre += str(ord(lettre)).zfill(4)
        else:
            # Code points of 5 or more digits do not fit in a 4-digit block.
            print("erreur lettre : ", lettre)
    return Message_chiffre

def chiffre_lettre(T):
    # Decode a string of 4-digit blocks back into characters.
    A = str(T)
    Message_lettre = ''
    for i in range(len(A) // 4):
        nombre = A[4*i:4*i+4]
        Message_lettre += chr(int(nombre))
    return Message_lettre

It isn't, because the problem remains with that solution: there are accents ("é", "à"...). — JulesL, May 25 '19 at 16:59
So I can't encode the string "é" into bytes and then decode it? It won't give me "é"? — JulesL, May 25 '19 at 17:06
What is not working? Encoding, decoding, encryption or decryption? Can you provide a [mcve], please? — Thomas Weller, May 25 '19 at 17:11
`>>> test = bytearray("é", 'utf8')` then `>>> print(test)` gives `bytearray(b'\xc3\xa9')`, and when I decode this I don't get "é". — JulesL, May 25 '19 at 17:14
Accented characters can often be represented in two ways: as a single codepoint - "LATIN SMALL LETTER E WITH ACUTE" or as the "ascii" character and another character representing the accent - "LATIN SMALL LETTER E" + "COMBINING ACUTE ACCENT". The latter version doesn't work with `ord` because `ord` expects a single character, not two. You can use [unicodedata.normalize](https://docs.python.org/3.7/library/unicodedata.html#unicodedata.normalize) to convert from the two-char decomposed version to the single char composed version. — snakecharmerb, May 25 '19 at 17:20
@ThomasWeller - sure, I'm just explaining why `ord` isn't handling these characters. — snakecharmerb, May 25 '19 at 17:47
I understand the problem with the 2 characters for é. But as I need every character as an integer to use my RSA encryption/decryption function, the decoded message will be an int. How am I supposed to convert that number into a string like "é" without the chr() method, using string.decode()? — JulesL, May 25 '19 at 17:54
RSA does not need single characters as integers. It operates on numbers and a byte is a number. RSA does not care about the underlying encoding. Sorry, I'll leave this discussion, since I don't see this coming to an end. — Thomas Weller, May 25 '19 at 18:41
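
A minimal sketch of the normalization point raised in the comments above, assuming Python 3. It shows that a decomposed "é" is two code points until unicodedata.normalize with the "NFC" form composes it into one, after which ord() accepts it. The variable names are illustrative only.

```python
import unicodedata

# "e" followed by COMBINING ACUTE ACCENT: displays as "é" but is two code points.
decomposed = "e\u0301"

# NFC composes the pair into the single code point LATIN SMALL LETTER E WITH ACUTE.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed))  # 2
print(len(composed))    # 1
print(ord(composed))    # 233
```

Calling ord() on the decomposed string would raise a TypeError, because ord() only accepts a string of length 1.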

score 0 · Answer 1 · answered May 25 '19 at 19:36

Refer to this post: https://stackoverflow.com/a/2788599

What you need is (in Python 3, where byte strings take a b prefix):

>>> import unicodedata as ucd
>>> b'\xc3\xa9'.decode('utf8')
'é'
>>> u = b'\xc3\xa9'.decode('utf8')
>>> u
'é'
>>> ucd.name(u)
'LATIN SMALL LETTER E WITH ACUTE'
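
For the RSA use case in the question, here is a hedged sketch in Python 3 rather than a definitive implementation: encode the text to UTF-8 bytes, read those bytes as one integer with int.from_bytes, and reverse both steps after decryption. The helper names text_to_int and int_to_text are hypothetical, and the RSA step itself is left out. The integer must be smaller than the RSA modulus, so a long message would first have to be split into chunks.

```python
def text_to_int(text):
    """Encode text as UTF-8 and read the bytes as one big-endian integer."""
    data = text.encode("utf-8")
    return int.from_bytes(data, "big")


def int_to_text(number):
    """Reverse text_to_int: integer -> bytes -> UTF-8 text.

    Assumption: the original text did not start with a NUL character,
    since leading zero bytes are lost in the integer form.
    """
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("utf-8")


message = "é à"
number = text_to_int(message)       # this integer is what RSA would operate on
assert int_to_text(number) == message
```

Because the conversion works on bytes rather than on characters, accented letters need no special case: "é" simply becomes the two bytes 0xC3 0xA9.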

Huzefa Jambughoda

And the 'e'.encode('utf8') might also be useful to you. – Huzefa Jambughoda May 25 '19 at 19:37