17

I'm porting some code from Python 2 to Python 3 and I get this error:

    <ipython-input-53-e9f33b00348a> in aesEncrypt(text, secKey)
         43 def aesEncrypt(text, secKey):
         44     pad = 16 - len(text) % 16
    ---> 45     text = text.encode("utf-8") + (pad * chr(pad)).encode("utf-8")
         46     encryptor = AES.new(secKey, 2, '0102030405060708')
         47     ciphertext = encryptor.encrypt(text)

    AttributeError: 'bytes' object has no attribute 'encode'

If I remove the trailing .encode("utf-8"), the error becomes "can't concat str to bytes". So pad * chr(pad) looks like a byte string that cannot be encoded, yet it cannot be concatenated to bytes either:

    <ipython-input-65-9e84e1f3dd26> in aesEncrypt(text, secKey)
         43 def aesEncrypt(text, secKey):
         44     pad = 16 - len(text) % 16
    ---> 45     text = text.encode("utf-8") + (pad * chr(pad))
         46     encryptor = AES.new(secKey, 2, '0102030405060708')
         47     ciphertext = encryptor.encrypt(text)

    TypeError: can't concat str to bytes

However, the weird thing is that if I run just that part on its own, encode() works fine:

    import json

    text = {'username': '', 'password': '', 'rememberLogin': 'true'}
    text = json.dumps(text)
    print(text)
    pad = 16 - len(text) % 16
    print(type(text))
    text = text + pad * chr(pad)
    print(type(pad * chr(pad)))
    print(type(text))
    text = text.encode("utf-8") + (pad * chr(pad)).encode("utf-8")
    print(type(text))

    {"username": "", "password": "", "rememberLogin": "true"}
    <class 'str'>
    <class 'str'>
    <class 'str'>
    <class 'bytes'>
Roy Dai
  • `chr` returns a string that gets multiplied and encoded as bytes. The problem here is that `text` is bytes. How are you calling `aesEncrypt`? You need to provide a [mre]. – wjandrea Feb 24 '20 at 02:57
  • Python2 strings are implicitly bytes objects while Python3 strings are unicode. So this makes sense. This looks like the right way to do it to me - the way you added .encode('utf-8'). But this won't be backward compatible with Python2 - is that what the issue is? @RoyDai – Todd Feb 24 '20 at 03:04
  • Maybe a portable way to control the codec is to import `codecs` and use its `encode()` and `decode()` module functions. – Todd Feb 24 '20 at 03:07
  • `pad = 16 - len("dummy") % 16; (pad * chr(pad)).encode('utf-8') b'\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b'` Works fine on Python3.. unless there were pad bytes it couldn't encode. – Todd Feb 24 '20 at 03:18
  • @Todd Thanks very much for the reply. I tried .encode('utf-8'): it works in an isolated session but not in the first case, and I don't know why. – Roy Dai Feb 24 '20 at 07:16
  • Put a print in it or trap it with the debugger and examine what value it isn't handling. @RoyDai – Todd Feb 24 '20 at 07:21
  • If you put my conversion function in your code, just temporarily to reproduce the error.. the exception message will tell you what type it's not able to handle. I'm guessing `None` might be being passed in. @RoyDai And maybe set your IDE (Eclipse or whatever) to break on exceptions.. so you can examine the stack. – Todd Feb 24 '20 at 07:39
  • Wait a minute - you're setting text to a dictionary. encode is for unicode strings - it won't work on dictionaries. – Todd Feb 24 '20 at 07:45
  • @Todd Thanks for the follow-up. Sorry for the confusion. I have posted another question here (https://stackoverflow.com/questions/60371553/typeerror-cant-concat-str-to-bytes-when-converting-python-2-to-3-with-encrypti). I realise it is more of a "cannot concat str to bytes" problem, as "pad*chr(pad)" seems to be a str rather than a byte string, or something like that. – Roy Dai Feb 24 '20 at 07:58
  • Have you read over my answer below about how strings are different on Py2 vs Py3? – Todd Feb 24 '20 at 08:01
  • @Todd Thanks for the answer. I read it and I kind of understand that there are these differences. Do you mean I should add a converter function, or is there a simpler way? – Roy Dai Feb 24 '20 at 08:10

3 Answers

7

If you don't know whether a string-like object is a Python 2 string (bytes) or a Python 3 string (unicode), you could use a generic converter.

Python3 shell:

    >>> import codecs, sys
    >>> def to_bytes(s):
    ...     if type(s) is bytes:
    ...         return s
    ...     elif type(s) is str or (sys.version_info[0] < 3 and type(s) is unicode):
    ...         return codecs.encode(s, 'utf-8')
    ...     else:
    ...         raise TypeError("Expected bytes or string, but got %s." % type(s))
    ...
    >>> to_bytes("hello")
    b'hello'
    >>> to_bytes("hello".encode('utf-8'))
    b'hello'

On Python 2 both of these expressions evaluate to True: `type("hello") == bytes` and `type("hello") == str`. And `type(u"hello") == str` evaluates to False, while `type(u"hello") == unicode` is True.

On Python 3 `type("hello") == bytes` is False, and `type("hello") == str` is True. And `type("hello") == unicode` raises a NameError exception, since `unicode` isn't defined on Python 3.
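The Python 3 behaviour described above can be checked with a short script. This is only an illustrative sketch; the variable names are made up for the example and do not come from the question's code.

```python
import builtins

text = "hello"                      # str on Python 3 (unicode text)
data = text.encode("utf-8")         # bytes

is_str = type(text) is str          # True on Python 3
is_bytes = type(text) is bytes      # False on Python 3

# The name `unicode` does not exist on Python 3, so referencing it
# directly would raise NameError. Check for it without triggering that.
has_unicode_name = hasattr(builtins, "unicode")

# bytes objects have decode() but no encode(), which is why calling
# text.encode(...) fails when text is already bytes.
bytes_has_encode = hasattr(data, "encode")
bytes_has_decode = hasattr(data, "decode")

print(is_str, is_bytes, has_unicode_name, bytes_has_encode, bytes_has_decode)
```

The missing `encode` attribute on bytes is exactly what the AttributeError in the question reports.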

Python 2 shell:

    >>> to_bytes(u"hello")
    'hello'
    >>> to_bytes("hello")
    'hello'
Todd
2

Thanks to @Todd, who solved the issue. `pad * chr(pad)` is a str, as expected; the problem lies in how aesEncrypt(text, secKey) is called. It was called twice: the first time with text as str, and the second time with text as bytes.

The solution is to make sure that the input text is of str type.
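A minimal sketch of that failure mode, assuming Python 3. `pad_and_encode` is a hypothetical stand-in for the first two lines of `aesEncrypt` with the encryption step left out, and the decode step assumes the bytes are valid UTF-8, which may not hold for real ciphertext.

```python
def pad_and_encode(text):
    """Hypothetical stand-in for the padding lines of aesEncrypt (no AES)."""
    pad = 16 - len(text) % 16
    return text.encode("utf-8") + (pad * chr(pad)).encode("utf-8")


first = pad_and_encode("hello")          # str in, bytes out

error_message = None
try:
    pad_and_encode(first)                # bytes in: .encode() does not exist
except AttributeError as exc:
    error_message = str(exc)

# Converting back to str before the second call avoids the error.
second = pad_and_encode(first.decode("utf-8"))

print(type(first), error_message, len(second))
```

If the output of the first call is not decodable text, the function itself would have to accept bytes instead.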

Roy Dai
0

The first parameter of AES.new must be bytes/bytearray/memoryview, and I assume that text is already of type bytes, so we just have to convert the padding from unicode to bytes:

    text = text + (pad * chr(pad)).encode("utf-8")

To be extra safe, you may encode text conditionally before concatenating with pad.

    if not isinstance(text, bytes):
        text = text.encode('utf-8')
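Putting the two pieces together, one possible helper is sketched below. `pad_to_block` is a hypothetical name, not part of the question's code. Note one deliberate difference from the original: the pad length is computed from the encoded byte length rather than from `len(text)` on the str, since the two differ for non-ASCII text.

```python
def pad_to_block(text, block_size=16):
    """Return text as bytes, padded to a multiple of block_size.

    Accepts str or bytes-like input; any other type raises TypeError.
    """
    if isinstance(text, str):
        data = text.encode("utf-8")
    elif isinstance(text, (bytes, bytearray)):
        data = bytes(text)
    else:
        raise TypeError("Expected str or bytes, but got %s." % type(text))

    # Each padding byte holds the pad length, as in the original code.
    pad = block_size - len(data) % block_size
    return data + bytes([pad]) * pad


print(pad_to_block("hello"))
print(pad_to_block(b"hello") == pad_to_block("hello"))
```

The explicit TypeError branch also covers the case raised in the comments, where the input might be None or some other unexpected type.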
Carson Ip
  • @Todd how about checking if it is not bytes? Kind of lazy, but would work given the input is either bytes or unicode – Carson Ip Feb 24 '20 at 04:19
  • you caught me as I just finished up dissecting this difference - testing if it's not bytes would leave the possibility it could be None or another type that isn't a str or unicode object. See if you can find the answer in my notes on my answer post. let me know if i accounted for what case you have in mind. – Todd Feb 24 '20 at 04:23