343

Here is my code:


import imaplib
from email.parser import HeaderParser

conn = imaplib.IMAP4_SSL('imap.gmail.com')
conn.login('example@gmail.com', 'password')
conn.select()
conn.search(None, 'ALL')
data = conn.fetch('1', '(BODY[HEADER])')
header_data = data[1][0][1].decode('utf-8')

At this point I get the error message:

AttributeError: 'str' object has no attribute 'decode'

Python 3 doesn't have str.decode() anymore, so how can I fix this?

miken32
  • @MartijnPieters I think the other version is better overall. I don't really get why both questions attracted an answer concerning PyJWT, though. That seems like it belongs on a separate question - one which might not be suitable for Stack Overflow, as it's essentially tech support for that library. – Karl Knechtel Dec 29 '22 at 23:35

15 Answers

292

You are trying to decode an object that is already decoded. You have a str, so there is no need to decode from UTF-8 anymore.

Simply drop the .decode('utf-8') part:

header_data = data[1][0][1]
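As a rough sketch of the next step, assuming header_data really is a str at this point: the HeaderParser imported in the question can parse it directly. The header text below is made up for illustration and simply stands in for data[1][0][1].

```python
from email.parser import HeaderParser

# Hypothetical header text standing in for data[1][0][1]; it is already a str.
header_data = (
    "From: alice@example.com\n"
    "Subject: Hello\n"
    "\n"
)

# parsestr() takes a str, so no .decode() call is needed beforehand.
msg = HeaderParser().parsestr(header_data)
subject = msg["Subject"]
sender = msg["From"]
print(subject, sender)
```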
miken32
Martijn Pieters
  • 10
    Is there a simple way to do this conditionally? (I only want to decode if the message is encoded.) – devinbost Mar 21 '18 at 20:56
  • 12
    @devinbost: in Python 3? Test for the object type or the `decode` attribute, *or* just catch the exception. `try: data = data.decode('...') except AttributeError: pass`. – Martijn Pieters Mar 22 '18 at 07:46
  • 3
    @devinbost: however, you are usually better off decoding closer to the source of your data, where you'll usually know exactly what you have. – Martijn Pieters Mar 22 '18 at 07:46
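A minimal sketch of the conditional decoding discussed in the comments above, using a type test rather than catching the exception. The helper name to_text is hypothetical and not part of any library.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding only when it is bytes-like."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    # Already a str (or some other type): hand it back untouched.
    return value


print(to_text(b"Subject: hi"))  # bytes input is decoded
print(to_text("Subject: hi"))   # str input passes through unchanged
```

As the last comment says, this kind of helper is a fallback; decoding once, close to where the data comes in, is usually cleaner.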
71

If you land here using jwt authentication after the PyJWT v2.0.0 release (22/12/2020), you might want to freeze your version of PyJWT to the previous release in your requirements.txt file.

PyJWT==1.7.1
Mathieu Rollet
  • 18
    GIVE THIS PERSON A MEDAL!!!! This was a dependency of our environment's `rest_framework_simplejwt` package and was causing the issue. – Dfranc3373 Jan 12 '21 at 16:23
  • 3
    Not a safe solution: see CVE-2022-29217 that affects PyJWT 1.x versions: https://github.com/jpadilla/pyjwt/security/advisories/GHSA-ffqj-6fqr-9h24 – Ville Laitila Jun 15 '22 at 06:01
55

Beginning with Python 3, all strings are Unicode objects.

  a = 'Happy New Year' # Python 3
  b = unicode('Happy New Year') # Python 2

The two statements above are equivalent. So I think you should remove the .decode('utf-8') part, because you already have a Unicode object.

Mathieu Rollet
Neo Ko
52

Use this method:

str.encode().decode()
Alireza
  • 3
    `bytearray(str, 'encoding').decode('another_encoding')` would do the job if you need to decode `idna` or any other encoding – Alex Jul 11 '17 at 10:41
  • 63
    This is useless. You are encoding to UTF-8, then decoding the resulting bytes as UTF-8, ending up where you started. You are keeping the CPU warm with no other benefit. – Martijn Pieters Feb 09 '18 at 10:10
  • 2
    @MartijnPieters "ending up where you started" - not if you have escape sequences in your string, for example: >>> '\u0159'.encode().decode() 'ř' – Peter Mar 21 '18 at 15:47
  • 5
    @Peter: no, you don't need encoding or decoding for that. `'\u0159'` prints the exact same output. You are confusing the string literal syntax with the canonical representation of the value. – Martijn Pieters Mar 21 '18 at 16:25
  • 3
    You can use the string directly. There is no need to encode and then decode again. – Aditya Jul 11 '18 at 05:32
  • At the time of writing, 43 people upvoted a comment saying that this answer post is useless, but only 10 people downvoted the answer post. – Flimm Sep 30 '21 at 07:10
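To make the point from the comments concrete, here is a small check (the variable names are just for illustration) showing that an escape sequence in a literal and the character typed directly produce the same str, and that the encode/decode round trip changes nothing.

```python
escaped = "\u0159"  # escape sequence in the source code literal
literal = "ř"       # the same character typed directly

# Both literals produce the same one-character string.
print(escaped == literal)

# Encoding to UTF-8 and decoding again gives back an equal string.
round_tripped = escaped.encode().decode()
print(round_tripped == escaped)

# repr() shows the canonical representation, which is not the escaped form.
print(repr(escaped))
```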
32

In Python 3, this mental model is pretty straightforward:

  • Encoding is the process of converting a str to a bytes object
  • Decoding is the process of converting a bytes object to a str
┏━━━━━━━┓                ┏━━━━━━━┓
┃       ┃ -> encoding -> ┃       ┃
┃  str  ┃                ┃ bytes ┃
┃       ┃ <- decoding <- ┃       ┃
┗━━━━━━━┛                ┗━━━━━━━┛

In your case, you are calling data.decode("UTF-8"), but the variable is already a str object, which means it is already decoded. So just use data directly if a string is what you need.
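The model above can be sketched in a few lines. The sample text is arbitrary; the point is only which type has which method.

```python
text = "Informaci\u00f3n"    # a str
raw = text.encode("utf-8")   # encoding: str -> bytes
back = raw.decode("utf-8")   # decoding: bytes -> str

print(type(raw).__name__, type(back).__name__)

# str has encode() but no decode(); bytes has decode() but no encode().
print(hasattr(text, "decode"))
print(hasattr(raw, "decode"))
```

Calling text.decode(...) here would raise the AttributeError from the question.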

Flimm
20

For Python 3:

html = """\\u003Cdiv id=\\u0022contenedor\\u0022\\u003E \\u003Ch2 class=\\u0022text-left m-b-2\\u0022\\u003EInformaci\\u00f3n del veh\\u00edculo de patente AA345AA\\u003C\\/h2\\u003E\\n\\n\\n\\n \\u003Cdiv class=\\u0022panel panel-default panel-disabled m-b-2\\u0022\\u003E\\n \\u003Cdiv class=\\u0022panel-body\\u0022\\u003E\\n \\u003Ch2 class=\\u0022table_title m-b-2\\u0022\\u003EInformaci\\u00f3n del Registro Automotor\\u003C\\/h2\\u003E\\n \\u003Cdiv class=\\u0022col-md-6\\u0022\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003ERegistro Seccional\\u003C\\/label\\u003E\\n \\u003Cp\\u003ESAN MIGUEL N\\u00b0 1\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EDirecci\\u00f3n\\u003C\\/label\\u003E\\n \\u003Cp\\u003EMAESTRO ANGEL D\\u0027ELIA 766\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EPiso\\u003C\\/label\\u003E\\n \\u003Cp\\u003EPB\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EDepartamento\\u003C\\/label\\u003E\\n \\u003Cp\\u003E-\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EC\\u00f3digo postal\\u003C\\/label\\u003E\\n \\u003Cp\\u003E1663\\u003C\\/p\\u003E\\n \\u003C\\/div\\u003E\\n \\u003Cdiv class=\\u0022col-md-6\\u0022\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003ELocalidad\\u003C\\/label\\u003E\\n \\u003Cp\\u003ESAN MIGUEL\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EProvincia\\u003C\\/label\\u003E\\n \\u003Cp\\u003EBUENOS AIRES\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003ETel\\u00e9fono\\u003C\\/label\\u003E\\n \\u003Cp\\u003E(11)46646647\\u003C\\/p\\u003E\\n \\u003Clabel class=\\u0022control-label\\u0022\\u003EHorario\\u003C\\/label\\u003E\\n \\u003Cp\\u003E08:30 a 12:30\\u003C\\/p\\u003E\\n \\u003C\\/div\\u003E\\n \\u003C\\/div\\u003E\\n\\u003C\\/div\\u003E \\n\\n\\u003Cp class=\\u0022text-center m-t-3 m-b-1 hidden-print\\u0022\\u003E\\n \\u003Ca 
href=\\u0022javascript:window.print();\\u0022 class=\\u0022btn btn-default\\u0022\\u003EImprim\\u00ed la consulta\\u003C\\/a\\u003E \\u0026nbsp; \\u0026nbsp;\\n \\u003Ca href=\\u0022\\u0022 class=\\u0022btn use-ajax btn-primary\\u0022\\u003EHacer otra consulta\\u003C\\/a\\u003E\\n\\u003C\\/p\\u003E\\n\\u003C\\/div\\u003E"""
print(html.replace("\\/", "/").encode().decode('unicode_escape'))
Rex5
krishna chandak
  • 3
    What has this got to do with the question? Can you explain what your answer is doing? – Flimm Sep 30 '21 at 07:12
17

I'm not familiar with the library, but if your problem is that you don't want a byte array, one easy way is to specify an encoding type straight in a cast:

>>> my_byte_str
b'Hello World'

>>> str(my_byte_str, 'utf-8')
'Hello World'
Broper
  • They don’t have a `bytes` object to begin with, and `str(bytes_object, codec)` is just an alternative spelling for `bytes_object.decode(codec)`. Both fail if you really have a `str` instead. – Martijn Pieters Feb 09 '18 at 10:12
  • 1
    You're right, this specific question does have a `str` already. This answer could still be useful to people in the future that may have byte arrays (this was the issue I faced when I originally stumbled upon this post). – Broper Feb 27 '18 at 17:06
  • I'm not sure how you stumbled on this post, however, because `my_byte_str.decode` exists and works, and will not throw the exception in the question. – Martijn Pieters Feb 28 '18 at 08:29
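A short sketch of what the comments describe: the two spellings give the same result on bytes, and the str(...) form also fails when it is handed a str, just with a different exception.

```python
my_byte_str = b"Hello World"

# Two spellings of the same operation on a bytes object.
via_str = str(my_byte_str, "utf-8")
via_decode = my_byte_str.decode("utf-8")
print(via_str == via_decode)

# Passing a str together with an encoding raises TypeError.
try:
    str("Hello World", "utf-8")
    failed = False
except TypeError as exc:
    failed = True
    print("TypeError:", exc)
```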
8

It's already decoded in Python 3. Try using it directly; it should work.

Aditya
5

This worked for me:

html.replace("\\/", "/").encode().decode('unicode_escape', 'surrogatepass')

This is similar to the behaviour of json.loads(html).
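A hedged comparison of the two approaches on a short, made-up sample. It assumes the escaped text is the content of a JSON string, so it is wrapped in double quotes before being passed to json.loads. It also shows one caveat of the encode().decode('unicode_escape') route: characters that are already non-ASCII get mangled, because the UTF-8 bytes are read back as Latin-1.

```python
import json

# Made-up sample: text with literal \uXXXX escapes and an escaped slash.
escaped = r"\u003Cp\u003EInformaci\u00f3n\u003C\/p\u003E"

# Approach from the answer: undo "\/" by hand, then decode the escapes.
via_codec = escaped.replace("\\/", "/").encode().decode("unicode_escape")

# Treating the text as the body of a JSON string handles both escapes.
via_json = json.loads('"' + escaped + '"')

print(via_codec)
print(via_json)

# Caveat: a character that is already non-ASCII does not survive the
# encode().decode('unicode_escape') round trip.
mangled = "é".encode().decode("unicode_escape")
print(repr(mangled))
```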

Duc Toan Pham
3

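Use the codecs module's open() to read the file:

import codecs
# file_name is the path of the file you want to read
with codecs.open(file_name, 'r', encoding='utf-8', errors='ignore') as fdata:
    text = fdata.read()  # text is a str; bytes that cannot be decoded are dropped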
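For comparison, a self-contained sketch using the built-in open(), which in Python 3 accepts the same encoding and errors arguments. The temporary file and its contents are made up so that the example can run on its own.

```python
import os
import tempfile

# Create a small temporary file containing one invalid UTF-8 byte (0xff).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"caf\xc3\xa9 \xff ok")
    file_name = tmp.name

# The built-in open() decodes while reading; errors="ignore" drops bad bytes.
with open(file_name, "r", encoding="utf-8", errors="ignore") as fdata:
    text = fdata.read()

os.remove(file_name)
print(repr(text))
```

Either way the result is already a str, so calling .decode() on it afterwards would raise the error from the question.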
Nikita Jain
3

If anyone is getting the same error while working on a logistic regression exercise on Kaggle, here is the solution:

logmodel = LogisticRegression(solver='liblinear')
tlentali
1

Other answers sort of hint at it, but the problem may arise from expecting a bytes object. In Python 3, decode is valid when you have an object of class bytes. Running encode before decode may "fix" the problem, but it is a useless pair of operations that suggests the problem is upstream.

demongolem
1

I got 'str' object has no attribute 'decode' while creating a JWT access_token using the Flask-JWT-Extended package.

To fix this issue, I upgraded my Flask-JWT-Extended package to Flask-JWT-Extended==4.1.0

For reference, see https://flask-jwt-extended.readthedocs.io/en/stable/

Anurag Dabas
1

First, install a suitable JWT package:

pip3 install PyJWT

Then, in your code:

token.encode().decode('UTF-8')

This worked for me, and I hope it helps you too.

0

My case may be a bit unusual, but I was working with Django, and my project ran locally but not once it was deployed. I seemed to be getting multiple dependency errors because I had generated the requirements file with pip freeze > requirements.txt. Using pip3 instead fixed the issue:

pip3 freeze > requirements.txt
Abrar Mahi