Should I use encoding declaration in Python 3?

Question

Python 3 uses UTF-8 encoding for source-code files by default. Should I still use the encoding declaration at the beginning of every source file? Like # -*- coding: utf-8 -*-

Martijn Pieters · Accepted Answer · 2018-07-03T11:22:15.150

147

Because the default is UTF-8, you only need to use that declaration when you deviate from the default, or if you rely on other tools (like your IDE or text editor) to make use of that information.

In other words, as far as Python is concerned, only when you want to use an encoding that differs do you have to use that declaration.

Other tools, such as your editor, can support similar syntax, which is why the PEP 263 specification allows for considerable flexibility in the syntax (it must be a comment, the text coding must be there, followed by either a : or = character and optional whitespace, followed by a recognised codec).
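
As a rough illustration of that flexibility, the sketch below uses the regular expression given in PEP 263 to pick the codec name out of a few comment styles. The helper name `declared_codec` and the sample lines are made up for this example. PEP 263 also requires the comment to be on the first or second line of the file, which this sketch does not check.

```python
import re

# Regular expression given in PEP 263 for recognising an encoding declaration.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_codec(line):
    """Return the codec name declared in a comment line, or None."""
    match = CODING_RE.match(line)
    return match.group(1) if match else None


samples = [
    "# -*- coding: utf-8 -*-",             # Emacs style
    "# vim: set fileencoding=latin-1 :",   # Vim style
    "# coding=cp1252",                     # bare form
    "coding: utf-8",                       # not a comment, so not a declaration
]
for line in samples:
    print(repr(line), "->", declared_codec(line))
```

The pattern only extracts the name; whether the name is a codec Python actually recognises is a separate check.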

Note that the declaration only applies to how Python reads the source code. It doesn't apply to executing that code, so it has no effect on how printing, opening files, or any other I/O operations translate between bytes and Unicode. For more details on Python, Unicode, and encodings, I strongly urge you to read the Python Unicode HOWTO, or the very thorough Pragmatic Unicode talk by Ned Batchelder.
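
To illustrate the distinction, here is a small sketch in which the I/O encoding is chosen explicitly at the call site, independent of whatever the source file declares. The file name `sample.txt` and the choice of Latin-1 are arbitrary assumptions for the example.

```python
import os
import tempfile

text = "caf\u00e9"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name

    # The encoding used for I/O is chosen here, not by the source declaration.
    with open(path, "w", encoding="latin-1") as f:
        f.write(text)

    # Read the raw bytes back to see what actually landed on disk.
    with open(path, "rb") as f:
        raw = f.read()

    # Reading with the same encoding recovers the original text.
    with open(path, encoding="latin-1") as f:
        round_trip = f.read()

print(raw)
print(round_trip == text)
```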

edited Jul 03 '18 at 11:22

answered Dec 29 '12 at 15:28

Martijn Pieters


35

The `# -*- coding: utf-8 -*-` may still be useful for some editors to switch to the expected encoding when editing the source file. – pepr Dec 30 '12 at 15:07
1

@pepr A Byte Order Mark could do the same, no? – endolith Jul 03 '17 at 15:01
21

@endolith: the UTF-8 BOM is an abomination on this earth brought forth by Microsoft. See https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8 – Martijn Pieters Jul 03 '17 at 15:17
2

@MartijnPieters Your link doesn't seem to agree with you – endolith Jul 03 '17 at 15:28
3

@endolith: no, the WP article only summarises the background, it is my own opinion that it is an abomination. The point of a BOM is to record the byte order (hence the name, Byte Order Mark). There is no byte order confusion in UTF-8, it only has that function in UTF-16 and UTF-32. The value is already a re-purposed zero-width no-break space character (handy, as accidental printing then ends up with entirely invisible output), re-using that to be a magic constant is wrong, in my view. – Martijn Pieters Jul 03 '17 at 15:32
@endolith: I agree that the UTF-8 BOM is a Microsoft wart. As the wiki page mentioned above also says, it has no meaning. BOM stands for Byte Order Mark. In UTF-8, there is no doubt about the byte order. And the UTF-8 BOM sometimes causes problems (try to concatenate text files, for example). It can also be read as _this is NOT UTF-16_. Anyway, it is completely unrelated to the `# -*- coding...`. The editor may know the `coding` prescription, and it can completely ignore the BOM. – pepr Jul 04 '17 at 08:29
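
For readers who want to see the concatenation problem mentioned in this comment, a minimal sketch using only the standard library follows. The two byte strings stand in for two hypothetical files saved with a UTF-8 BOM.

```python
import codecs

# Two hypothetical "files", each saved with a UTF-8 BOM at the start.
first = codecs.BOM_UTF8 + "a".encode("utf-8")
second = codecs.BOM_UTF8 + "b".encode("utf-8")

# Decoding a single file with utf-8-sig strips its leading BOM.
single = first.decode("utf-8-sig")

# After concatenation only the leading BOM is stripped; the second one
# survives in the middle of the text as U+FEFF.
joined = (first + second).decode("utf-8-sig")

print(ascii(single))
print(ascii(joined))
```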
Can you elaborate on the non-standard cases in which you need to use `# -*- coding: utf-8 -*-` in Python 3? – mrgloom Dec 18 '17 at 12:01
@mrgloom for Python, there are no non-standard cases. But if your editor is not using UTF-8 by default but it does support *modelines* (such as Vim or Emacs or various other code editors), then you can write your comment such that Python and your editor can both read it, so both use the same encoding when working with your source file. – Martijn Pieters Dec 18 '17 at 12:06
@mrgloom the specific example you used, with the `-*-` markers, is an [emacs modeline](http://www.gnu.org/software/emacs/manual/html_node/emacs/Specifying-File-Variables.html#Specifying-File-Variables). Emacs, reading that line, will set the file encoding to UTF-8, which is very helpful when editing. Vim uses a [different syntax](http://vim.wikia.com/wiki/Modeline_magic), but Python uses pattern matching to support either format as well as others. – Martijn Pieters Dec 18 '17 at 12:11
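
One way to check that both modeline styles are recognised is to ask the standard library's `tokenize.detect_encoding`, which detects a source file's encoding from a UTF-8 BOM or a PEP 263 declaration. This is only a sketch; the helper `source_encoding` and the sample sources are invented for illustration.

```python
import io
import tokenize


def source_encoding(source_bytes):
    """Return the encoding detected for the given Python source bytes."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines = tokenize.detect_encoding(readline)
    return encoding


emacs_style = b"# -*- coding: utf-8 -*-\nx = 1\n"
vim_style = b"# vim: set fileencoding=latin-1 :\nx = 1\n"
no_declaration = b"x = 1\n"

print(source_encoding(emacs_style))
print(source_encoding(vim_style))       # reported under a normalised codec name
print(source_encoding(no_declaration))  # falls back to the UTF-8 default
```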

score 16 · Answer 2 · answered Mar 08 '19 at 09:14

No, if:

the entire project uses only UTF-8, which is the default,
and you're sure your IDE doesn't need the encoding declaration in each file.

Yes, if:

your project relies on a different encoding,
or relies on several encodings.

For multi-encoding projects:

If some files are not encoded in UTF-8, then you should add the encoding declaration even to the files that are encoded in UTF-8, because the golden rule is "Explicit is better than implicit."
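
A minimal sketch of why the declaration matters in a mixed-encoding project: a Latin-1 source file can be read back correctly only when it declares its encoding. The file names and contents are hypothetical, and `tokenize.open` is used here simply because it opens a file with the encoding detected from the file itself.

```python
import os
import tempfile
import tokenize

# The same Latin-1 encoded line, with and without a declaration.
body = "name = 'caf\u00e9'\n".encode("latin-1")
declared = b"# -*- coding: latin-1 -*-\n" + body
undeclared = body

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "declared.py")    # hypothetical file names
    bad = os.path.join(tmp, "undeclared.py")
    for path, data in ((good, declared), (bad, undeclared)):
        with open(path, "wb") as f:
            f.write(data)

    # With the declaration, the file is decoded as Latin-1.
    with tokenize.open(good) as f:
        text = f.read()

    # Without it, the bytes are not valid UTF-8 and detection fails.
    try:
        tokenize.open(bad).close()
        failed = False
    except SyntaxError:
        failed = True

print("caf\u00e9" in text, failed)
```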

Reference:

PyCharm doesn't need that declaration:

configuring encoding for specific file in pycharm

Vim doesn't need that declaration either, but it supports its own modeline syntax for setting the file encoding:

# vim: set fileencoding=<encoding name> :
