I came across the following header format for Python source files in a document about Python coding guidelines:

#!/usr/bin/env python

"""Foobar.py: Description of what foobar does."""

__author__      = "Barack Obama"
__copyright__   = "Copyright 2009, Planet Earth"

Is this the standard format for headers in the Python world? What other fields or information can I put in the header? Python gurus, please share your guidelines for good Python source headers :-)

Martin Thoma
Ashwin Nanjappa
  • Here's a good place to start: [PEP 257](http://www.python.org/dev/peps/pep-0257/), which talks about Docstrings, and links to several other relevant documents. – Peter Oct 06 '09 at 03:25
  • haha great @JonathanHartley ! For my own projects, as you put it "I indulge my OCD fetish." hahaaha https://stackoverflow.com/a/51914806/1896134 – JayRizzo Sep 03 '18 at 22:00
  • Obsessive–compulsive disorder (OCD) – Timo Nov 24 '20 at 17:55

5 Answers

It's all metadata for the Foobar module.

The first one is the module's docstring, which is already explained in Peter's answer.

How do I organize my modules (source files)? (Archive)

The first line of each file should be #!/usr/bin/env python. This makes it possible to run the file as a script invoking the interpreter implicitly, e.g. in a CGI context.

Next should be the docstring with a description. If the description is long, the first line should be a short summary that makes sense on its own, separated from the rest by a newline.

All code, including import statements, should follow the docstring. Otherwise, the docstring will not be recognized by the interpreter, and you will not have access to it in interactive sessions (i.e. through obj.__doc__) or when generating documentation with automated tools.
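The ordering rule can be checked with a small sketch that is not part of the quoted guidelines. It uses the standard-library `ast` module to ask whether a piece of source has a module docstring; the helper name `module_docstring` and the two sample sources are made up for illustration.

```python
import ast


def module_docstring(source):
    """Return the module docstring found in `source`, or None if there is none."""
    return ast.get_docstring(ast.parse(source))


# The string is the first statement (comments such as the shebang do not count),
# so it is recognized as the module docstring.
GOOD = '#!/usr/bin/env python\n"""Foobar: description of what foobar does."""\nimport os\n'

# An import comes first, so the string is only an expression statement.
BAD = 'import os\n"""Foobar: description of what foobar does."""\n'

print(module_docstring(GOOD))
print(module_docstring(BAD))
```

In the second case the module's `__doc__` would be `None`, which is why tools such as `help()` would show no description.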

Import built-in modules first, followed by third-party modules, followed by any changes to the path and your own modules. Especially, additions to the path and names of your modules are likely to change rapidly: keeping them in one place makes them easier to find.

Next should be authorship information. This information should follow this format:

__author__ = "Rob Knight, Gavin Huttley, and Peter Maxwell"
__copyright__ = "Copyright 2007, The Cogent Project"
__credits__ = ["Rob Knight", "Peter Maxwell", "Gavin Huttley",
                    "Matthew Wakefield"]
__license__ = "GPL"
__version__ = "1.0.1"
__maintainer__ = "Rob Knight"
__email__ = "rob@spot.colorado.edu"
__status__ = "Production"

Status should typically be one of "Prototype", "Development", or "Production". __maintainer__ should be the person who will fix bugs and make improvements if imported. __credits__ differs from __author__ in that __credits__ includes people who reported bug fixes, made suggestions, etc. but did not actually write the code.

Here you have more information, listing __author__, __authors__, __contact__, __copyright__, __license__, __deprecated__, __date__ and __version__ as recognized metadata.
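These names are ordinary module-level variables, so a tool can read them either from the imported module or straight from the source. The following is a rough sketch of the second approach, assuming the values are plain literals; `read_metadata` and the `HEADER` sample are hypothetical names used only for this illustration.

```python
import ast

# Sample header text, shortened from the example above.
HEADER = '''\
"""Foobar.py: Description of what foobar does."""

__author__ = "Rob Knight"
__version__ = "1.0.1"
__status__ = "Production"
'''


def read_metadata(source):
    """Collect top-level `__name__ = <literal>` assignments without importing."""
    metadata = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            is_dunder = (
                isinstance(target, ast.Name)
                and target.id.startswith("__")
                and target.id.endswith("__")
            )
            if not is_dunder:
                continue
            try:
                metadata[target.id] = ast.literal_eval(node.value)
            except ValueError:
                pass  # the value is not a plain literal, so skip it
    return metadata


print(read_metadata(HEADER))
```

Parsing instead of importing avoids running the module's code, which matters when the file has side effects at import time.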

Community
Esteban Küber
  • Can the creation of the header info somehow be automated for new files? – Hauke May 04 '11 at 07:51
  • I think all of this metadata after the imports is a bad idea. The parts of this metadata that apply to a single file (eg author, date) are already tracked by source control. Putting an erroneous & out of date copy of the same info in the file itself seems wrong to me. The parts that apply to the whole project (eg licence, versioning) seem better located at a project level, in a file of their own, rather than in every source code file. – Jonathan Hartley Feb 10 '12 at 09:17
  • Agree totally with Jonathan Hartley. The next person to inherit the code has three choices: 1) update it all every time he/she edits the code 2) leave it alone, in which case it will be inaccurate 3) delete it all. Option 1 is a waste of their time, especially since they have absolutely zero confidence that the metadata was up to date when they received it. Options 2 and 3 mean that your time in putting it there in the first place was wasted. Solution: save everybody's time and don't put it there. – spookylukey Feb 10 '12 at 15:24
  • I've also seen uses of `__usage__` and `__epilog__` on modules that use `argparse`/`optparse` to show command-line help. – Paulo Freitas Jul 27 '12 at 18:45
  • There's no reason for most Python files to have a shebang line. – Mike Graham Feb 28 '13 at 13:36
  • "Any changes to the path" is configuration, not code, and does not belong in a normal source code file. If it did go there (which it shouldn't), it should be very high and very visible so that people can see the wackiness that's going on. – Mike Graham Feb 28 '13 at 13:38
  • Per PEP 8, `__version__` needs to be directly following the main docstring, with a blank line before and after. Also, it is best practice to define your charset immediately under the shebang - `# -*- coding: utf-8 -*-` – Dave Lasley Mar 22 '14 at 01:00
  • @DaveLasley Unless you're using Python 3, which defaults to UTF-8 anyway. – alexia Apr 07 '15 at 10:29
  • You should avoid messing around with `sys.path` at all. If you have to do that, you're doing something wrong. – alexia Apr 07 '15 at 10:30
  • How can I indicate that __license__ is noncommercial? – lmiguelvargasf Jun 20 '15 at 20:20
  • @lmiguelvargasf put the license in a file called `LICENSE` in the root directory, and maybe put a `__license__ = 'MIT'` or whatever along with version in an `__init__.py` file. Replication is repeating yourself, error prone, and you can't expect every contributor to maintain the diligent structure this answer suggests. – Permafacture Jul 17 '15 at 03:38
  • The Debian docs recommend `#!/usr/bin/python` for the shebang/interpreter directive. Or whatever interpreter is used locally, based on what `which python` returns (or python3). – noobninja Jan 20 '17 at 18:10
  • @dave-lasley I can't see [PEP 8](https://www.python.org/dev/peps/pep-0008/) mentioning any certain order of _dunder names_ (e.g. `__version__`). Has this been revised? – antiplex Jul 26 '18 at 09:59
  • These comments are valid for modules made up of multiple scripts. Emailing a standalone `.py` file to an external collaborator is an example use case IMHO. – Aaron Apr 24 '19 at 09:12
  • Someone should mention that `__future__` imports should come before other built-ins. – michen00 Nov 23 '22 at 01:44

I strongly favour minimal file headers, by which I mean just:

#!/usr/bin/env python # [1]
"""\
This script foos the given bars [2]

Usage: myscript.py BAR1 BAR2
"""
import os   # standard library, [3]
import sys

import requests  # 3rd party packages

import mypackage  # local source
  • [1] Include the hashbang if, and only if, this file is meant to be executed directly, i.e. run as myscript.py or myscript or maybe even python myscript.py. (The hashbang isn't used in the last case, but providing it gives users the choice of executing it either way.) The hashbang should not be included if the file is a module, intended just to be imported by other Python files.
  • [2] Module docstring
  • [3] Imports, grouped in the standard way, i.e. three groups of imports with a single blank line between them. Within each group, imports are sorted. The final group, imports from local source, can use either absolute imports as shown, or explicit relative imports.

Everything else is a waste of time - both for the author and for subsequent maintainers. It wastes the precious visual space at the top of the file with information that is better tracked elsewhere, and is easy to get out of date and become actively misleading.

If you have legal disclaimers or licensing info, it goes into a separate file. It does not need to infect every source code file. Your copyright should be part of this. People should be able to find it in your LICENSE file, not random source code.

Metadata such as authorship and dates is already maintained by your source control. There is no need to add a less-detailed, erroneous, and out-of-date version of the same info in the file itself.

I don't believe there is any other data that everyone needs to put into all their source files. You may have some particular requirement to do so, but such things apply, by definition, only to you. They have no place in “general headers recommended for everyone”.

Jonathan Hartley
  • Couldn't agree more - it's a sin to replicate code in multiple places, so why do the same for header information? Put it in a single place (project root) and avoid the hassle of maintaining such information across many, many files. – Graeme Oct 07 '12 at 17:28
  • While I agree that source control tends to provide more valid authorship information, sometimes authors only distribute the source without giving access to the repository, or maybe that's just the way the distribution works, e.g. centralized installation from pypi. Thus, embedding authorship information as a module header is still beneficial. – pram Apr 03 '13 at 10:01
  • Hey Pram. I'm having trouble envisaging a use case where that's actually useful. I can imagine someone wanting to know authorship information for the project as a whole, and they can get value from a list of major contributors in a single central place, maybe the project's README or docs. But who would (a) want to know the authorship of *individual files*, and (b) wouldn't have access to the source repo, and (c) wouldn't care that there's never a way to tell if the information was incorrect or out of date? – Jonathan Hartley Apr 04 '13 at 14:51
  • Many licenses require you to include the license boilerplate in each file for a very good reason. If someone takes a file or two and redistributes them without the license, the people receiving it have no idea what license it's under, and will have to trace it down (assuming they're in good faith, that is). – alexia Apr 07 '15 at 10:26
  • [Read this](http://stackoverflow.com/a/6878709/492203), much of it can also be applied to non-GPL licenses. – alexia Apr 07 '15 at 10:33
  • Thanks for the link @nyuszika7h. I don't agree with the reasons given there much, but each to their own. – Jonathan Hartley Apr 08 '15 at 19:24
  • Lots of modules (scipy, numpy, matplotlib) have `__version__` metadata, though, and I think that's good to have, since it should be accessible to programs and to check quickly in the interactive interpreter. Authorship and legal information belongs in a different file, though. Unless you have a use case for `if 'Rob' in __author__:` – endolith Aug 20 '15 at 14:50
  • @endolith A `__version__` attribute is a great idea: it's read by pydoc, etc. However, such a tag should exist in one place in your project: namely the top level package's `__init__.py`. It should not be added to every file. – Jonathan Hartley Aug 20 '15 at 20:16
  • @endolith Authorship doesn't belong in a separate file: It already exists in source code control. Adding a manually-maintained (i.e. wrong and broken and out of date) second copy of this information is not useful. – Jonathan Hartley Aug 20 '15 at 20:17
  • I generally favour DRY quite strongly, but developers generally aren't good at keeping docs up-to-date unless they're in front of their noses, and as others have noted, there are legal reasons for repeating copyright notices. (Lawyers love promoting paper use, and they probably think source code gets printed a lot.) – Michael Scheper Nov 16 '15 at 02:58
  • I do not find any answer from a user with the name *voyager*. Has this answer been deleted, has the username changed, or is there another reason why I can't find it? – gerrit Feb 28 '19 at 09:52
  • @gerrit Thanks, you are right. I don't know where it went. I'll try to reproduce what he said in my answer instead. – Jonathan Hartley Feb 28 '19 at 14:53

The answers above are really complete, but if you want a quick and dirty header to copy and paste, use this:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""Module documentation goes here
   and here
   and ...
"""

Why this is a good one:

  • The first line is for *nix users. It picks the Python interpreter found first on the user's path, so it automatically selects the user's preferred interpreter.
  • The second one is the file encoding. Nowadays every file must have an encoding associated. UTF-8 will work everywhere. Only legacy projects would use another encoding.
  • And a very simple docstring. It can span multiple lines.

See also: https://www.python.org/dev/peps/pep-0263/

If you just write a class in each file, you don't even need the documentation (it would go inside the class doc).

aronadaal
neves
  • > "Nowadays every file must have a encoding associated." This seems misleading. utf8 is the default encoding, so it's perfectly fine to not specify it. – Jonathan Hartley Feb 28 '19 at 14:52
  • @JonathanHartley in Python 2 it wasn't the default. I like to put it since "explicit is better than implicit". – neves Jun 15 '20 at 22:31
  • I agree that makes sense if you use any Python 2. For Python3, personally I'm happy to rely on implicit when the default is sensible and universal. We don't explicitly define the meaning of "+" whenever we use it. – Jonathan Hartley Jun 16 '20 at 14:48

Also see PEP 263 if you are using a non-ASCII character set.

Abstract

This PEP proposes to introduce a syntax to declare the encoding of a Python source file. The encoding information is then used by the Python parser to interpret the file using the given encoding. Most notably this enhances the interpretation of Unicode literals in the source code and makes it possible to write Unicode literals using e.g. UTF-8 directly in an Unicode aware editor.

Problem

In Python 2.1, Unicode literals can only be written using the Latin-1 based encoding "unicode-escape". This makes the programming environment rather unfriendly to Python users who live and work in non-Latin-1 locales such as many of the Asian countries. Programmers can write their 8-bit strings using the favorite encoding, but are bound to the "unicode-escape" encoding for Unicode literals.

Proposed Solution

I propose to make the Python source code encoding both visible and changeable on a per-source file basis by using a special comment at the top of the file to declare the encoding.

To make Python aware of this encoding declaration a number of concept changes are necessary with respect to the handling of Python source code data.

Defining the Encoding

Python will default to ASCII as standard encoding if no other encoding hints are given.

To define a source code encoding, a magic comment must be placed into the source files either as first or second line in the file, such as:

      # coding=<encoding name>

or (using formats recognized by popular editors)

      #!/usr/bin/python
      # -*- coding: <encoding name> -*-

or

      #!/usr/bin/python
      # vim: set fileencoding=<encoding name> :

...
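To see the declaration at work, the standard-library function `tokenize.detect_encoding` reports which encoding Python would use for a source file. The sketch below feeds it two in-memory samples; `declared_encoding` and the sample byte strings are illustrative names only. Note that the ASCII default described in the PEP applies to Python 2, while Python 3 falls back to UTF-8 when no declaration is present.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python 3 would use to decode this source file."""
    encoding, _lines_read = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Magic comment on the second line, after the shebang.
WITH_COOKIE = b"#!/usr/bin/python\n# -*- coding: cp1252 -*-\nprint('hi')\n"

# No magic comment at all.
NO_COOKIE = b"#!/usr/bin/python\nprint('hi')\n"

print(declared_encoding(WITH_COOKIE))
print(declared_encoding(NO_COOKIE))
```

Only the first two lines are inspected, which matches the PEP's rule that the comment must be on the first or second line.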

John La Rooy

In some projects, I put this on the first line for Linux machines:

# -*- coding: utf-8 -*-

For documentation and author credit, I like a simple multi-line string. Here is an example from Example Google Style Python Docstrings:

# -*- coding: utf-8 -*-
"""Example Google style docstrings.

This module demonstrates documentation as specified by the `Google Python
Style Guide`_. Docstrings may extend over multiple lines. Sections are created
with a section header and a colon followed by a block of indented text.

Example:
    Examples can be given using either the ``Example`` or ``Examples``
    sections. Sections support any reStructuredText formatting, including
    literal blocks::

        $ python example_google.py

Section breaks are created by resuming unindented text. Section breaks
are also implicitly created anytime a new section starts.

Attributes:
    module_level_variable1 (int): Module level variables may be documented in
        either the ``Attributes`` section of the module docstring, or in an
        inline docstring immediately following the variable.

        Either form is acceptable, but the two should not be mixed. Choose
        one convention to document module level variables and be consistent
        with it.

Todo:
    * For module TODOs
    * You have to also use ``sphinx.ext.todo`` extension

.. _Google Python Style Guide:
   http://google.github.io/styleguide/pyguide.html

"""

It can also be nice to add:

        """
        @Author: ...
        @Date: ....
        @Credit: ...
        @Links: ...
        """

Additional Formats

  • Meta-information markup | devguide

    """

          :mod:`parrot` -- Dead parrot access
          ===================================
    
          .. module:: parrot
             :platform: Unix, Windows
             :synopsis: Analyze and reanimate dead parrots.
          .. moduleauthor:: Eric Cleese <eric@python.invalid>
          .. moduleauthor:: John Idle <john@python.invalid>
      """
    
  • /common-header-python

          #!/usr/bin/env python3  Line 1
          # -*- coding: utf-8 -*- Line 2
          #----------------------------------------------------------------------------
          # Created By  : name_of_the_creator   Line 3
          # Created Date: date/month/time ..etc
          # version ='1.0'
          # ---------------------------------------------------------------------------
    

As in other answers, I also include:

__author__ = "Rob Knight, Gavin Huttley, and Peter Maxwell"
__copyright__ = "Copyright 2007, The Cogent Project"
__credits__ = ["Rob Knight", "Peter Maxwell", "Gavin Huttley",
                    "Matthew Wakefield"]
__license__ = "GPL"
__version__ = "1.0.1"
__maintainer__ = "Rob Knight"
__email__ = "rob@spot.colorado.edu"
__status__ = "Production"
Federico Baù