101

I understand that ".pyc" files are compiled versions of the plain-text ".py" files, created at runtime to make programs run faster. However, I have observed a few things:

  1. Upon modification of ".py" files, program behavior changes. This indicates that the ".py" files are recompiled, or at least go through some sort of hashing process or timestamp comparison to tell whether or not they should be recompiled.
  2. Upon deleting all ".pyc" files (rm *.pyc), program behavior sometimes changes, which would indicate that they are not always being recompiled when the ".py" files are updated.

Questions:

  • How do they decide when to be compiled?
  • Is there a way to ensure that they have stricter checking during development?
Martijn Pieters
Aaron Schif
  • 16
    Beware of deleting .pyc files with `rm *.pyc`. This will not delete .pyc files in nested folders. Use `find . -name '*.pyc' -delete` instead – Zags Oct 07 '14 at 19:53
  • 6
    Perhaps one note on your question: A program doesn't run any faster when it is read from a ‘.pyc’ or ‘.pyo’ file than when it is read from a ‘.py’ file; the only thing that's faster about ‘.pyc’ or ‘.pyo’ files is the speed with which they are loaded. [link](http://www.network-theory.co.uk/docs/pytut/CompiledPythonfiles.html) – maggie Oct 07 '15 at 06:24
  • @maggie what's the difference between loading and execution time? –  Dec 06 '16 at 18:50
  • 3
    @Dani loading is the time it takes to read and then compile the program. Execution time is when the program is actually being run which happens after loading. If you want to be technical, the time types are load time, compile time, link time, and execution time. Making a .pyc eliminates the compile time part. – Eric Klien May 23 '17 at 03:49
  • @EricKlien thanks man –  May 23 '17 at 04:03

2 Answers

91

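The .pyc files are created (and possibly overwritten) when that Python file is imported by some other script; a file run directly as the main script is not cached. When the import is called, Python compares the source modification time recorded inside the .pyc file's header with the current modification time of the corresponding .py file. If the two match, it loads the .pyc; if they differ, or if the .pyc does not yet exist, Python compiles the .py file into a new .pyc and loads that. The timestamp of the .pyc file itself on disk plays no part in the check.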
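As a rough illustration of that check, here is a minimal sketch that reads a .pyc header and compares it with the source file. It assumes the CPython 3.7+ header layout (magic number, flags, source mtime, source size, 4 bytes each); the function name `pyc_is_fresh` is made up for this example and is not part of any library. Hash-based .pyc files (PEP 552) are deliberately not handled.

```python
import importlib.util
import os
import struct


def pyc_is_fresh(source_path, pyc_path=None):
    """Return True if the cached .pyc would pass a timestamp check.

    Assumes the CPython 3.7+ header: magic (4 bytes), flags (4 bytes),
    source mtime (4 bytes), source size (4 bytes), all little-endian.
    """
    if pyc_path is None:
        # Default location, e.g. __pycache__/mod.cpython-XY.pyc
        pyc_path = importlib.util.cache_from_source(source_path)
    try:
        with open(pyc_path, "rb") as f:
            header = f.read(16)
    except FileNotFoundError:
        return False
    if len(header) < 16 or header[:4] != importlib.util.MAGIC_NUMBER:
        # Truncated file, or bytecode written by a different Python version
        return False
    flags, recorded_mtime, recorded_size = struct.unpack("<III", header[4:16])
    if flags & 0x1:
        # Hash-based .pyc: validated by source hash, not covered by this sketch
        raise NotImplementedError("hash-based .pyc files are not handled")
    st = os.stat(source_path)
    return (recorded_mtime == (int(st.st_mtime) & 0xFFFFFFFF)
            and recorded_size == (st.st_size & 0xFFFFFFFF))
```

Note that the comparison is for equality, so a source file whose modification time moves backwards (for example after a checkout or a copy that preserves timestamps) is treated as changed as well.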

What do you mean by "stricter checking"?

David Resnick
DaveTheScientist
  • 3
    I am able to fix problems with `rm *.pyc`. I know that if I force all the files to be recreated then some issues are fixed, indicating that the files are not being re-compiled by themselves. I suppose that if they do use the timestamps then there is no way to make this behavior stricter, but the problem still persists. – Aaron Schif Apr 05 '13 at 17:34
  • 17
    This is not quite correct. The timestamps don't need to match (and they usually don't). The `.pyc`'s timestamp must be *older* than the corresponding `.py`'s timestamp to trigger a recompilation. – Tim Pietzcker Apr 05 '13 at 17:35
  • @TimPietzcker Ah, of course that makes sense. Good to know. And OP, there are also other kinds of files that can be imported by Python. A `.so`, for example, is a compiled C extension that can be called by Python as if it were a `.pyc` file. Beyond that I'd need more details about your problem to offer anything else. – DaveTheScientist Apr 05 '13 at 17:38
  • 5
    @Aaron, Are you possibly changing the .py files, and in the process making them older (e.g. by copying them in from another dir, using an operation which preserves 'modification time')? – greggo Apr 05 '13 at 18:07
  • 1
    @greggo, I'm using git and updating from a repository, so yes in a way I am. That could do it. Thanks. – Aaron Schif Apr 05 '13 at 18:47
  • 1
    *Good to know.* How about correcting your answer then? – Piotr Dobrogost Sep 06 '17 at 12:33
  • 1
    I don't think @TimPietzcker's comment is correct. Here's the relevant source for Python 3.6: https://github.com/python/cpython/blob/4134f154ae2f621f25c5d698cc0f1748035a1b88/Lib/importlib/_bootstrap_external.py#L470. The last-modified timestamp on the `.py` file has to _exactly_ match the timestamp that's embedded in the header of the `.pyc` file. (The timestamp of the `.pyc` file itself is irrelevant, and more recent versions of Python don't use timestamps at all in the `.pyc` header.) – Mark Dickinson Mar 17 '21 at 12:09
35

.pyc files are generated whenever the corresponding code elements are imported, and updated if the corresponding code files have been updated. If the .pyc files are deleted, they will be automatically regenerated. However, they are not automatically deleted when the corresponding code files are deleted.

This can cause some really fun bugs during file-level refactors.

First of all, you can end up pushing code that only works on your machine and on no one else's. If you have dangling references to files you deleted, these will still work locally if you don't manually delete the relevant .pyc files, because .pyc files can be used in imports. This is compounded by the fact that a properly configured version control system will only push .py files to the central repository, not .pyc files, meaning that your code can pass the "import test" (does everything import okay) just fine and still not work on anyone else's computer.

Second, you can have some pretty terrible bugs if you turn packages into modules. When you convert a package (a folder with an `__init__.py` file) into a module (a `.py` file), the `.pyc` files that once represented that package remain. In particular, the `__init__.pyc` remains. So, if you have the package `foo` with some code that doesn't matter, then later delete that package and create a file `foo.py` with some function `def bar(): pass` and run:

from foo import bar

you get:

ImportError: cannot import name bar

because python is still using the old .pyc files from the foo package, none of which define bar. This can be especially problematic on a web server, where totally functioning code can break because of .pyc files.
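One way to spot such leftovers is to look for .pyc files that have no .py file beside them. The following is only a sketch: `find_orphaned_pyc` is a hypothetical helper name, and it assumes the legacy layout where the .pyc sits next to its source. Files under `__pycache__` are skipped, since Python 3.2+ ignores those when the source is missing.

```python
from pathlib import Path


def find_orphaned_pyc(root):
    """Return legacy-layout .pyc files under root with no .py beside them."""
    orphans = []
    for pyc in sorted(Path(root).rglob("*.pyc")):
        if pyc.parent.name == "__pycache__":
            # Python 3.2+ cache directory: not importable without the source
            continue
        if not pyc.with_suffix(".py").exists():
            orphans.append(pyc)
    return orphans
```

Running it on a checkout after a `git pull` and printing the result gives a list of files that can still be imported even though their source is gone.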

For both of these reasons (and possibly others), your deployment code and testing code should delete .pyc files, such as with the following line of bash:

find . -name '*.pyc' -delete
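If the cleanup has to run somewhere without `find` (for example on Windows, or from inside a Python deployment script), roughly the same thing can be sketched with `pathlib`. The function name `delete_pyc` is made up for this example.

```python
from pathlib import Path


def delete_pyc(root="."):
    """Recursively delete every .pyc file under root; return how many were removed."""
    # Collect the matches first so the tree is not modified while being walked
    targets = list(Path(root).rglob("*.pyc"))
    for pyc in targets:
        pyc.unlink()
    return len(targets)
```

Like the `find` command, this also removes files inside `__pycache__` directories, which Python will simply regenerate on the next import.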

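Also, as of Python 2.6, you can run python with the -B flag (or set the PYTHONDONTWRITEBYTECODE environment variable) so that it does not write .pyc files on import. Note that this only stops new .pyc files from being written; any .pyc files that already exist are still used, so they still need to be deleted. See How to avoid .pyc files? for more details.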
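The same switch is available from inside a running program as `sys.dont_write_bytecode`. Below is a small sketch of using it around a single import; the helper name `import_without_cache` and its arguments are assumptions made for illustration only.

```python
import importlib
import sys


def import_without_cache(module_name, directory):
    """Import module_name from directory without writing a .pyc file."""
    previous = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # same effect as the -B flag
    sys.path.insert(0, str(directory))
    try:
        return importlib.import_module(module_name)
    finally:
        # Restore the interpreter state we changed
        sys.path.remove(str(directory))
        sys.dont_write_bytecode = previous
```

As with -B, this does not stop Python from loading a .pyc that is already on disk.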

See also: How do I remove all .pyc files from a project?

Community
Zags
  • "When you convert a module (a folder with an `__init__.py` file)...". That would be a package, not a module. – bgrant Sep 01 '15 at 18:11
  • 2
    *In particular, the `__init__.pyc` remains.* – How come? As a package is a directory deleting a package means deleting directory thus there are no files left… – Piotr Dobrogost Sep 06 '17 at 12:30
  • 3
    @PiotrDobrogost Properly managed source control involves not checking your pyc files into source. So while you may delete the folder, including pyc files, in your local copy, it will not be deleted for someone else who does a git pull. This can crash your server if your deployment involves a git pull as well. – Zags Sep 06 '17 at 15:11
  • There are many reasons to not trust your dev environment to be representative of where your code will be deployed. This `.pyc` issue is one reason, also: hidden dependencies on OS and utility patch levels, `.so` files, config files, other Python libs (if you're not running in a virtual env), obscure env vars ... the list goes on. To be thorough and find all such issues, you need to make a clean copy of your code in a git repo or publish as a package to a PyPi style server, and do a full clone or setup on a fresh VM. Some of those potential problems make this `.pyc` issue pale in comparison. – Chris Johnson Sep 29 '17 at 02:58