254

I was watching a tutorial on how to dockerize my Django application. I did not understand why we set PYTHONUNBUFFERED as an environment variable in the Dockerfile.

Can anyone explain?

Zeitounator
MayankBudhiraja

3 Answers

392

Setting PYTHONUNBUFFERED to a non-empty value other than 0 ensures that Python's output, i.e. the stdout and stderr streams, is sent straight to the terminal (e.g. your container log) without first being buffered, so you can see the output of your application (e.g. Django logs) in real time.

This also ensures that no partial output is left sitting in a buffer and lost if the Python application crashes.
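As a rough illustration of the crash scenario, the sketch below runs a small child interpreter that prints one line and then dies abruptly. The child program, the `run_child` helper and the use of `os._exit` to simulate a crash are assumptions made for this example, not something taken from the tutorial. The child's stdout is a pipe, which Python block-buffers by default, much like a container's stdout when no TTY is attached.

```python
import os
import subprocess
import sys

# Child program: print one line, then die abruptly. os._exit() skips the
# normal interpreter shutdown, so anything still buffered is never flushed.
CHILD = 'import os; print("about to crash"); os._exit(1)'


def run_child(unbuffered: bool) -> str:
    """Run the child with or without PYTHONUNBUFFERED and return its stdout."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # start from a known state
    if unbuffered:
        env["PYTHONUNBUFFERED"] = "1"
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,  # a pipe, not a terminal: block-buffered by default
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("buffered:  ", repr(run_child(unbuffered=False)))
    print("unbuffered:", repr(run_child(unbuffered=True)))
```

With buffering left on, the line should never reach the pipe, while with PYTHONUNBUFFERED=1 it is written before the process dies.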

Since this has been mentioned in several comments and supplementary answers, note that PYTHONUNBUFFERED has absolutely no influence on the input (i.e. the stdin stream).

In other words, turning off buffering of stdout/stderr in a docker container is mainly about getting as much information as possible from your running application into the container log as fast as possible, and not losing anything in case of a crash.

Note that turning buffering off can affect performance, depending on your hardware/environment. The impact should be minor in most situations (unless you have slow disks, are writing a tremendous amount of logs, or had the bad idea to configure your docker daemon to write your logs to a slow network drive...). If this is a concern, you can leave buffering on and flush the buffer directly from your application when needed. See link [4] below on this subject.
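A minimal sketch of flushing from the application itself, assuming buffering is left on. The `log` helper is a hypothetical name introduced for this example; it relies only on the `flush=True` argument of the built-in `print`. Calling `sys.stdout.flush()` at chosen points would be an equivalent approach.

```python
import sys


def log(message: str, stream=None) -> None:
    """Write one line and flush it immediately (hypothetical helper)."""
    stream = sys.stdout if stream is None else stream
    # flush=True pushes this line (and anything already pending on the
    # stream) out right away instead of waiting for the buffer to fill.
    print(message, file=stream, flush=True)


if __name__ == "__main__":
    print("routine message, may sit in the buffer for a while")
    log("important message, written out immediately")
```

This keeps buffered writes for routine output and pays the flushing cost only for the messages you cannot afford to lose.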

References:

Zeitounator
  • Force the stdout and stderr streams to be unbuffered. This option has no effect on the stdin stream. See also PYTHONUNBUFFERED. Changed in version 3.7: The text layer of the stdout and stderr streams now is unbuffered. – Louis Huang Jun 18 '22 at 10:13
  • 3
    FYI: Setting `PYTHONUNBUFFERED=0` has the same effect as unset or an empty string. This is because CPython tries to parse the env var using `strtol(3)` (full-string match). – iBug Jul 25 '22 at 10:40
  • Isn't python buffering meant as an optimization to not be waiting on io for every little printed word? Shouldn't this option be exclusively used in development scenarios? – N1ngu Oct 29 '22 at 08:44
  • 1
    @N1ngu Remember we are talking about running python inside docker here. Besides the links already given above, here are a few more I could find that discuss this exact problem. Mainly, the point is to have the logs in real time in case something goes wrong and the container crashes. You can probably find more: https://stackoverflow.com/questions/39486327/stdout-being-buffered-in-docker-container http://www.pixelbeat.org/programming/stdio_buffering/ https://github.com/docker/compose/issues/1838 https://serverfault.com/questions/940281/why-doesnt-my-docker-actually-log-anything ... – Zeitounator Oct 29 '22 at 10:16
  • @Zeitounator thanks for all the links and the increased answer detail. I still do not understand **why this is an issue in docker but not in containerless deployments**. Couldn't a bare process die before flushing any file descriptor? Why did nobody care about this before docker existed? For a typical app using standard logging, `logging.shutdown` will already flush all handlers at exit, and AFAIU also when any error event is recorded. Why would anyone worry about it? (Besides buffering-unaware devs expecting realtime `print("foobar")` instructions) – N1ngu Oct 31 '22 at 18:40
  • Maybe because docker images are meant, on most occasions, to be deployed on systems where logs are directed to a sink (like kubernetes or openshift) and you expect to get everything in there even after a crash. This is not a forum but a Q/A site. If you feel like this deserves a question, just ask one. Note: I would disable buffering of python output for django on a classic server as well if I had to run in such a scenario. You are free to do as you wish and I don't think this deserves an opinion-based discussion here. – Zeitounator Nov 01 '22 at 21:07
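
The point raised in the comments above, that `PYTHONUNBUFFERED=0` behaves like an unset variable, can be checked with a small sketch like the following. It assumes CPython 3.7 or later (where `sys.stdout.write_through` exists and reflects unbuffered mode); the `stdout_is_unbuffered` helper is a hypothetical name for this example.

```python
import os
import subprocess
import sys

# Child program: report whether its stdout text layer is write-through,
# which is how CPython 3.7+ sets it up in unbuffered mode.
CHILD = "import sys; print(sys.stdout.write_through)"


def stdout_is_unbuffered(value):
    """Run a child interpreter with PYTHONUNBUFFERED=value (None = unset)."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # start from a known state
    if value is not None:
        env["PYTHONUNBUFFERED"] = value
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    for value in (None, "", "0", "1", "yes"):
        print(repr(value), "->", stdout_is_unbuffered(value))
```

On older interpreters the result for `0` may differ, so treat this as a check to run against your own base image rather than a guarantee.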
16

A non-empty PYTHONUNBUFFERED value forces the stdout and stderr streams to be unbuffered. This option has no effect on the stdin stream.

jtlz2
Shreeyansh Jain
6

This instructs Python to run in unbuffered mode, which is recommended when running Python inside a Docker container. In this mode Python does not buffer its output but writes it out directly, which avoids complications such as log output being lost when the application crashes inside the container.

  • What complications? I am looking to understand the rationale behind this and I'd welcome some links. So far I fear this is a massive mindless copy-paste from a Django tutorial. I never saw a production deployment outside Docker use this. – N1ngu Oct 29 '22 at 08:47
  • 1
    @N1ngu I fear this is a comment that was written a bit too fast, without proper prior research, and that contains a premature, subjective (and possibly hurtful) judgement of the author's skills. Having a major crash of an application (in development or in production) without any trace in the logs can be a major obstacle to understanding what exactly happened. This can be called a complication (even if I agree that details should be given). – Zeitounator Oct 30 '22 at 07:52
  • @Zeitounator that's fair – N1ngu Oct 31 '22 at 09:16