
A colleague and I have a big Docker puzzle.

When we run the following commands we get different results.

docker run -it python:3.8.6 /bin/bash
pip install fbprophet

For me, it installs perfectly, while for him it produces an error and fails to install. I thought the whole point of Docker was to prevent this kind of issue, so I'm really puzzled.

I'm giving more details below, but my main question is:

  • How is it possible that we get different results?

More details:

We are both running Docker on new MacBook Pros with similar specs, on Catalina. His Docker Engine version (20.x.x) is slightly newer than mine (19.x.x). Also:

  • He tried all the commands he could think of to clean up things in Docker.
  • We verified that the hashes of the image IDs were the same.
  • Our resource settings were also the same.
  • He tried reinstalling Docker and changing to other versions of python (3.7).
  • We tried simultaneously on multiple occasions during the last three days.

The result was always the same: He gets the error and I don't.

The error he gets is the following.

Error:
Installing collected packages: six, pytz, python-dateutil, pymeeus, numpy, pyparsing, pillow, pandas, korean-lunar-calendar, kiwisolver, ephem, Cython, cycler, convertdate, tqdm, setuptools-git, pystan, matplotlib, LunarCalendar, holidays, cmdstanpy, fbprophet
    Running setup.py install for fbprophet ... error
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/setup.py'"'"'; __file__='"'"'/tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-7n8tvfkb/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/fbprophet
         cwd: /tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/
    Complete output (10 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib
    creating build/lib/fbprophet
    creating build/lib/fbprophet/stan_model
    Importing plotly failed. Interactive plots will not work.
    INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_dfdaf2b8ece8a02eb11f050ec701c0ec NOW.
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/setup.py'"'"'; __file__='"'"'/tmp/pip-install-l516b8ts/fbprophet_80d5f400081541a2bf6ee26d2785e363/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-7n8tvfkb/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/fbprophet Check the logs for full command output.

Note that running the two commands I provided always produces some errors, but they are not critical. Upgrading setuptools and installing the dependencies before fbprophet solves those minor errors. The error shown above is different: it is related to gcc and only happens to some people.

Optional additional questions:

  • How do we fix it?
  • How do we prevent non-reproducible results like this one?
  • Can upgrading the docker engine version break a container?
German Capuano
    The difference in Docker version might not affect this problem, because the error also appears in my environment: Docker version 19.03.13 on Ubuntu 20.04. – Akihito KIRISAKI Dec 19 '20 at 05:57
  • If someone has a better description for the title of the question, please let me know. I think it needs improvement. – German Capuano Dec 21 '20 at 17:47
  • If there is nothing you care about in your Docker setup, you should try `docker rm -f $(docker ps -aq) && docker system prune --all --volumes`, then try again on your Mac. I see the same behaviour as your colleague on my side (Big Sur with the latest Docker daemon). – β.εηοιτ.βε Dec 21 '20 at 23:36
  • Here is the log from running the given command on a fresh docker if anyone wants to take a quick look at it. http://dpaste.com//C5V8QKQLA – BcK Dec 22 '20 at 00:05
  • Perhaps the image for this tag changed. Can you compare your two `docker images` results? – Danny Varod Dec 22 '20 at 00:24
  • @β.εηοιτ.βε thank you. He tried those but it didn't work :( – German Capuano Dec 22 '20 at 14:41
  • @DannyVarod, is that different than checking the docker image IDs? Sorry, I'm not sure I follow. – German Capuano Dec 22 '20 at 14:41
  • @GermanCapuano it is the same as checking IDs, but different from checking tags, since a tag can be updated to point to a new ID. – Danny Varod Dec 22 '20 at 14:56
  • @GermanCapuano Your colleague should do nothing; he is seeing the correct behaviour. **You** should run those commands. – β.εηοιτ.βε Dec 22 '20 at 14:58
  • @GermanCapuano, could you pls accept the answer if it helped :) – Olesya Bolobova Dec 24 '20 at 21:59
  • @OlesyaBolobova, yes, I just waited a few days to see if anyone wanted to comment on it or disagree on something. No one did, so I accepted it and gave you the bounty. Thank you :) – German Capuano Dec 25 '20 at 22:05

2 Answers


How do we fix it?

Your error reports a GCC / compilation problem.
A quick search mostly turns up problems related to the Python / gcc version (one, two, three).
But you are right, that doesn't look like something that could differ inside one particular container.

What it does look like is some kind of OOM problem.

Also, is this a VM? Stan requires a significant amount of memory to compile the models, and this error can occur if you run out of RAM while it is compiling.

I did a bit of testing.
On my machine the compilation process consumed up to 2.4 GB of RAM.

cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)

uname -r
3.10.0-1160.6.1.el7.x86_64

docker --version
Docker version 20.10.1, build 831ebea

# works fine
docker run --rm -it -m 3G python:3.8.6 /bin/bash

# fails with error: command 'gcc' failed with exit status 1
# (the compiler was actually killed by the OOM killer)
docker run --rm -it -m 2G python:3.8.6 /bin/bash

# yes, there it is
tail -f /var/log/messages | grep -i 'killed process'
Dec 22 08:34:09 cent7-1 kernel: Killed process 5631 (cc1plus), UID 0, total-vm:2073600kB, anon-rss:1962404kB, file-rss:15332kB, shmem-rss:0kB
Dec 22 08:35:56 cent7-1 kernel: Killed process 5640 (cc1plus), UID 0, total-vm:2056816kB, anon-rss:1947392kB, file-rss:15308kB, shmem-rss:0kB
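
As a rough sketch, kill records like the ones above can be reduced to a PID and process name with a small filter. The function name `oom_victims` is made up for illustration, and the log format is assumed to match the lines shown here (it can differ between kernel versions).

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<pid> <process name>" for every OOM kill record found.
# Assumes the "Killed process <pid> (<name>)" format shown above.
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Possible usage on a Linux host (the log path depends on the distribution):
#   grep -i 'killed process' /var/log/messages | oom_victims
```

If `cc1plus` shows up in the output right after a failed install, the compiler ran out of memory. On Docker Desktop for Mac the containers run inside a Linux VM, so this log will likely not be at the same path on the Mac itself.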

Check the OOM killer log on the problematic machine.
Is there enough RAM available for Docker?
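
One way to answer that question from inside the container is to read the memory limit from the cgroup filesystem. The sketch below is only an illustration: the function name `container_mem_limit_mib` is hypothetical, and it assumes the usual cgroup v2 (`memory.max`) or cgroup v1 (`memory/memory.limit_in_bytes`) file layout under `/sys/fs/cgroup`.

```shell
# Hypothetical helper: print the memory limit of the current cgroup in MiB.
# The optional argument is the cgroup root (defaults to /sys/fs/cgroup).
container_mem_limit_mib() {
    root="${1:-/sys/fs/cgroup}"
    if [ -r "$root/memory.max" ]; then
        # cgroup v2 layout
        limit=$(cat "$root/memory.max")
    elif [ -r "$root/memory/memory.limit_in_bytes" ]; then
        # cgroup v1 layout
        limit=$(cat "$root/memory/memory.limit_in_bytes")
    else
        echo "unknown"
        return 1
    fi
    if [ "$limit" = "max" ]; then
        # cgroup v2 reports "max" when no limit is set
        echo "unlimited"
    else
        # cgroup v1 reports a very large number when no limit is set
        echo $(( limit / 1024 / 1024 ))
    fi
}
```

Running it in a container started with `-m 2G` should print `2048`. Without `-m`, the effective ceiling is the memory of the host, or of the Docker Desktop VM on a Mac, which `docker info --format '{{.MemTotal}}'` reports in bytes.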


Can upgrading the docker engine version break a container?

Generally, it shouldn't.
But v20.10.0 of Docker introduced a large set of changes related to memory and cgroups.

After you rule out all obvious reasons (like your friend's machine just not having enough RAM), you might need to dig into your docker daemon settings related to memory / cgroups / etc.


How can the same container produce different results on two computers?

Well, technically it's quite possible.
Containerized programs still use the host OS kernel.
Not all kernel settings are "namespaced", i.e. settable for one particular container only.
Many of them (actually, most) are still global and can affect your program's behavior.

I don't think that's related to your problem, though.
But for complicated programs that rely on specific kernel settings, it must be taken into account.
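
To see the shared kernel in practice, the same small check can be run in a container on each machine and the output compared. This is only a sketch: `host_kernel_fingerprint` is a made-up name, and `vm.overcommit_memory` is picked as one example of a global setting that is not set per container.

```shell
# Hypothetical helper: print facts that come from the kernel the container
# runs on, not from the image.
host_kernel_fingerprint() {
    # The kernel release is that of the host (or of the Docker VM on a Mac).
    echo "kernel=$(uname -r)"
    # Example of a global, non-namespaced kernel setting.
    if [ -r /proc/sys/vm/overcommit_memory ]; then
        echo "vm.overcommit_memory=$(cat /proc/sys/vm/overcommit_memory)"
    fi
}
```

A quicker variant is `docker run --rm python:3.8.6 uname -r` on both machines: the image is identical, yet the reported kernel can differ.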

Olesya Bolobova
  • Oh, wow!! It was the memory. He increased the memory and it worked!!!! It seems that by chance we were around the RAM limit, and the differences in docker were enough for one to fail and not the other. – German Capuano Dec 22 '20 at 15:13

Here is a solution. This problem is not only a Docker matter; fbprophet itself causes it. To avoid it, install the dependencies before fbprophet:

docker run -it python:3.8.6 /bin/bash
pip install numpy pandas blahblah...
pip install fbprophet
Akihito KIRISAKI