
On one of my AWS servers I manually started a detached Docker container running willnorris/imageproxy. Without warning, it seems to go down after a few days, for no apparent (external) reason. I checked the container logs and the syslog and found nothing.

How can I find out what goes wrong (this happens every time)?

This is how I start it:

ubuntu@local:~ $ ssh ubuntu@my_aws_box
ubuntu@aws_box:~ $ docker run -dp 8081:8080 willnorris/imageproxy -addr 0.0.0.0:8080

Typically, this is what I do when it seems to have crashed:

ubuntu@aws_box:~$ docker ps -a
CONTAINER ID   IMAGE                                                              COMMAND                  CREATED        STATUS                    PORTS                                                                            NAMES
de63701bbc82   willnorris/imageproxy                                              "/app/imageproxy -ad…"   10 days ago    Exited (2) 7 days ago                                                                                      frosty_shockley

ubuntu@aws_box:~$ docker logs de63701bbc82
imageproxy listening on 0.0.0.0:8080
2021/08/04 00:46:42 error copying response: write tcp 172.17.0.2:8080->172.17.0.1:38568: write: broken pipe
2021/08/04 00:46:42 error copying response: write tcp 172.17.0.2:8080->172.17.0.1:38572: write: broken pipe
2021/08/04 01:29:18 invalid request URL: malformed URL "/jars": too few path segments
2021/08/04 01:29:18 invalid request URL: malformed URL "/service/extdirect": must provide absolute remote URL
2021/08/04 11:09:49 invalid request URL: malformed URL "/jars": too few path segments
2021/08/04 11:09:49 invalid request URL: malformed URL "/service/extdirect": must provide absolute remote URL
2021/08/04 13:04:33 error copying response: write tcp 172.17.0.2:8080->172.17.0.1:41036: write: broken pipe

As you can see, the logs tell me nothing about the crash and the only real thing I have to go by is the exit status: Exited (2) 7 days ago.
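
As far as I can tell, docker inspect doesn't reveal much more: it repeats the same exit code, though it does at least show whether the kernel OOM killer was involved. Something along these lines:

ubuntu@aws_box:~$ docker inspect --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}} finished={{.State.FinishedAt}}' de63701bbc82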

oligofren
  • Is this helpful? https://stackoverflow.com/questions/38112968/how-to-know-the-reason-why-a-docker-container-exits – Amin Aug 14 '21 at 06:13
  • It absolutely was! Found out the docker daemon was killed by a cron job that upgraded dependencies. – oligofren Aug 14 '21 at 15:28

1 Answer


As this exit seemed to originate outside of the container/Docker, I needed to find the right logs. A linked-to question (which essentially makes this a dupe) hinted at checking the systemd journal on unix systems. Running journalctl -u docker (essentially filtering the journal down to the Docker daemon's unit) showed that the Docker container was killed on August 6:

Aug 06 06:06:49 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:49.544825959Z" level=info msg="Processing signal 'terminated'"
Aug 06 06:06:49 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:49.836744355Z" level=info msg="ignoring event" container=de63701bbc828ca8bfcb895eeccae62bbda602d3be0508ceaf20fe76d7d018d5 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Aug 06 06:06:49 ip-192-168-3-117 containerd[885]: time="2021-08-06T06:06:49.837480333Z" level=info msg="shim disconnected" id=de63701bbc828ca8bfcb895eeccae62bbda602d3be0508ceaf20fe76d7d018d5
Aug 06 06:06:49 ip-192-168-3-117 containerd[885]: time="2021-08-06T06:06:49.840764380Z" level=warning msg="cleaning up after shim disconnected" id=de63701bbc828ca8bfcb895eeccae62bbda602d3be0508ceaf20fe76d7d018d5 namespace=moby
Aug 06 06:06:49 ip-192-168-3-117 containerd[885]: time="2021-08-06T06:06:49.840787254Z" level=info msg="cleaning up dead shim"
Aug 06 06:06:49 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:49.868008333Z" level=info msg="ignoring event" container=709e057de026ff11f783121c839c56938ea79dcd5965be1546cd6931beb5a903 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Aug 06 06:06:49 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:49.868091089Z" level=info msg="ignoring event" container=9219e652436aae8016145bf3e0681ff1bb7046f230338d8ab79f9ced9532e342 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Aug 06 06:06:49 ip-192-168-3-117 containerd[885]: time="2021-08-06T06:06:49.868916377Z" level=info msg="shim disconnected" id=9219e652436aae8016145bf3e0681ff1bb7046f230338d8ab79f9ced9532e342
Aug 06 06:06:51 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:51.068939160Z" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
Aug 06 06:06:51 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:51.069763813Z" level=info msg="Daemon shutdown complete"
Aug 06 06:06:51 ip-192-168-3-117 dockerd[1045]: time="2021-08-06T06:06:51.070022944Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
Aug 06 06:06:51 ip-192-168-3-117 systemd[1]: Stopped Docker Application Container Engine.
Aug 06 06:06:51 ip-192-168-3-117 systemd[1]: Starting Docker Application Container Engine...
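
As an aside, if the journal is busy, journalctl can also be narrowed by time instead of by unit; --since and --until accept plain timestamps, so something like this would show just the minutes around the shutdown:

ubuntu@aws_box:~$ journalctl --since "2021-08-06 06:00" --until "2021-08-06 06:10"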

Now, what killed it? To find that out, I needed to stop filtering by unit so that the preceding events were included, so I just ran journalctl | grep 'Aug 06' and found these lines preceding the ones above:

Aug 06 05:56:01 ip-192-168-3-117 systemd[1]: Starting Daily apt download activities...
Aug 06 05:56:11 ip-192-168-3-117 systemd[1]: Started Daily apt download activities.
Aug 06 06:06:39 ip-192-168-3-117 systemd[1]: Starting Daily apt upgrade and clean activities...
Aug 06 06:06:48 ip-192-168-3-117 systemd[1]: Reloading.
Aug 06 06:06:48 ip-192-168-3-117 systemd[1]: Starting Message of the Day...
Aug 06 06:06:48 ip-192-168-3-117 systemd[1]: Reloading.
Aug 06 06:06:49 ip-192-168-3-117 systemd[1]: Reloading.
Aug 06 06:06:49 ip-192-168-3-117 systemd[1]: Stopping Docker Application Container Engine...
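
On Ubuntu, the dpkg history should confirm that the Docker packages were actually part of that upgrade run; assuming the default log location, something like this lists what was upgraded that morning:

ubuntu@aws_box:~$ grep '^2021-08-06.*upgrade.*docker' /var/log/dpkg.log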

So this was basically caused by the daily apt upgrade job, which upgraded the Docker daemon and killed the old one! Since I had not started the container with --restart=always, it was not restarted after the daemon had respawned.
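
Given that, the fix on my end is simply to start the container with a restart policy, e.g.:

ubuntu@aws_box:~$ docker run -d --restart=always -p 8081:8080 willnorris/imageproxy -addr 0.0.0.0:8080

Docker's live-restore daemon option can reportedly also keep containers running across daemon restarts, but a restart policy alone covers this case.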

oligofren