
I've been using GKE to deploy some public images, such as redis and postgres, but I keep running into a problem where certain images fail to pull, seemingly depending on the tag. The error I keep getting is:

Failed to pull image "postgres:alpine": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

I've been trying to find a pattern between the images that work and the ones that don't. Images without any tag seem to always work. Some examples of images that have worked:

  • redis:alpine
  • postgres

And ones that haven't:

  • postgres:alpine
  • postgres:12

I verified that I can pull all these images to my local machine using docker pull.

Here's an example Deployment manifest that I used:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - image: postgres:alpine
          name: postgres
          ports:
            - containerPort: 5432
              name: postgres

I'm hoping I missed something very obvious. Cheers.

zoran119
Aert
  • Hi mario; no it's a public registry (docker hub); what makes you say it clearly says authentication is required? I checked the logs as you suggested, but no details other than a lot of `Error syncing pod..., skipping: failed to "StartContainer" for "postgres" with ImagePullBackOff: "Back-off pulling image \"postgres:alpine\""` – Aert Jun 18 '20 at 10:34
  • Let's remove those comments, as my initial reasoning probably wasn't the most accurate ;) Looks like it's quite a common problem and the solution might be quite simple (see my answer). – mario Jun 19 '20 at 17:05
  • Btw. are you using the standard **Container-Optimized OS** or something different for your **GKE nodes**? – mario Jun 19 '20 at 17:09
  • Does this answer your question? [Error "Get https://registry-1.docker.io/v2/: net/http: request canceled" while building image](https://stackoverflow.com/questions/48056365/error-get-https-registry-1-docker-io-v2-net-http-request-canceled-while-b) – Black_Bacardi Jul 09 '20 at 10:02

1 Answer


> I'm hoping I missed something very obvious. Cheers.

I don't think you missed anything, and certainly nothing obvious that could easily be pointed out in your configuration.

I searched for information related to this issue, and it turns out it has already been widely discussed, e.g. here, with various solutions provided.

It was also reported on GitHub:

  1. https://github.com/docker/for-win/issues/611
  2. https://github.com/moby/moby/issues/32270

As well as on the Docker forums.

To sum up the findings:

  1. In some cases the issue seems to be related to an additional firewall or proxy. See also this post.
  2. It might be related to DNS settings; setting 8.8.8.8 as your primary DNS server often solves the problem. Many people in different GitHub threads reported that this solution worked for them, e.g. here or here.

Even simply restarting Docker might help ;)

The problems mentioned above are probably more likely with an on-premises Kubernetes installation.

As for GKE, it looks like a similar problem has also been reported. Comments in this public issue suggest that the problem may also occur in some newer GKE versions.

I found that this is also described in the official GKE docs. Normally you get a similar error message when working with private clusters, but it could suggest that even in standard GKE clusters the issue might be related to limited outbound connectivity to the public Internet.
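
One way to test the limited-outbound-connectivity hypothesis is to check, from a pod or node, whether the registry host resolves and accepts a TCP connection. The following is only a minimal sketch: the function name `diagnose_registry` and its return strings are made up for illustration, and it checks DNS resolution and a plain TCP connect only, not TLS or registry authentication.

```python
import socket


def diagnose_registry(host="registry-1.docker.io", port=443, timeout=5.0):
    """Return 'ok', 'dns-failure', or 'connect-failure' for host:port.

    Hypothetical helper for illustration; not part of any Docker or GKE tooling.
    """
    try:
        # Step 1: name resolution (fails if DNS is misconfigured).
        addrinfo = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"

    # Step 2: try a TCP connection to each resolved address in turn.
    for family, socktype, proto, _canonname, sockaddr in addrinfo:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            # Covers timeouts, refused connections, and unreachable networks.
            continue
    return "connect-failure"
```

Calling `print(diagnose_registry())` from inside the cluster separates the two suspects discussed above: a `dns-failure` result points towards DNS settings, while a `connect-failure` result would be consistent with missing outbound Internet access, as in a private cluster without NAT.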

mario
  • Brilliant! You're right, it was from pods in a private cluster and hence no public internet connectivity. Very confusing that some public images worked though, but that must be because they were still available from a cache somewhere I guess. – Aert Jun 20 '20 at 05:34
  • Thanks @Aert. Been pounding my head against the table trying to figure out how such an error could be present on private GKE clusters until I realized that's the catch. – Lukas Aug 04 '22 at 06:36
  • I noticed a near-duplicate question and answered there; it pretty much answers this as well. https://stackoverflow.com/a/76302975/2548914 – neoakris May 22 '23 at 04:17