
I have a simple Node.js service. Even though I get an "Insufficient cpu" warning, the deployment eventually works. I think this is because I am using Autopilot and new nodes are provisioned automatically.

Starting deploy...
 - deployment.apps/auth-depl configured
 - service/auth-srv configured
 - ingress.networking.k8s.io/ingress-service unchanged
Waiting for deployments to stabilize...
 - deployment/auth-depl: 0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
    - pod/auth-depl-f746f9cdd-pckr9: 0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
 - deployment/auth-depl: creating container auth-pod
    - pod/auth-depl-f746f9cdd-pckr9: creating container auth-pod
 - deployment/auth-depl: waiting for rollout to finish: 1 old replicas are pending termination...
 - deployment/auth-depl is ready.
Deployments stabilized in 2 minutes 15.209 seconds

I am confused about why CPU would be insufficient when I have a very simple application that returns 'hi' to a GET / request (full code).

If the cluster is scaled up, shouldn't the warning stop appearing on subsequent builds? (scale up)
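
For reference, one way to stop relying on Autopilot's defaults would be to declare resource requests explicitly on the container. A minimal, illustrative fragment of what that might look like for auth-depl (the container name auth-pod comes from the deploy log above; the request values and the omitted fields are placeholders, not the actual manifest):

# fragment of auth-depl's pod template with explicit requests (illustrative values)
spec:
  template:
    spec:
      containers:
        - name: auth-pod        # container name as shown in the deploy log
          # image: (keep the existing auth image here)
          resources:
            requests:
              cpu: 250m         # illustrative; Autopilot may adjust requests to its allowed minimums
              memory: 512Mi     # illustrative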

When I add a MongoDB deployment, it works the first time (after deleting the current mongo deployment).

# auth-mongo-depl.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth-mongo-depl
spec:
  selector:
    matchLabels:
      app: auth-mongo-pod-label
  template:
    metadata:
      labels:
        app: auth-mongo-pod-label
    spec:
      containers:
        - name: mongo
          image: mongo
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: storage
              mountPath: /data/db
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: mongo-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: auth-mongo-srv
spec:
  selector:
    app: auth-mongo-pod-label
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
Starting deploy...
 - deployment.apps/auth-depl configured
 - service/auth-srv created
 - persistentvolumeclaim/mongo-pvc unchanged
 - Warning: Autopilot set default resource requests for Deployment default/auth-mongo-depl, as resource requests were not specified. See http://g.co/gke/autopilot-defaults
 - deployment.apps/auth-mongo-depl created
 - service/auth-mongo-srv created
 - ingress.networking.k8s.io/ingress-service unchanged
Waiting for deployments to stabilize...
 - deployment/auth-depl: 0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
    - pod/auth-depl-7cf7878d76-t666t: 0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
 - deployment/auth-mongo-depl: 0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
    - pod/auth-mongo-depl-8f5997d87-5gqwd: 0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
 - deployment/auth-depl is ready. [1/2 deployment(s) still pending]
 - deployment/auth-mongo-depl is ready.
Deployments stabilized in 2 minutes 35.184 seconds
You can also run [skaffold run --tail] to get the logs

PUSH
DONE
--------------------------------------------------------------------------------------------------------------------------------------------
ID                                    CREATE_TIME                DURATION  SOURCE                                                                                             IMAGES  STATUS
5ae8bbcc-ab76-4994-b700-af0d7c99598f  2023-03-15T16:02:35+00:00  3M12S     gs://disco-ethos-380514_cloudbuild/source/1678895981.530669-3ad4d45463f54be2954ab1cfe3387583.tgz  -       SUCCESS

pod details

I am not sure why scheduling would fail on all 3 nodes, or why attaching the volume would fail at first.

(I just noticed the "does not contain driver" error, but the documentation says the driver should be enabled:

The Compute Engine persistent disk CSI Driver is always enabled in Autopilot clusters and can't be disabled or edited. - documentation

And it does show as enabled: driver enabled.)

However, for the next build, it doesn't work.

Starting deploy...
 - deployment.apps/auth-depl configured
 - service/auth-srv configured
 - persistentvolumeclaim/mongo-pvc unchanged
 - deployment.apps/auth-mongo-depl configured
 - service/auth-mongo-srv configured
 - ingress.networking.k8s.io/ingress-service unchanged
Waiting for deployments to stabilize...
 - deployment/auth-depl: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
    - pod/auth-depl-6fd9ffcbb4-rrbdp: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
 - deployment/auth-mongo-depl: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
    - pod/auth-mongo-depl-79b5cfb9d7-z99xv: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
 - deployment/auth-depl: creating container auth-pod
    - pod/auth-depl-6fd9ffcbb4-rrbdp: creating container auth-pod
 - deployment/auth-mongo-depl: Unschedulable: 0/1 nodes available: 1 node is not ready
    - pod/auth-mongo-depl-79b5cfb9d7-z99xv: Unschedulable: 0/1 nodes available: 1 node is not ready
 - deployment/auth-depl is ready. [1/2 deployment(s) still pending]
 - deployment/auth-mongo-depl: creating container mongo
    - pod/auth-mongo-depl-79b5cfb9d7-z99xv: creating container mongo
 - deployment/auth-mongo-depl failed. Error: creating container mongo.
1/2 deployment(s) failed
ERROR
ERROR: build step 0 "gcr.io/k8s-skaffold/skaffold:v2.2.0" failed: step exited with non-zero status: 1
--------------------------------------------------------------------------------------------------------------------------------------------

BUILD FAILURE: Build step failure: build step 0 "gcr.io/k8s-skaffold/skaffold:v2.2.0" failed: step exited with non-zero status: 1
ERROR: (gcloud.builds.submit) build 16b405f8-bce8-4851-ad9b-d67974fc39ae completed with status "FAILURE"

The second pod is stuck in "ContainerCreating" (2 pods).

failed mount

Based on this answer, the failure to mount was a bug. I also found this, but I am still confused about why it works on the first build and not on subsequent ones.


The other answers I found, such as this one, involve changing some settings, but I have autoscaling enabled precisely to hide this complexity.

auto scaling

I doubt I am hitting any quota limits, and the documentation about PVCs doesn't mention anything that has to be enabled manually.

Edit: It seems the issue is with ReadWriteOnce, as detailed here. Many pods can read while only one writes, but only within a single node. During the build phase the same node cannot be reused because of a taint, so two nodes end up trying to attach the same volume. I tried using ReadWriteMany.
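
Roughly, the only change is the access mode; the new claim name matches the PVC shown being created in the deploy output below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: auth-mongo-pvc
spec:
  accessModes:
    - ReadWriteMany   # allow the volume to be mounted read-write from more than one node
  resources:
    requests:
      storage: 50Mi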

Starting deploy...
 - deployment.apps/auth-depl configured
 - service/auth-srv configured
 - persistentvolumeclaim/auth-mongo-pvc created
 - Warning: Autopilot set default resource requests for Deployment default/auth-mongo-depl, as resource requests were not specified. See http://g.co/gke/autopilot-defaults
 - deployment.apps/auth-mongo-depl created
 - service/auth-mongo-srv created
 - deployment.apps/react-client-depl configured
 - service/react-client-srv configured
 - ingress.networking.k8s.io/ingress-service configured
Waiting for deployments to stabilize...
 - deployment/auth-depl: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
    - pod/auth-depl-55b48d8f5b-fzq9c: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
 - deployment/auth-mongo-depl: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
    - pod/auth-mongo-depl-6c8cc4dd4c-jdzxw: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
 - deployment/react-client-depl: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
    - pod/react-client-depl-59d7d8fd7-drlhg: 0/4 nodes are available: 4 Insufficient cpu, 4 Insufficient memory. preemption: 0/4 nodes are available: 4 No preemption victims found for incoming pod.
 - deployment/react-client-depl: Unschedulable: 0/1 nodes available: 1 node is not ready
    - pod/react-client-depl-59d7d8fd7-drlhg: Unschedulable: 0/1 nodes available: 1 node is not ready
 - deployment/auth-depl is ready. [2/3 deployment(s) still pending]
 - deployment/react-client-depl is ready. [1/3 deployment(s) still pending]
 - deployment/auth-mongo-depl: Unschedulable: 0/1 nodes available: 1 node is not ready
    - pod/auth-mongo-depl-6c8cc4dd4c-jdzxw: Unschedulable: 0/1 nodes available: 1 node is not ready
 - deployment/auth-mongo-depl failed. Error: Unschedulable: 0/1 nodes available: 1 node is not ready.
1/3 deployment(s) failed
ERROR
ERROR: build step 0 "gcr.io/k8s-skaffold/skaffold:v2.2.0" failed: step exited with non-zero status: 1

NotTriggerScaleUp

    I would suggest contacting the Support team to look into your project for further investigation. – Dion V Mar 15 '23 at 21:57

1 Answer


For a quick fix, do not use rolling updates: with the Recreate strategy, the old pod is terminated (and its volume released) before the new pod is created.

spec:
  strategy:
    type: Recreate

details
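
Applied to the MongoDB Deployment from the question, this would look roughly as follows (pod template omitted; it stays the same as in auth-mongo-depl.yml):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth-mongo-depl
spec:
  strategy:
    type: Recreate   # terminate the old pod (releasing its ReadWriteOnce volume) before creating the new one
  selector:
    matchLabels:
      app: auth-mongo-pod-label
  template:
    # ...pod template unchanged from auth-mongo-depl.yml...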
