
I've gone through the Azure Cats&Dogs tutorial described here and I am getting an error in the final step, where the apps are launched in AKS. Kubernetes is reporting that I have insufficient pods, but I'm not sure why. I ran through this same tutorial a few weeks ago without problems.

$ kubectl apply -f azure-vote-all-in-one-redis.yaml
deployment.apps/azure-vote-back created
service/azure-vote-back created
deployment.apps/azure-vote-front created
service/azure-vote-front created

$ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
azure-vote-back-655476c7f7-mntrt    0/1     Pending   0          6s
azure-vote-front-7c7d7f6778-mvflj   0/1     Pending   0          6s

$ kubectl get events
LAST SEEN   TYPE      REASON                 KIND         MESSAGE
3m36s       Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
84s         Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
70s         Warning   FailedScheduling       Pod          skip schedule deleting pod: default/azure-vote-back-655476c7f7-l5j28
9s          Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
53m         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-kjld6
99s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-l5j28
24s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-mntrt
53m         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
99s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
24s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
9s          Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
3m36s       Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
53m         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-front-7c7d7f6778-rmbqb
24s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-front-7c7d7f6778-mvflj
53m         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-front-7c7d7f6778 to 1
53m         Normal    EnsuringLoadBalancer   Service      Ensuring load balancer
52m         Normal    EnsuredLoadBalancer    Service      Ensured load balancer
46s         Normal    DeletingLoadBalancer   Service      Deleting load balancer
24s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-front-7c7d7f6778 to 1

$ kubectl get nodes
NAME                       STATUS   ROLES   AGE    VERSION
aks-nodepool1-27217108-0   Ready    agent   7d4h   v1.9.9

The only thing I can think of that has changed is that I now have other (larger) clusters running as well; in fact, the main reason I went through this Cats&Dogs tutorial again was that I hit this same problem today with my other clusters. Is this a resource limit issue with my Azure account?

Update 10-20/3:15 PST: Notice how these three clusters all show that they use the same nodepool, even though they were created in different resource groups. Also note how the "get-credentials" call for gem2-cluster reports an error. I did have an earlier cluster called gem2-cluster, which I deleted and recreated using the same name (in fact, I deleted the whole resource group). What's the correct process for doing this?

$ az aks get-credentials --name gem1-cluster --resource-group gem1-rg
Merged "gem1-cluster" as current context in /home/psteele/.kube/config

$ kubectl get nodes -n gem1
NAME                       STATUS   ROLES   AGE     VERSION
aks-nodepool1-27217108-0   Ready    agent   3h26m   v1.9.11

$ az aks get-credentials --name gem2-cluster --resource-group gem2-rg
A different object named gem2-cluster already exists in clusters

$ az aks get-credentials --name gem3-cluster --resource-group gem3-rg
Merged "gem3-cluster" as current context in /home/psteele/.kube/config

$ kubectl get nodes -n gem1
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11

$ kubectl get nodes -n gem2
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11

$ kubectl get nodes -n gem3
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11
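
A likely fix for the gem2-cluster "already exists" error above (a sketch, assuming the stale kubeconfig entries left over from the deleted cluster are the cause) is to remove the old context, cluster, and user entries, or simply overwrite them:

$ # Remove the stale entries the deleted cluster left behind
$ # (the user entry name follows AKS's clusterUser_<rg>_<cluster> convention;
$ #  check `kubectl config view` if yours differs)
$ kubectl config delete-context gem2-cluster
$ kubectl config delete-cluster gem2-cluster
$ kubectl config unset users.clusterUser_gem2-rg_gem2-cluster

$ # Or let the CLI replace them in one step
$ az aks get-credentials --name gem2-cluster --resource-group gem2-rg --overwrite-existing
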
  • paste output from `kubectl get nodes`? – 4c74356b41 Oct 19 '18 at 18:29
  • Done. I see that the age here shows 7 days. That doesn't look right. – user3280383 Oct 19 '18 at 19:40
  • I say it doesn't look right because I just created the cluster today and I'd expect that age column to reflect that. – user3280383 Oct 19 '18 at 19:59
  • The error suggests the node is offline. Can you check the portal to see if it's actually running? – 4c74356b41 Oct 19 '18 at 20:33
  • I believe the issue is my confusion about the role that 'az aks get-credentials' plays. I forgot to run that command when I created my Cats&Dogs cluster, so operations were targeted at a cluster I had created previously. I did in fact have two large clusters running, and it does seem I've exhausted my resources. – user3280383 Oct 20 '18 at 21:58
  • So the question is, is there a way to create separate clusters with their own nodepools? – user3280383 Oct 20 '18 at 22:15
  • this is a separate question, but yes, you can do that. just create a new cluster with az aks create (or any other convenient way) – 4c74356b41 Oct 21 '18 at 06:48
  • Yes, that is a separate question. I consider the original answered. Thanks for the help. – user3280383 Oct 21 '18 at 13:24

3 Answers

What is your max-pods set to? This is a normal error when you've reached the limit of pods per node.

You can check your current maximum number of pods per node with:

$ kubectl get nodes -o yaml | grep pods
  pods: "30"
  pods: "30"

And your current pod count with:

$ kubectl get pods --all-namespaces | grep Running | wc -l
  18
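
For a per-node view (a sketch; the node name comes from the question's output, substitute your own), `kubectl describe node` shows the pod capacity alongside how many pods are already scheduled there:

$ # "pods:" appears under both Capacity and Allocatable;
$ # "Non-terminated Pods" counts what is currently scheduled on the node
$ kubectl describe node aks-nodepool1-27217108-0 | grep -E "pods:|Non-terminated"
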
  • Worth checking without the `grep Running` filter, as this one caught me out. I had about 700 pods in a pending state due to cron job image pull failures. Thx. – quintindk Jan 17 '19 at 06:18
  • Is the max number of pods per node something that we can configure? – Stephen Mar 16 '19 at 21:54
  • @Stephen Yes, but currently it's only configurable during creation of the AKS cluster, and I think only via the command line. Look at the argument "--max-pods" under "az aks create -h". – Lipsum Apr 02 '19 at 08:41
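
For reference, a sketch of setting the limit at creation time, per the comment above (the resource group, cluster name, and the value 50 are only examples):

$ # --max-pods can only be chosen when the cluster (or node pool) is created
$ az aks create \
    --resource-group myResourceGroup \
    --name myAKSCluster \
    --node-count 1 \
    --max-pods 50 \
    --generate-ssh-keys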

I hit this because I exceeded the max pods. I found out how many pods I could handle by running:

$ kubectl get nodes -o json | jq -r .items[].status.allocatable.pods | paste -sd+ - | bc
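
If jq isn't available, roughly the same information can be pulled with kubectl alone (a sketch using custom columns):

$ # Allocatable pod count per node, no jq required
$ kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods
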
  • `kubectl get nodes -o json` and locating `items[].status.allocatable.pods` worked. It was 4, and 4 pods were already running in the system namespace. – inaitgaJ Sep 06 '19 at 07:03

Check to make sure you are not hitting core limits for your subscription.

az vm list-usage --location "<location>" -o table

If you are, you can request more quota: https://learn.microsoft.com/en-us/azure/azure-supportability/resource-manager-core-quotas-request
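
To spot exhausted quotas quickly, the output can be filtered with a JMESPath query (a sketch; the location is a placeholder, and this assumes currentValue and limit are returned as numbers):

$ # Show only the quotas that are fully used up
$ az vm list-usage --location "eastus" --query "[?currentValue >= limit]" -o table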
