I am using the following chart to deploy a Cassandra cluster to my GKE cluster: https://github.com/k8ssandra/k8ssandra/tree/main/charts/k8ssandra

However, the StatefulSet pod is stuck at 1/2 ready (the cassandra container is always reported as unhealthy).

Here's my values.yaml

cassandra:
  auth:
    superuser: 
      secret: cassandra-admin-secret
  clusterName: cassandra-cluster
  version: "4.0.0"
  cassandraLibDirVolume:
    storageClass: standard
    size: 5Gi
  allowMultipleNodesPerWorker: true
  resources:
    requests:
      cpu: 500m
      memory: 2Gi
    limits:
      cpu: 500m
      memory: 2Gi
  datacenters:
  - name: dc1
    size: 1
    racks:
    - name: default 

stargate:
  enabled: true
  replicas: 1
  heapMB: 256
  cpuReqMillicores: 200
  cpuLimMillicores: 500

kube-prometheus-stack:
  enabled: False

NAME                                                READY   STATUS     RESTARTS   AGE
cassandra-cluster-dc1-default-sts-0          1/2     Running    0          77m

And then I describe the pod

Events:
  Type     Reason     Age                    From     Message
  ----     ------     ----                   ----     -------
  Warning  Unhealthy  2m11s (x478 over 81m)  kubelet  Readiness probe failed: HTTP probe failed with statuscode: 500

Finally, I print the log of the cassandra container.

INFO  [nioEventLoopGroup-2-2] 2022-04-20 11:09:35,711 Cli.java:617 - address=/10.12.11.58:51000 url=/api/v0/metadata/endpoints status=500 Internal Server Error
INFO  [nioEventLoopGroup-3-14] 2022-04-20 11:09:37,718 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO  [nioEventLoopGroup-2-1] 2022-04-20 11:09:37,720 Cli.java:617 - address=/10.12.11.58:51132 url=/api/v0/metadata/endpoints status=500 Internal Server Error
INFO  [nioEventLoopGroup-3-15] 2022-04-20 11:09:37,750 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO  [nioEventLoopGroup-2-2] 2022-04-20 11:09:37,750 Cli.java:617 - address=/10.12.11.1:48478 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO  [nioEventLoopGroup-3-16] 2022-04-20 11:09:39,741 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock

And here are the logs of the server-system-logger container:

tail: cannot open '/var/log/cassandra/system.log' for reading: No such file or directory

How can I solve this problem? Thanks.

  • Hi, could you check the logs of the `server-system-logger` container? It'll show us if Cassandra is having issues on startup. – Alexander DEJANOVSKI Apr 20 '22 at 11:43
  • `kubectl logs cassandra-cluster-dc1-default-sts-0 -c server-system-logger` shows `tail: cannot open '/var/log/cassandra/system.log' for reading: No such file or directory` – user13118342 Apr 21 '22 at 01:32
  • Thanks, it looks like Cassandra hasn't started at all. Could you post the contents of `kubectl describe cassdc/dc1`? It looks like you're not setting a heap size for Cassandra. You should specify one that's at most half of the memory limit you've set. – Alexander DEJANOVSKI Apr 25 '22 at 05:56
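
The heap suggestion in the last comment could look roughly like the fragment below. This is only a sketch: it assumes the chart exposes the heap setting under `cassandra.heap.size`, which should be checked against the chart's own values.yaml. The 1G figure is an illustrative value chosen to be half of the 2Gi memory limit from the question.

```yaml
cassandra:
  # Assumption: the k8ssandra chart reads the heap size from cassandra.heap.size;
  # verify the key name against the chart's values.yaml before relying on it.
  heap:
    size: 1G          # illustrative: at most half of the 2Gi memory limit below
  resources:
    requests:
      cpu: 500m
      memory: 2Gi
    limits:
      cpu: 500m
      memory: 2Gi
```

The rest of the values.yaml from the question would stay unchanged.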

1 Answer

The message in the cassandra container log says the problem should resolve itself once Cassandra is up and running, which is correct.

Similarly, no logs are available from the server-system-logger container until Cassandra has started or, more precisely, until the logging framework has initialized.

John Sanda