
I'm trying to set up a Redis cluster and I followed this guide here: https://rancher.com/blog/2019/deploying-redis-cluster/

Basically I'm creating a StatefulSet with 6 replicas, so that I can have 3 master nodes and 3 slave nodes. After all the nodes are up, I create the cluster, and it all works fine... but if I look into the file "nodes.conf" (where the configuration of all the nodes should be saved) of each Redis node, I can see it's empty. This is a problem, because whenever a Redis node gets restarted, it looks in that file for its node configuration, so it can update its own IP address and MEET the other nodes; but it finds nothing, so it basically starts a new cluster on its own, with a new ID.

My storage is a shared folder mounted over NFS. The YAML responsible for the storage access is this one:

kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: nfs-provisioner-raid5
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-provisioner-raid5
    spec:
      serviceAccountName: nfs-provisioner-raid5
      containers:
        - name: nfs-provisioner-raid5
          image: quay.io/external_storage/nfs-client-provisioner:latest
          volumeMounts:
            - name: nfs-raid5-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: 'nfs.raid5'
            - name: NFS_SERVER
              value: 10.29.10.100
            - name: NFS_PATH
              value: /raid5
      volumes:
        - name: nfs-raid5-root
          nfs:
            server: 10.29.10.100
            path: /raid5
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-provisioner-raid5
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs.raid5
provisioner: nfs.raid5
parameters:
  archiveOnDelete: "false"

This is the YAML of the redis cluster StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
  labels:
    app: redis-cluster
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
      - name: redis
        image: redis:5-alpine
        ports:
        - containerPort: 6379
          name: client
        - containerPort: 16379
          name: gossip
        command: ["/conf/fix-ip.sh", "redis-server", "/conf/redis.conf"]
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "redis-cli -h $(hostname) ping"
          initialDelaySeconds: 15
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "redis-cli -h $(hostname) ping"
          initialDelaySeconds: 20
          periodSeconds: 3
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        volumeMounts:
        - name: conf
          mountPath: /conf
          readOnly: false
        - name: data
          mountPath: /data
          readOnly: false
      volumes:
      - name: conf
        configMap:
          name: redis-cluster
          defaultMode: 0755
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        name: redis-cluster
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: nfs.raid5
      resources:
        requests:
          storage: 1Gi
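
For completeness, the serviceName above refers to a headless Service that I created following the guide. I didn't paste it here, so the manifest below is a reconstruction; treat the exact fields as an assumption:

apiVersion: v1
kind: Service
metadata:
  name: redis-cluster        # must match serviceName in the StatefulSet
  labels:
    app: redis-cluster
spec:
  clusterIP: None            # headless: gives each pod a stable DNS name
  selector:
    app: redis-cluster
  ports:
  - name: client
    port: 6379
    targetPort: 6379
  - name: gossip
    port: 16379
    targetPort: 16379

With this in place, each pod also gets a stable DNS name such as redis-cluster-0.redis-cluster, which is what the "POD identity" suggestion in the comments relies on.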

This is the configMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster
  labels:
    app: redis-cluster
data:
  fix-ip.sh: |
    #!/bin/sh
    CLUSTER_CONFIG="/data/nodes.conf"
    echo "creating nodes"
    if [ -f ${CLUSTER_CONFIG} ]; then
      if [ -z "${POD_IP}" ]; then
        echo "Unable to determine Pod IP address!"
        exit 1
      fi
      echo "Updating my IP to ${POD_IP} in ${CLUSTER_CONFIG}"
      sed -i.bak -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${CLUSTER_CONFIG}
      echo "done"
    fi
    exec "$@"
  redis.conf: |+
    cluster-enabled yes
    cluster-require-full-coverage no
    cluster-node-timeout 15000
    cluster-config-file /data/nodes.conf
    cluster-migration-barrier 1
    appendonly yes
    protected-mode no

and I created the cluster using the command:

kubectl exec -it redis-cluster-0 -- redis-cli --cluster create --cluster-replicas 1 $(kubectl get pods -l app=redis-cluster -o jsonpath='{range.items[*]}{.status.podIP}:6379 ')

What am I doing wrong? This is what I see in the /data folder (screenshot of the directory listing omitted): the nodes.conf file shows 0 bytes.
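
For comparison, on a healthy node I would expect nodes.conf to contain one line per cluster node plus a trailing "vars" line, roughly like this (the node ID and IP are taken from the log below; the slot range, timestamps and epochs are made up for illustration):

e55801f9b5d52f4e599fe9dba5a0a1e8dde2cdcb 10.40.0.27:6379@16379 myself,master - 0 1573131230000 1 connected 0-5460
... one similar line for each of the other five nodes ...
vars currentEpoch 6 lastVoteEpoch 0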

Lastly, this is the log from the redis-cluster-0 pod:

creating nodes
1:C 07 Nov 2019 13:01:31.166 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 07 Nov 2019 13:01:31.166 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 07 Nov 2019 13:01:31.166 # Configuration loaded
1:M 07 Nov 2019 13:01:31.179 * No cluster configuration found, I'm e55801f9b5d52f4e599fe9dba5a0a1e8dde2cdcb
1:M 07 Nov 2019 13:01:31.182 * Running mode=cluster, port=6379.
1:M 07 Nov 2019 13:01:31.182 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 07 Nov 2019 13:01:31.182 # Server initialized
1:M 07 Nov 2019 13:01:31.182 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 07 Nov 2019 13:01:31.185 * Ready to accept connections
1:M 07 Nov 2019 13:08:04.264 # configEpoch set to 1 via CLUSTER SET-CONFIG-EPOCH
1:M 07 Nov 2019 13:08:04.306 # IP address for this node updated to 10.40.0.27
1:M 07 Nov 2019 13:08:09.216 # Cluster state changed: ok
1:M 07 Nov 2019 13:08:10.144 * Replica 10.44.0.14:6379 asks for synchronization
1:M 07 Nov 2019 13:08:10.144 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '27972faeb07fe922f1ab581cac0fe467c85c3efd', my replication IDs are '31944091ef93e3f7c004908e3ff3114fd733ea6a' and '0000000000000000000000000000000000000000')
1:M 07 Nov 2019 13:08:10.144 * Starting BGSAVE for SYNC with target: disk
1:M 07 Nov 2019 13:08:10.144 * Background saving started by pid 1041
1041:C 07 Nov 2019 13:08:10.161 * DB saved on disk
1041:C 07 Nov 2019 13:08:10.161 * RDB: 0 MB of memory used by copy-on-write
1:M 07 Nov 2019 13:08:10.233 * Background saving terminated with success
1:M 07 Nov 2019 13:08:10.243 * Synchronization with replica 10.44.0.14:6379 succeeded

Thank you for the help.

Deboroh88
  • what is your environment? minikube? GKE? – Matt Oct 29 '19 at 11:27
  • It's actually a Kubernetes cluster with 4 nodes, one of which is the master. It's a cluster we set up internally, starting from this guide: https://github.com/hobby-kube/guide. For the storage, though, we use a shared folder over NFS. – Deboroh88 Oct 29 '19 at 14:42
  • Can you add to your question the part of your StatefulSet YAML that is responsible for mounting the NFS volume? And preferably any other information that may be helpful in replicating your issue? – Matt Oct 30 '19 at 10:20
  • ok done. If you need anything else, I'll be glad to post it :) – Deboroh88 Oct 31 '19 at 10:36
  • I need the content of `fix-ip.sh`. Can you add the *redis-cluster* ConfigMap? – Matt Oct 31 '19 at 11:15
  • I replicated your issue on GKE using Filestore as the NFS server and everything works fine; the `nodes.conf` file contains the list of all nodes. Can you try manually creating a file with any content in the `/data` dir? Maybe there is a problem with storing data on your NFS. – Matt Nov 04 '19 at 09:25
  • But I can see other files in that folder, like the appendonly.aof file, which changes over time... in fact, if I try to create and edit a file via vi, it works. – Deboroh88 Nov 04 '19 at 09:50
  • Try deleting the statefulset (`kubectl delete sts redis-cluster`) and pvc (`kubectl delete pvc -l app=redis-cluster`) and then recreate them. Also look into the Redis logs, maybe they show something; maybe Redis has a problem opening the `nodes.conf` file (also change *loglevel* to *debug*). – Matt Nov 04 '19 at 11:49
  • I find it strange that you are updating the conf this way. A good feature of StatefulSets is [POD identity](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-identity). You could use it to set up the config once with pod domain names; it won't matter if pods go up or down. – Ezwig Nov 05 '19 at 14:17
  • Shouldn't all the Redis nodes have a shared nodes.conf? Since you are using a volumeClaimTemplate, you are getting a new volume for each pod. Since it is NFS and likely has the same data for each pod, you can create a normal volume and volume mount instead of the template. – Patrick W Nov 05 '19 at 16:48
  • @HelloWorld I already did that, and the logs show NO error opening the file... in the logs everything looks fine. – Deboroh88 Nov 06 '19 at 15:47
  • My files are very similar to yours, except that my redis.conf states `cluster-config-file nodes.conf` (the /data/ prefix does not appear). I also use fix-ip.sh and the path is the same: `CLUSTER_CONFIG="/data/nodes.conf"`. – jbl Nov 06 '19 at 17:08
  • Can you share the logs from the redis-cluster-0 pod? – P Ekambaram Nov 07 '19 at 12:37
  • @jbl I did the same now, but the file is still empty. – Deboroh88 Nov 07 '19 at 13:17
  • @PEkambaram done, you can now see the logs in the question. – Deboroh88 Nov 07 '19 at 13:17
  • @Deboroh88, can you please also post the security context and serviceAccount of one of your Redis Pods (just to be sure that nothing mutates Kubernetes resources on your cluster before they are actually created)? One more thing: check the file/directory permissions of your NFS folder '/raid5' directly on the NFS server. – Nepomucen Nov 07 '19 at 16:48
  • Did you try with the updated script? – P Ekambaram Nov 07 '19 at 17:59
  • Can you use the fix-ip.sh script from my answer below and confirm? It should work. – P Ekambaram Nov 08 '19 at 07:27
  • @Nepomucen the serviceAccount is the default one, which is an ADMIN (has full access to everything). The permission on the shared disk is "Everyone -> full control". – Deboroh88 Nov 08 '19 at 10:52
  • @Deboroh88, I think we have reached a sort of dead end trying to help you out with your unique case. A lot of users, including me (Kubernetes v1.15.2 GCE, with an in-cluster NFS server), cannot reproduce your case = it works w/o problem. At this point, in the spirit of elimination, I would recommend you try to find the culprit. For me the main suspect is the NFS-based storage. Please replace it with an [in-cluster solution](https://github.com/kubernetes/examples/tree/master/staging/volumes/nfs), and recreate your Redis cluster setup. BTW, you gave too little info about your Kubernetes cluster. – Nepomucen Nov 12 '19 at 10:34

1 Answer


This looks to be an issue with the shell script that is mounted from the ConfigMap. Can you update it as below?

  fix-ip.sh: |
    #!/bin/sh
    CLUSTER_CONFIG="/data/nodes.conf"
    echo "creating nodes"
    if [ -f ${CLUSTER_CONFIG} ]; then
      echo "[ INFO ]File:${CLUSTER_CONFIG} is Found"
    else
      touch $CLUSTER_CONFIG
    fi
    if [ -z "${POD_IP}" ]; then
      echo "Unable to determine Pod IP address!"
      exit 1
    fi
    echo "Updating my IP to ${POD_IP} in ${CLUSTER_CONFIG}"
    sed -i.bak -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${POD_IP}/" ${CLUSTER_CONFIG}
    echo "done"
    exec "$@"

I just deployed with the updated script and it worked. See the output below:

master $ kubectl get po
NAME              READY   STATUS    RESTARTS   AGE
redis-cluster-0   1/1     Running   0          83s
redis-cluster-1   1/1     Running   0          54s
redis-cluster-2   1/1     Running   0          45s
redis-cluster-3   1/1     Running   0          38s
redis-cluster-4   1/1     Running   0          31s
redis-cluster-5   1/1     Running   0          25s
master $ kubectl exec -it redis-cluster-0 -- redis-cli --cluster create --cluster-replicas 1 $(kubectl get pods -l app=redis-cluster -o jsonpath='{range.items[*]}{.status.podIP}:6379 ')
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.40.0.4:6379 to 10.40.0.1:6379
Adding replica 10.40.0.5:6379 to 10.40.0.2:6379
Adding replica 10.40.0.6:6379 to 10.40.0.3:6379
M: 9984141f922bed94bfa3532ea5cce43682fa524c 10.40.0.1:6379
   slots:[0-5460] (5461 slots) master
M: 76ebee0dd19692c2b6d95f0a492d002cef1c6c17 10.40.0.2:6379
   slots:[5461-10922] (5462 slots) master
M: 045b27c73069bff9ca9a4a1a3a2454e9ff640d1a 10.40.0.3:6379
   slots:[10923-16383] (5461 slots) master
S: 1bc8d1b8e2d05b870b902ccdf597c3eece7705df 10.40.0.4:6379
   replicates 9984141f922bed94bfa3532ea5cce43682fa524c
S: 5b2b019ba8401d3a8c93a8133db0766b99aac850 10.40.0.5:6379
   replicates 76ebee0dd19692c2b6d95f0a492d002cef1c6c17
S: d4b91700b2bb1a3f7327395c58b32bb4d3521887 10.40.0.6:6379
   replicates 045b27c73069bff9ca9a4a1a3a2454e9ff640d1a
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 10.40.0.1:6379)
M: 9984141f922bed94bfa3532ea5cce43682fa524c 10.40.0.1:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 045b27c73069bff9ca9a4a1a3a2454e9ff640d1a 10.40.0.3:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 1bc8d1b8e2d05b870b902ccdf597c3eece7705df 10.40.0.4:6379
   slots: (0 slots) slave
   replicates 9984141f922bed94bfa3532ea5cce43682fa524c
S: d4b91700b2bb1a3f7327395c58b32bb4d3521887 10.40.0.6:6379
   slots: (0 slots) slave
   replicates 045b27c73069bff9ca9a4a1a3a2454e9ff640d1a
M: 76ebee0dd19692c2b6d95f0a492d002cef1c6c17 10.40.0.2:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 5b2b019ba8401d3a8c93a8133db0766b99aac850 10.40.0.5:6379
   slots: (0 slots) slave
   replicates 76ebee0dd19692c2b6d95f0a492d002cef1c6c17
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

master $ kubectl exec -it redis-cluster-0 -- redis-cli cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:61
cluster_stats_messages_pong_sent:76
cluster_stats_messages_sent:137
cluster_stats_messages_ping_received:71
cluster_stats_messages_pong_received:61
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:137

master $ for x in $(seq 0 5); do echo "redis-cluster-$x"; kubectl exec redis-cluster-$x -- redis-cli role;echo; done
redis-cluster-0
master
588
10.40.0.4
6379
588

redis-cluster-1
master
602
10.40.0.5
6379
602

redis-cluster-2
master
588
10.40.0.6
6379
588

redis-cluster-3
slave
10.40.0.1
6379
connected
602

redis-cluster-4
slave
10.40.0.2
6379
connected
602

redis-cluster-5
slave
10.40.0.3
6379
connected
588
P Ekambaram
  • Unfortunately, it still doesn't work. I mean, if I run the commands that you ran here, I see the same log, because the cluster gets created normally, but the file is still empty. I guess it's a problem with writing the file to disk at this point... which is weird, because in that folder I can still see that the "appendonly.aof" and "dump.rdb" files have data in them. – Deboroh88 Nov 08 '19 at 10:49
  • You might be right, the container might not have permission to write to the mount. In that case, write the file to a different location in the container and use that location for starting up the Redis server. – P Ekambaram Nov 08 '19 at 11:11
  • OK, that seems like a good test, but to make a node able to re-MEET the cluster once it gets restarted, I need the file to be permanently stored, right? – Deboroh88 Nov 08 '19 at 11:16
  • Give it a try and check – P Ekambaram Nov 08 '19 at 11:31
  • Another way is to use an init container to update permissions on the data folder to 777. Then the redis container should be able to write. You need to share the data folder between the init container and the redis container (a minimal sketch is at the end of this thread). I tried this a few years ago when I ran into a similar issue. – P Ekambaram Nov 08 '19 at 11:34
  • Indeed, putting the file in another folder has worked. What do you mean by "init container"? Should I modify the "fix-ip" script to change permissions on the folder? Would that work, if I'm inside the container itself? – Deboroh88 Nov 08 '19 at 11:39
  • Good that it worked with a different location. If you want to try another way, try an init container to update the permissions on the data volume. Refer to this link for init containers: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ – P Ekambaram Nov 08 '19 at 12:02
  • Mmm, that doesn't work either... there is something strange here, though maybe it's normal: if I bring up the redis-cluster with the nodes.conf file in another folder, it gets written... and in the /data folder I can manually write all the files I need (from within the container). But when I start the container with nodes.conf inside the /data folder, even if I try to edit the file manually, the data inside keeps vanishing. I guess it's the Redis process that erases the data...? – Deboroh88 Nov 08 '19 at 15:34
  • But when I tested yesterday I didn't face any problem. The Redis cluster came up fine. – P Ekambaram Nov 08 '19 at 16:44
  • It's something to do with your NFS volume. Instead of NFS, test with emptyDir and see if it works. I used emptyDir and it worked fine. – P Ekambaram Nov 08 '19 at 16:47
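
For reference, a minimal sketch of the init-container approach mentioned above, added to the StatefulSet pod spec next to the redis container (the container name and the busybox tag are assumptions, not something from the original setup):

      initContainers:
      - name: fix-data-permissions           # hypothetical name
        image: busybox:1.31                  # any small image with chmod would do
        command: ["sh", "-c", "chmod 777 /data"]
        volumeMounts:
        - name: data                         # the same volumeClaimTemplates volume the redis container mounts
          mountPath: /data

Because the init container mounts the same "data" volume, the permission change is applied to the PVC before redis-server starts.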