
How to migrate VM data from a disk to a Kubernetes cluster?

I have a VM with three disks attached to it, each holding data that needs to be migrated to a Kubernetes cluster and attached to database services running as StatefulSets.

This link https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/preexisting-pd does show a way, but I don't know how to use or implement it with StatefulSets so that one particular database resource (like Postgres) can use the same PV (created from one of those GCE persistent disks) and create multiple PVCs for new replicas.

Is the scenario I'm describing achievable? If yes, how do I implement it?

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 6.8.12
  http:
    tls:
      selfSignedCertificate:
        disabled: true
  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
      xpack.security.enabled: false
      xpack.security.authc:
        anonymous:
          username: anonymous
          roles: superuser
          authz_exception: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: managed-premium


devops-admin
  • You meant you want to run multiple Postgres instances (StatefulSet) and all instances share the **same** volume which holds the database files? – gohm'c Mar 06 '22 at 05:21
  • @gohm'c if possible, yes. In general, I want to know how to migrate data on a disk mounted to a VM and use the same data (through cloning or something) in Kubernetes StatefulSet YAML files as a PVC, in HA mode, for data services like Postgres and Elasticsearch. – devops-admin Mar 08 '22 at 09:46
  • Copying the data is trivial; you can copy it over with NFS, mount the existing volume, etc. But you first need to figure out, with a StatefulSet, how you organize your pod(s) to form an HA cluster in the first place. – gohm'c Mar 08 '22 at 10:01
  • @gohm'c. True. I have added my es statefulset. Will that work? – devops-admin Mar 08 '22 at 10:36
  • Are you using GKE or kubernetes (k8s)? Have you created persistent volumes and statefulset as given in the [document](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/preexisting-pd#pv_to_statefulset) and have you deployed your statefulset application? – Fariya Rahmat Mar 08 '22 at 13:10
  • @devops-admin Has your issue been resolved or are you still facing it ? – Fariya Rahmat Mar 15 '22 at 12:26

1 Answer


A StatefulSet is generally preferable to manually creating a PersistentVolumeClaim. For database workloads that can't be replicated, you can't set replicas: greater than 1 in either case, but the PVC management is still valuable. You usually can't have multiple databases pointing at the same physical storage, containers or otherwise, and most types of volumes can't be shared across pods.
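For getting the data from the existing GCE disks into the cluster in the first place, the pre-existing-disk approach from the document linked in the question can be combined with a StatefulSet: pre-create a PersistentVolume that points at the existing disk, with a claimRef matching the PVC name the StatefulSet will generate (volumeClaimTemplate-name-StatefulSet-name-ordinal). A minimal sketch, assuming the GKE PD CSI driver; the names postgres-data-pv, data-postgres-0, the storage class, and the project/zone/disk path are all placeholders:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-data-pv
spec:
  storageClassName: standard-rwo     # must match the class requested by the StatefulSet's volumeClaimTemplates
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteOnce                    # a GCE PD can be mounted read-write by only one node
  claimRef:                          # pre-bind this volume to the PVC the StatefulSet will create for pod 0
    namespace: default
    name: data-postgres-0
  csi:
    driver: pd.csi.storage.gke.io
    # the existing disk that already holds the migrated data (placeholder path)
    volumeHandle: projects/PROJECT_ID/zones/ZONE/disks/EXISTING_DISK_NAME
    fsType: ext4

Only pod 0 would see the migrated data; the other replicas get freshly provisioned, empty volumes, which is another reason replication has to happen at the database level rather than by sharing the disk.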

Postgres doesn't support multiple instances sharing the same underlying volume; doing so will cause data corruption, so setting things up that way would be a mistake. The more common approach is to use the volumeClaimTemplates mechanism so that each pod gets its own distinct storage, and then configure Postgres streaming replication yourself. A sketch of that pattern follows.
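A minimal sketch of that pattern, assuming a plain postgres image and placeholder names (the replication between replicas still has to be configured separately, as described in the document below):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:14
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD    # placeholder; use a Secret in practice
          value: changeme
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:              # each replica gets its own PVC: data-postgres-0, -1, -2
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: standard-rwo
      resources:
        requests:
          storage: 50Gi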

Refer to the Kubernetes document Run a replicated stateful application and the related Stack Overflow post for more information.

Fariya Rahmat