
How do pods that are controlled by a replication controller and "hidden" behind a service in Kubernetes write/read data? If I have an application that receives images from users that need to be persisted, where do I store them? Because of the service in front, I have no control over which node the data is stored on if I use volumes.

LuckyLuke

3 Answers


I think the "simple" answer to your question is that you will need shared storage under your Kubernetes cluster, so that every pod accesses the same data. Then it wouldn't matter where the pods are running or which pod is actually executing the service.

Maybe another solution would be Flocker; they describe themselves briefly as follows:

Flocker is a data volume manager and multi-host Docker cluster management tool. With it you can control your data using the same tools you use for your stateless applications by harnessing the power of ZFS on Linux.

Anyway, I think the storage question for Kubernetes or any other dockerized infrastructure is very interesting.

It looks like Google App Engine doesn't support sharing a data store between apps by default, as pointed out in this SO question.

joh.scheuer
  • How do I set up a shared storage under Kubernetes? How can multiple pods write/read to the same storage? – LuckyLuke Dec 19 '14 at 18:39
  • I don't know much about Hadoop but should I use something like that? Doesn't Google provide a solution to this? Or is NFS something to look at? – LuckyLuke Dec 19 '14 at 18:44
  • I think you have to decouple these two things: you will have shared storage (Hadoop, Ceph, etc...) and you have Kubernetes running your services. You can easily mount the same volume via NFS, for example. In this case it would be interesting how access control is handled across multiple services. What about something like [Google Cloud Storage](https://cloud.google.com/storage)? You could also use NFS because you can guarantee that every pod accesses the same storage. – joh.scheuer Dec 20 '14 at 20:34
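To make the NFS suggestion from the comments concrete, a pod can mount an NFS share as a volume roughly like this. This is a minimal sketch; the server address, export path, and pod/image names are hypothetical examples, not from the original answer:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: nginx
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html   # every replica sees the same files
  volumes:
  - name: shared-data
    nfs:
      server: 10.0.0.5        # hypothetical NFS server address
      path: /exports/images   # hypothetical export path on that server
```

Because every pod mounts the same export, it doesn't matter which replica the service routes a request to.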

If you are running in Google Compute Engine, you can use a Compute Engine persistent disk (network attached storage) that is bound to the pod, and moves with the pod:

https://kubernetes.io/docs/concepts/storage/volumes/#gcepersistentdisk

We'd like to support other types of network attached storage (iSCSI, NFS, etc) but we haven't had a chance to build them yet. Pull requests welcome! ;)
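As a sketch of what the linked docs describe: assuming a persistent disk named my-data-disk has already been created in the same zone as the cluster (the disk, pod, and mount path names here are hypothetical), a pod can reference it like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data-volume
      mountPath: /data    # disk contents appear here inside the container
  volumes:
  - name: data-volume
    gcePersistentDisk:
      pdName: my-data-disk   # hypothetical disk name; must exist in the same zone
      fsType: ext4
```

If the pod is rescheduled to another node, the disk is detached and reattached there, so the data follows the pod.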

brendan

GKE lets you create disks to store data, and this storage can be attached to multiple pods (read-write by only one pod at a time). Note that your cluster and disk must be in the same zone/region.

gcloud compute disks create disk1 --zone=zone_name

Now you can use this disk to store data from pods. Below is a simple MongoDB ReplicationController YAML file that uses disk1. This might not be the most efficient way, but it's the simplest one that I know.

apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    name: mongo
  name: mongo-controller
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: mongo
    spec:
      containers:
      - image: mongo
        name: mongo
        ports:
        - name: mongo
          containerPort: 27017
          hostPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
      volumes:
      - name: mongo-persistent-storage
        gcePersistentDisk:
          pdName: disk1
          fsType: ext4

disk1 will still exist even if your pod gets deleted or replaced.

shubham_asati