1

I have heard that a StatefulSet is suitable for databases. But a StatefulSet creates a different PVC for each pod. If I set replicas=3, I get 3 pods and 3 different PVCs with different data. Database users want one database, not three. So it seems clear that we should not use a StatefulSet in this situation. But then when should we use a StatefulSet?

Pengbo Wu
  • 67
  • 8

3 Answers

3

A StatefulSet does three big things differently from a Deployment:

  1. It creates a new PersistentVolumeClaim for each replica;
  2. It gives the pods sequential names, starting with statefulsetname-0; and
  3. It starts the pods in a specific order (ascending numerically).
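To make the naming concrete for the replicas=3 case from the question, here is a minimal sketch that only models the names Kubernetes would produce; it does not talk to a cluster. The StatefulSet name "db" and the volume claim template name "data" are made up for illustration, and the sketch assumes the default starting ordinal of 0.

```python
def statefulset_objects(name: str, claim_template: str, replicas: int):
    """Return (pod_name, pvc_name) pairs in the order the pods are started."""
    pairs = []
    for ordinal in range(replicas):  # ascending order: 0, 1, 2, ...
        pod = f"{name}-{ordinal}"  # <statefulset name>-<ordinal>
        pvc = f"{claim_template}-{pod}"  # <claim template name>-<pod name>
        pairs.append((pod, pvc))
    return pairs


# "db" and "data" are hypothetical names chosen for this example.
for pod, pvc in statefulset_objects("db", "data", 3):
    print(pod, "->", pvc)
```

Each pod keeps the same name and the same claim across restarts, which is what lets a replicated database tell its members apart.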

This is useful when the database itself knows how to replicate data between different copies of itself. In Elasticsearch, for example, indexes are broken up into shards. There are by default two copies of each shard. If you have five Pods running Elasticsearch, each one will have a different fraction of the data, but internally the database system knows how to route a request to the specific server that has the datum in question.

I'd recommend using a StatefulSet in preference to manually creating a PersistentVolumeClaim. For database workloads that can't be replicated, you can't set replicas: greater than 1 in either case, but the PVC management is valuable. You usually can't have multiple databases pointing at the same physical storage, containers or otherwise, and most types of Volumes can't be shared across Pods.
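As a rough sketch of the single-replica case, the following builds a StatefulSet manifest as a plain Python dict and prints it as JSON. The field names are the standard apps/v1 StatefulSet fields; the object name, image, mount path, and storage size are placeholders invented for this example, not recommendations.

```python
import json


def single_replica_statefulset(name: str, image: str, storage: str) -> dict:
    """Build a one-replica StatefulSet whose PVC is created from a template."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # name of the governing (headless) Service
            "replicas": 1,  # the database cannot replicate, so keep one copy
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Placeholder mount path; use your database's data dir.
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/data"}
                        ],
                    }],
                },
            },
            # The PVC is created for you from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


# Hypothetical values for illustration only.
manifest = single_replica_statefulset("db", "example/database:1.0", "10Gi")
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output could be saved to a file and applied, once the placeholders are replaced with real values and a matching Service exists.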

David Maze
  • 130,717
  • 29
  • 175
  • 215
  • Thank you, now I get it. So the main reason is that the database itself knows how to replicate data between different copies. Thank you – Pengbo Wu Jun 29 '21 at 05:57
2

We can deploy a database to Kubernetes as a stateful application. Usually, when we deploy pods they have their own storage, but that storage is ephemeral: if the container is killed, its storage is gone with it.

So, Kubernetes has an object to tackle that scenario: when we want our data to persist, we attach a persistent volume claim to the pod. That way, if our container is killed, our data stays in the cluster, and the new pod can access it.

Some limitations of using StatefulSet are:

1. The storage for each pod must be provisioned by a persistent volume provisioner based on the requested storage class, or pre-provisioned by an administrator.

2. Deleting or scaling down a StatefulSet will not delete the volumes associated with it. This keeps the data safe.

3. StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You have to create this Service yourself.

4. A StatefulSet doesn’t provide any guarantee about how its pods are terminated when the StatefulSet itself is deleted. To get ordered, graceful termination of the pods, scale the StatefulSet down to 0 replicas before deleting it.
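To illustrate the network identity that the Headless Service provides, here is a small sketch that only computes the per-pod DNS names; it does not query a cluster. The names "db" and "default" are made up for the example, and "cluster.local" is assumed as the cluster domain, which is the usual default but can differ per cluster.

```python
def pod_dns_names(statefulset: str, service: str, namespace: str,
                  replicas: int, cluster_domain: str = "cluster.local"):
    """Return the stable DNS name of each pod behind a headless Service."""
    return [
        # <pod name>.<service name>.<namespace>.svc.<cluster domain>
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical StatefulSet "db" governed by a headless Service also named "db".
for name in pod_dns_names("db", "db", "default", 3):
    print(name)
```

Because these names stay the same when a pod is rescheduled, database members can be configured to find each other by name rather than by IP address.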

Goli Nikitha
  • 858
  • 3
  • 9
1

A StatefulSet is useful for running applications that store state.

A database run as a StatefulSet has multiple replicas, each a pod with its own PVC. The StatefulSet does not copy data between them; the database's own replication keeps the data in sync across the pods and their PVCs.

So a StatefulSet with multiple replicas is a good option for getting an HA database.

Whether this works depends on which database you want to use and whether it supports replication, clustering, etc.

Here is a MySQL example with replication details: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/
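One common pattern for replicated databases on a StatefulSet is to derive each member's role from the ordinal in its hostname, which inside a StatefulSet pod equals the pod name. The sketch below assumes, purely for illustration, that ordinal 0 is the primary and every other pod is a replica; the function names and the role labels are hypothetical, and a real setup would feed the result into the database's own replication configuration.

```python
import re


def ordinal_from_hostname(hostname: str) -> int:
    """Extract the trailing ordinal from a StatefulSet pod hostname."""
    match = re.search(r"-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"not a StatefulSet-style hostname: {hostname!r}")
    return int(match.group(1))


def role_for(hostname: str) -> str:
    """Assumed convention for this sketch: pod 0 is primary, the rest replicas."""
    return "primary" if ordinal_from_hostname(hostname) == 0 else "replica"


# Hypothetical pod names of a 3-replica StatefulSet called "mysql".
for host in ["mysql-0", "mysql-1", "mysql-2"]:
    print(host, role_for(host))
```

This only works because a StatefulSet gives pods stable, sequential names; pods of a Deployment get random suffixes, so no such convention is possible there.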

Harsh Manvar
  • 27,020
  • 6
  • 48
  • 102