
I have the following kubernetes yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: postgres
  template:
    metadata:
      labels:
        component: postgres
    spec:
      securityContext:
        runAsUser: 999
        runAsGroup: 999
        fsGroup: 999
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-persistent-volume-claim
      containers:
        - name: postgres
          image: prikshet/postgres
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
              subPath: postgres
          imagePullPolicy: Always

But the pod is giving the following logs:

2021-08-11 06:06:15.749 GMT [8] LOG:  skipping missing configuration file "/var/lib/postgresql/data/postgresql.auto.conf"
2021-08-11 06:06:15.750 UTC [8] FATAL:  data directory "/var/lib/postgresql/data" has wrong ownership
2021-08-11 06:06:15.750 UTC [8] HINT:  The server must be started by the user that owns the data directory.

Yet I clearly have fsGroup, runAsUser and runAsGroup specified. What might be causing this error?

zendevil.eth
  • Configuration looks OK, starting your image locally, default UID matches your fsGroup ... next, you could try to add some `command: [ /bin/sh ]` and `args: [ '-c', 'sleep 86400' ]`. Enter your container once it's started and check permissions of your /var/lib/postgresql/data directory, check your uid, /etc/passwd. Try to start postgres manually if you don't see anything suspicious, ... Also: check without attaching your PVC: can you reproduce? If not, what kind of PVC are we talking about? Block devices? NFS shares? ... – SYN Aug 11 '21 at 06:32 (a sketch of this debug override follows after the comments)
  • Without attaching pvc it works – zendevil.eth Aug 11 '21 at 07:03
  • what about permissions on the volume once mounted, on a debug pod / substituting your postgres container entrypoint? Is it consistent with your configuration? Can you write a file? What kind of PVC are you using? – SYN Aug 11 '21 at 07:46
  • Does this answer your question? [How to change the owner of a mounted volume in kubernetes?](https://stackoverflow.com/questions/68736064/how-to-change-the-owner-of-a-mounted-volume-in-kubernetes) – Mikołaj Głodziak Aug 11 '21 at 10:57
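
For reference, a minimal sketch of that debug override applied to the containers section of the Deployment above; the sleep duration comes from the comment, everything else mirrors the original manifest:

      containers:
        - name: postgres
          image: prikshet/postgres
          # temporarily replace the postgres entrypoint so the pod stays up
          # and the mounted data directory can be inspected by hand
          command: ["/bin/sh"]
          args: ["-c", "sleep 86400"]
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
              subPath: postgres

Once the pod is Running, kubectl exec -it <pod-name> -- sh into it and compare the output of id with ls -ln /var/lib/postgresql/data to see which numeric UID/GID actually owns the mounted directory.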

2 Answers


The problem here is that you might have many available PersistentVolumes backed by local storage, and the postgres pod is bound to one of them more or less at random, but you only changed the ownership of one PV to the correct values. So if your pod is bound to any of the other PVs, it will crash.

In particular, this can happen on the next redeploy even if the first deploy ran fine. That's because Kubernetes has a bug where a deleted PersistentVolumeClaim leaves its PersistentVolume in "Released" status forever (at least for the local storage class). So your pod will never be bound to the "fixed" mount point again; it will either be stuck in Pending or bound to another local PV with the wrong permissions on its mount point.

You need to identify the PersistentVolume the pod is actually using (with something like kubectl get pv) and then, at their mount points, change the owner and group of all local, available volumes to 999:999.

Some Postgres instances installed from Bitnami's Helm charts also raise this error, but there the permissions have to be changed to 1001:1001 instead, on all available mount points. Bound mount points do not have to be changed.
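
As a rough sketch, assuming the PVs are plain local volumes (the PV name, capacity, path and node name below are illustrative, not taken from the question): the directory in spec.local.path is what has to be owned by 999:999 on the node, for every PV the claim could still bind to.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/pg-0   # on that node: chown -R 999:999 /mnt/disks/pg-0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1

kubectl get pv shows which PV the claim is bound to and which ones are still Available or stuck in Released.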

Zenul_Abidin

I would use an initContainer: to chown/chmod the data directory, and this answer might help you: init Container and security context
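
A minimal sketch of that approach against the Deployment from the question (it uses chown rather than chmod, since the error is about ownership; the busybox image and the initContainer name are illustrative):

      initContainers:
        - name: fix-data-dir-ownership
          image: busybox
          # run as root before postgres starts and hand the mounted volume
          # over to the UID/GID postgres runs as (999:999)
          securityContext:
            runAsUser: 0
          command: ["sh", "-c", "chown -R 999:999 /var/lib/postgresql/data"]
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
              subPath: postgres

Init containers run to completion before the postgres container starts, so by the time postgres checks the data directory it is already owned by 999:999.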

Chandra Sekar