
I am running my backend using Python and Django with uWSGI. We recently migrated it to Kubernetes (GKE), and our pods are consuming a LOT of memory while the rest of the cluster is starving for resources. We think this might be related to the uWSGI configuration.

This is our yaml for the pods:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-pod
  namespace: my-namespace
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 10
      maxUnavailable: 10
  selector:
    matchLabels:
      app: my-pod
  template:
    metadata:
      labels:
        app: my-pod
    spec:
      containers:
      - name: web
        image: my-img:{{VERSION}}
        imagePullPolicy: IfNotPresent
        ports:
          - containerPort: 8000
            protocol: TCP
        command: ["uwsgi", "--http", ":8000", "--wsgi-file", "onyo/wsgi.py", "--workers", "5", "--max-requests", "10", "--master", "--vacuum", "--enable-threads"]
        resources:
          requests:
            memory: "300Mi"
            cpu: 150m
          limits:
            memory: "2Gi"
            cpu: 1
        livenessProbe:
          httpGet:
            httpHeaders:
              - name: Accept
                value: application/json
            path: "/healthcheck"
            port: 8000
          initialDelaySeconds: 15
          timeoutSeconds: 5
          periodSeconds: 30
        readinessProbe:
          httpGet:
            httpHeaders:
              - name: Accept
                value: application/json
            path: "/healthcheck"
            port: 8000
          initialDelaySeconds: 15
          timeoutSeconds: 5
          periodSeconds: 30
        envFrom:
          - configMapRef:
              name: configmap
          - secretRef:
              name: secrets
        volumeMounts:
        - name: service-account-storage-credentials-volume
          mountPath: /credentials
          readOnly: true
      - name: csql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.11
        command: ["/cloud_sql_proxy",
                  "-instances=my-project:region:backend=tcp:1234",
                  "-credential_file=/secrets/credentials.json"]
        ports:
          - containerPort: 1234
            name: sql
        securityContext:
          runAsUser: 2  # non-root user
          allowPrivilegeEscalation: false
        volumeMounts:
          - name: credentials
            mountPath: /secrets/sql
            readOnly: true
      volumes:
        - name: credentials
          secret:
            secretName: credentials
        - name: volume
          secret:
            secretName: production
            items:
            - key: APPLICATION_CREDENTIALS_CONTENT
              path: key.json

We are using the same uWSGI configuration that we had before the migration (when the backend was being executed in a VM).

Is there a best-practice config for running uWSGI in K8s? Or is there something I am doing wrong in this particular config?

edited by Jakub
asked by Felipe
  • I had the same issue with uwsgi, so I moved to gunicorn. Meanwhile, you can check this link: https://stackoverflow.com/questions/25090573/django-memory-leak-possible-causes – Shree Prakash Aug 26 '19 at 16:11
  • Late reply, but just in case, this usually happens when you have debug enabled in your django settings. – zerg . Nov 21 '19 at 17:56

1 Answer

You activated 5 workers in uWSGI, which can mean 5 times the memory usage if your application relies on lazy loading (my advice: load everything at startup and trust the pre-fork model, check this). You could also try reducing the number of workers and raising the number of threads instead.
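
For illustration, "load everything at startup" just means doing the heavy work in the WSGI module itself: without --lazy-apps, uWSGI loads the wsgi-file once in the master process and then forks the workers, so memory allocated at import time is shared copy-on-write. A minimal sketch of what onyo/wsgi.py could look like (the onyo.settings module name is an assumption based on the question's wsgi-file path):

# onyo/wsgi.py -- minimal sketch, loaded once by the uWSGI master before forking
import os

from django.core.wsgi import get_wsgi_application

# "onyo.settings" is assumed from the project layout shown in the question
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "onyo.settings")

# Building the application at import time makes Django load settings, installed
# apps and the URLconf in the master, so the forked workers share those pages
# instead of each loading them lazily on the first request.
application = get_wsgi_application()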

Also, you should drop max-requests: set to 10, it makes uWSGI recycle each worker every 10 requests, which is nonsense in a production environment (doc). If you have trouble with memory leaks, use reload-on-rss instead.

I would do something like this, with maybe fewer or more threads per worker depending on how your app uses them (adjust according to CPU usage/availability per pod in production):

command: ["uwsgi", "--http", ":8000", "--wsgi-file", "onyo/wsgi.py", "--workers", "2", "--threads", "10", "--master", "--vacuum", "--enable-threads"]

PS: as zerg said in a comment, you should of course ensure your app is not running in DEBUG mode, and keep logging output low.
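
A small, illustrative way to keep DEBUG off in production is to drive it from the environment (the DJANGO_DEBUG variable name here is just an example, not from the question):

# settings.py (sketch): DEBUG stays off unless explicitly enabled
import os

# With DEBUG=True Django keeps every executed SQL query in memory
# (connection.queries), which looks a lot like a memory leak under
# long-lived uWSGI workers.
DEBUG = os.environ.get("DJANGO_DEBUG", "false").lower() == "true"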

answered by Vincent J
  • Hi, thank you for the response. I was also experiencing a lot of container restarts, which I think had to do with max-requests, so `reload-on-rss` helped a lot in this case. I also added the `die-on-term` param to make sure that uWSGI stops gracefully on SIGTERM. – Felipe Apr 20 '20 at 11:31
  • My final config looks like this: `['uwsgi', '--http', ':8000', '--module', 'app:app', '--workers', '1', '--threads', '5', '--reload-on-rss', '600', '--master', '--vacuum', '--enable-threads', '--die-on-term']` – Felipe Apr 20 '20 at 11:32
  • Be careful: with only one worker, if it crashes, your pod may freeze for several precious seconds while it restarts. – Vincent J Apr 20 '20 at 13:18
  • 3
    But with two workers, isn't it possible to end up with a zombie worker? If one crashes or enters an invalid state, the other will keep running and answering the requests. With one worker, if it crashes, K8s will kill the pod and restart it. – Felipe Apr 20 '20 at 15:06
  • 1
    normally uwsgi service is supposed to handle the crash restart for you. I've never seen a "zombie" case on uwsgi for my part. However, I'm not sure kubernetes would detect uwsgi zombie process unless you set up liveness probes. Even with 30s periodic liveness probes, that takes a few second to detect and remove your pod. If the process is restarted for reload-on-rss, k8s will not see anything and requests will just queue at your uwsgi master waiting for the single process to restart. Best solution : use both. Two pods with two processes each. – Vincent J Apr 21 '20 at 08:35
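
To illustrate that last suggestion (two pods with two uWSGI processes each), the relevant fragment of the Deployment spec would look roughly like this; it is only a sketch reusing the names and probe values from the question:

spec:
  replicas: 2                  # two pods: one keeps serving while the other restarts
  template:
    spec:
      containers:
      - name: web
        image: my-img:{{VERSION}}
        # two workers per pod; the uWSGI master restarts a crashed worker,
        # and the liveness probe covers the case where uWSGI itself hangs
        command: ["uwsgi", "--http", ":8000", "--wsgi-file", "onyo/wsgi.py",
                  "--workers", "2", "--threads", "10",
                  "--master", "--vacuum", "--enable-threads", "--die-on-term"]
        livenessProbe:
          httpGet:
            path: "/healthcheck"
            port: 8000
          initialDelaySeconds: 15
          periodSeconds: 30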