
We run a Laravel queue worker as a Kubernetes Deployment. The runtime is Alpine 3.10 with PHP 7.3 FPM and Laravel 5.6. Our memory resources are set to requests: 512MB and limits: 1Gi.

We run 8 replicas to consume the incoming messages from SQS. Messages are dispatched to the queue by Kubernetes cron jobs. Each worker is started with:

php /var/www/artisan queue:work ${CHANNEL} -vvv --tries=3 --sleep=3 --timeout=3600 --memory=${MEMORY}

where CHANNEL is the queue name (SQS) and MEMORY is the memory limit passed to the Laravel worker. On average, each pod is processing 170+ messages at any time, and each message calls various third-party APIs.

Problem:

Intermittently, our pods restart with exit code 139:

SIGSEGV, Segmentation fault.

This is impacting our production systems, because the pods restart while a message is still being processed.

James Z

1 Answer

This is a community wiki answer, as it only addresses the issue from the Docker container side. Feel free to expand on it as you wish.

The exit code that you see indicates that the container received SIGSEGV:

SIGSEGV indicates a segmentation fault. This occurs when a program attempts to access a memory location that it’s not allowed to access, or attempts to access a memory location in a way that’s not allowed. From the Docker container standpoint, this either indicates an issue with the application code or sometimes an issue with the base images used by the container.
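To make the link between the number 139 and SIGSEGV explicit: exit codes above 128 conventionally mean "terminated by signal (code - 128)", and 139 - 128 = 11, which is SIGSEGV on Linux. The following is only a minimal sketch of that arithmetic in Python; the function name describe_exit_code is hypothetical and is not part of Docker, Kubernetes, or Laravel.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a container exit code using the 128 + signal-number convention.

    Hypothetical helper for illustration only.
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited on its own with status {code}"


print(describe_exit_code(139))  # prints "killed by signal 11 (SIGSEGV)" on Linux
```

By the same rule, an exit code of 137 would correspond to signal 9 (SIGKILL), so the code alone already tells you which signal ended the process.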

In that case, you should first make sure that you are not running an old Docker version, and then try to test your code inside the container with a debugger. I am not familiar enough with this topic to guide you further, but this SO question might be useful for you.

Wytrzymały Wiktor