I've set up a GitHub workflow that runs my Docker Compose stack headless (a Node container and a Postgres container) and then runs my Jest tests. About 70% of the time the run succeeds: database connections have never been an issue, all secrets and configuration work, and all tests pass. The rest of the time the step fails with exit code 137 halfway through the tests, yet the Docker container's own logs show the tests running to completion, 100% successful, with a success exit code.
docker-compose.actions.yaml:
name: ***
networks:
  ***-network:
    external: false
services:
  auth-postgres:
    container_name: auth-postgres
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 1G
        reservations:
          cpus: "1"
          memory: 512M
    env_file:
      - auth/.env.docker
    image: postgres:15-alpine
    networks:
      - ***-network
    ports:
      - 5432:5432
    healthcheck:
      test: sh -c pg_isready -d "$${POSTGRES_DB}" -U "$${POSTGRES_USER}"
      interval: 10s
      timeout: 60s
      retries: 5
  auth:
    container_name: auth
    deploy:
      resources:
        limits:
          cpus: "6"
          memory: 4G
        reservations:
          cpus: "2"
          memory: 2G
    env_file:
      - auth/.env.docker
    build:
      context: auth
      dockerfile: Dockerfile
    entrypoint:
      - ./tests-entrypoint.sh
    depends_on:
      auth-postgres:
        condition: service_healthy
    networks:
      - ***-network
    ports:
      - 1337:1337
    working_dir: /home/***/auth
    volumes:
      - /home/***/auth
Dockerfile:
FROM node:18
WORKDIR /home/***/auth
COPY *.sh .
COPY *.js .
COPY *.json .
COPY src/ ./src
COPY prisma/ ./prisma
RUN chmod +x ./tests-entrypoint.sh
RUN chmod +x ./entrypoint.sh
RUN npm ci
tests.yaml:
name: Unit Tests
on: [pull_request]
jobs:
  unit-tests:
    runs-on: self-hosted
    env:
      # removed
    steps:
      - uses: actions/checkout@v3
      - name: Build docker compose
        run: printenv > auth/.env.docker && make dockerActions # dockerActions = docker compose up -d
      - name: Run tests
        run: docker exec $(docker ps --latest --quiet) /bin/bash -- ./tests-entrypoint.sh
      - name: Inspection
        if: always()
        run: docker inspect auth && docker inspect auth-postgres
      - name: Logs
        if: always()
        run: docker logs auth && docker logs auth-postgres
tests-entrypoint.sh:
#!/bin/bash
# Run the Jest suite and map its result to an explicit exit code.
if npm test; then
    echo "Tests job success."
    exit 0
else
    echo "Failure. " >&2
    exit 1
fi
[Screenshot: termination and the error inside GitHub Actions]
[Screenshot: container's log showing exit 0]
[Screenshot: self-hosted runner's logs showing exit 137]
Is there a limit on the resources I'm allowed to use with a self-hosted runner? I run these tests in Docker locally during development and have never seen this error, even without specifying resource limits (usage doesn't come anywhere near the limits either). What confuses me is that the same thing happened on the free-tier hosted runner: it would work some of the time, and when it failed the container would still make it to the end with a success code. Now it's running on my own machine, and from watching resource usage it is not exhausting the machine's 12 cores and 32 GB of RAM.
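For context on the exit code: as far as I understand, a shell reports a process killed by signal N as exit status 128 + N, so 137 would mean the process received signal 9 (SIGKILL) rather than failing on its own. Below is a minimal sketch of that arithmetic. `describe_exit_code` is a hypothetical helper I wrote for illustration; it is not part of Docker or GitHub Actions, and it only decodes the number. It does not tell me who sent the signal.

```shell
#!/bin/bash
# describe_exit_code is a hypothetical helper (an assumption for illustration).
# Convention: a process terminated by signal N is reported as exit status 128 + N.
describe_exit_code() {
    local code="$1"
    if [ "$code" -gt 128 ] && [ "$code" -le 192 ]; then
        local sig=$((code - 128))
        # `kill -l N` prints the name of signal number N (e.g. KILL for 9).
        echo "exit $code: terminated by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exit $code: process exited on its own"
    fi
}

describe_exit_code 137
describe_exit_code 0
```

So the 137 in the runner's log and the 0 in the container's log may be describing two different processes: one that was killed and one that finished normally.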