33

I am trying to set up an AWS ECS Fargate deployment. I can run containers without a container health check, but I also want container health checks to run. I have tried every approach I could find, with no luck.

Container Health Check Command

I tried the AWS-recommended commands below to verify container health. They come from this page:

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#container_definition_healthcheck

  1. [ "CMD-SHELL", "curl -f http://localhost/ || exit 1" ]
  2. [ "CMD-SHELL", "curl -f 127.0.0.1 || exit 1" ]

I tried both of the commands above, but neither works as expected. Please help me find a valid container health check command.

Below is my Dockerfile:

    FROM centos:latest
    RUN yum update -y
    RUN yum install httpd httpd-tools curl -y
    EXPOSE 80
    CMD ["/usr/sbin/httpd", "-D", "FOREGROUND"]
    HEALTHCHECK CMD curl --fail http://localhost:80/ || exit 1
    FROM microsoft/dotnet:2.1-aspnetcore-runtime AS base
    WORKDIR /app
    EXPOSE 80
    FROM microsoft/dotnet:2.1-sdk AS build
    WORKDIR /DockerDemoApi
    COPY ./DockerDemoApi.csproj DockerDemoApi/
    RUN dotnet restore DockerDemoApi/DockerDemoApi.csproj
    COPY . .
    WORKDIR /DockerDemoApi
    RUN dotnet build DockerDemoApi.csproj -c Release -o /app
    FROM build AS publish
    RUN dotnet publish DockerDemoApi.csproj -c Release -o /app
    FROM base AS final
    WORKDIR /app
    COPY --from=publish /app .
    ENTRYPOINT ["dotnet", "DockerDemoApi.dll"]

I have installed curl inside my container, and the command works when I run it there. But if I use the same command as the health check in the AWS task definition, it fails.

Task Definition JSON:

    {
     "ipcMode": null,
     "executionRoleArn": "arn:aws:iam::xxxx:role/ecsTaskExecutionRole",
     "containerDefinitions": [{
     "dnsSearchDomains": null,
     "logConfiguration": {
     "logDriver": "awslogs",
     "secretOptions": null,
     "options": {
      "awslogs-group": "/ecs/mall-health-check-task",
      "awslogs-region": "ap-south-1",
      "awslogs-stream-prefix": "ecs"
     }
     },
     "entryPoint": [],
     "portMappings": [
     {
      "hostPort": 80,
      "protocol": "tcp",
      "containerPort": 80
     }
     ],
     "command": [],
     "linuxParameters": null,
     "cpu": 256,
     "environment": [],
     "resourceRequirements": null,
     "ulimits": null,
     "dnsServers": null,
     "mountPoints": [],
     "workingDirectory": null,
     "secrets": null,
     "dockerSecurityOptions": null,
     "memory": null,
     "memoryReservation": 512,
     "volumesFrom": [],
     "stopTimeout": null,
     "image": "xxxx.dkr.ecr.ap-south-
     1.amazonaws.com/autoaml/api/dev/alpine:latest",
     "startTimeout": null,
     "dependsOn": null,
     "disableNetworking": null,
     "interactive": null,
     "healthCheck": null,
     "essential": true,
     "links": [],
     "hostname": null,
     "extraHosts": null,
     "pseudoTerminal": null,
     "user": null,
     "readonlyRootFilesystem": null,
     "dockerLabels": null,
     "systemControls": null,
     "privileged": null,
     "name": "sample-app"
     }
     ],
     "placementConstraints": [],
     "memory": "512",
     "taskRoleArn": "arn:aws:iam::xxxx:role/ecsTaskExecutionRole",
     "compatibilities": [
     "EC2",
     "FARGATE"
     ],
     "taskDefinitionArn": "arn:aws:ecs:ap-south-1:xxx:task-definition/mall- 
     health-check-task:9",
     "family": "mall-health-check-task",
     "requiresAttributes": [{
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "ecs.capability.execution-role-ecr-pull"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "ecs.capability.task-eni"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.ecr-auth"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.task-iam-role"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "ecs.capability.execution-role-awslogs"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
     },
     {
     "targetId": null,
     "targetType": null,
     "value": null,
     "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
     }
     ],
     "pidMode": null,
     "requiresCompatibilities": [
     "FARGATE"
     ],
     "networkMode": "awsvpc",
     "cpu": "256",
     "revision": 9,
     "status": "ACTIVE",
     "proxyConfiguration": null,
     "volumes": []
     }
Arjun
  • Is your health check failing or Fargate throwing an error? – Haran Jun 17 '19 at 02:33
  • Yes. It is showing healthstatus as UNKNOWN always. – Arjun Jun 17 '19 at 06:35
  • I do not see any issue with the command you are passing, unless there is some issue with the container's health. One more question: did you mark your container as 'essential=true'? – Haran Jun 17 '19 at 09:21
  • I didn't add essential=true, @Haran. What will that do? – Arjun Jun 17 '19 at 11:50
  • I added essential=true and it still shows the same status. – Arjun Jun 17 '19 at 12:49
  • If you don't provide the essential flag, AWS treats the container as essential by default. Leaving that aside, how are you defining the task definition: via the console, the CLI, or the API? If you are using the console, try something like CMD-SHELL,curl -f http://localhost:8080/ || exit 1, i.e. without square brackets. – Haran Jun 17 '19 at 13:45
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/195076/discussion-between-arjun-and-haran). – Arjun Jun 17 '19 at 17:26
  • what exactly does your container do? this `curl` command only really works if you are hosting an http environment on your image, what exactly does your image do? – JD D Jun 18 '19 at 02:09
  • this container is a web application @JDD. Please look into my dockerfile for more info – Arjun Jun 18 '19 at 08:01
  • @Arjun can you help me with how you were able to run containers without a health check? I am trying to do the same, but my task stays in the PROVISIONING state and then fails. My container is not a web app; it is Python code listening to SQS. – Gautam Naik Jun 12 '21 at 16:49

6 Answers

19

The documentation says the following:

When registering a task definition in the AWS Management Console, use a comma separated list of commands which will be automatically converted to a string after the task definition is created. An example input for a health check could be:

CMD-SHELL, curl -f http://localhost/ || exit 1

When registering a task definition using the AWS Management Console JSON panel, the AWS CLI, or the APIs, you should enclose the list of commands in brackets. An example input for a health check could be:

[ "CMD-SHELL", "curl -f http://localhost/ || exit 1" ]
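For the JSON panel, CLI, or API route, the list form quoted above can also be generated instead of typed by hand, which avoids quoting and comma mistakes. The following is only a minimal sketch: the helper name `build_health_check` is hypothetical, and the `interval`, `timeout`, and `retries` values are illustrative defaults, not values taken from the question.

```python
import json


def build_health_check(url, interval=30, timeout=5, retries=3):
    """Return a healthCheck object using the JSON (list) command form.

    The helper name and default values are assumptions for illustration.
    """
    return {
        # First element selects the shell form; second is the shell command.
        "command": ["CMD-SHELL", f"curl -f {url} || exit 1"],
        "interval": interval,  # seconds between checks
        "timeout": timeout,    # seconds before a single check is abandoned
        "retries": retries,    # consecutive failures before UNHEALTHY
    }


if __name__ == "__main__":
    # Print a fragment that can be pasted under a container definition.
    print(json.dumps({"healthCheck": build_health_check("http://localhost/")}, indent=2))
```

The printed fragment belongs inside one entry of `containerDefinitions`, next to `name` and `image`.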


Did you verify your health check command? I mean, http://127.0.0.1 is reachable, right? Check that your container returns a success response when you hit http://127.0.0.1 (without a port).

Below is an example task definition. It starts a Tomcat server in a container and checks its health at localhost:8080.

  1. Modify the task definition as needed (for example, the role ARN).
  2. Create an ECS service and map the task definition to it.
  3. Create the configured log group.
  4. Start the ECS service; your task should show as HEALTHY.
{
  "ipcMode": null,
  "executionRoleArn": "arn:aws:iam::accountid:role/taskExecutionRole",
  "containerDefinitions": [
      {
          "dnsSearchDomains": null,
          "logConfiguration": {
              "logDriver": "awslogs",
              "secretOptions": null,
              "options": {
                  "awslogs-group": "/test/test-task",
                  "awslogs-region": "us-east-2",
                  "awslogs-stream-prefix": "test"
              }
          },
          "entryPoint": null,
          "portMappings": [
              {
                  "hostPort": 8080,
                  "protocol": "tcp",
                  "containerPort": 8080
              }
          ],
          "command": null,
          "linuxParameters": null,
          "cpu": 0,
          "environment": [],
          "resourceRequirements": null,
          "ulimits": null,
          "dnsServers": null,
          "mountPoints": [],
          "workingDirectory": null,
          "secrets": null,
          "dockerSecurityOptions": null,
          "memory": null,
          "memoryReservation": null,
          "volumesFrom": [],
          "stopTimeout": null,
          "image": "tomcat",
          "startTimeout": null,
          "dependsOn": null,
          "disableNetworking": false,
          "interactive": null,
          "healthCheck": {
              "retries": 3,
              "command": [
                  "CMD-SHELL",
                  "curl -f http://localhost:8080/ || exit 1"
              ],
              "timeout": 5,
              "interval": 30,
              "startPeriod": null
          },
          "essential": true,
          "links": null,
          "hostname": null,
          "extraHosts": null,
          "pseudoTerminal": null,
          "user": null,
          "readonlyRootFilesystem": null,
          "dockerLabels": null,
          "systemControls": null,
          "privileged": null,
          "name": "tomcat"
      }
  ],
  "memory": "1024",
  "taskRoleArn": "arn:aws:iam::accountid:role/taskExecutionRole",
  "family": "test-task",
  "pidMode": null,
  "requiresCompatibilities": [
      "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "512",
  "proxyConfiguration": null,
  "volumes": []
}
Aleksei Zyrianov
Haran
  • I tried this as well @Haran before posting this question. No change on output. If possible, if you have any working example please share, i will try that – Arjun Jun 17 '19 at 17:23
  • added a sample task definition. – Haran Jun 18 '19 at 01:51
  • I have added my dockerfile. Can you please figure out if any mistake on definition for healthcheck? – Arjun Jun 18 '19 at 05:30
  • Regarding the health check: since you are passing the command in the task definition, it will override the Dockerfile health check. ECS will start the container with something like "docker run -d --health-cmd='curl http://localhost:8080 || exit 1' --health-interval=5s --health-timeout=3s tomcat". Please share your task definition JSON. First verify that your health check works in a local Docker environment, then try it in ECS Fargate. – Haran Jun 18 '19 at 08:30
  • I have added my task definition JSON. Please have a look, @Haran. FYI, I have removed the health check command from my task definition since it was failing. – Arjun Jun 18 '19 at 09:22
  • In some cases, and I don't know why (the container has its /etc/hosts file correctly configured), replacing localhost with 127.0.0.1 fixes the problem. – Garry Dias Dec 11 '19 at 14:49
15

Does the Docker image you are using have curl installed?

Based on your screenshot, it looks like you are using the httpd:2.4 Docker image directly. If so, curl is not included in that image.

You need to build your own Docker image with httpd:2.4 as the base. Below is a sample Dockerfile that adds curl to the image.

Example -

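FROM httpd:2.4
# Install curl so the container health check command can run;
# && stops the build if apt-get update fails.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*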

Then build the image and push it to your Docker Hub account or private Docker repository.

docker build -t my-apache2 .
docker run -dit --name my-running-app -p 80:80 my-apache2

With this image, you should be able to get the health check command working.

https://hub.docker.com/_/httpd

https://github.com/docker-library/httpd/blob/master/2.4/Dockerfile

Imran
  • yes @Imran. I installed curl as part of package. FYI - Please find the below code. `FROM centos:latest RUN yum update -y RUN yum install httpd httpd-tools curl -y EXPOSE 80` Issue still remains same. – Arjun Jun 17 '19 at 17:21
  • @Arjun can you edit the question with full details of your `Dockerfile`. Above code in comment is missing `ENTRYPOINT` details so I am not sure whether you are spinning up the service properly!. – Imran Jun 17 '19 at 20:20
  • I have added my DockerFile. Please let me know – Arjun Jun 18 '19 at 05:31
3

Was facing the same issue and found a solution for my usecase:

Three containers in one task definition, which are

  1. Nginx sidecar
  2. Two NodeJs Applications

Using ecs-params.yml file to declare healthchecks:

version: 1
task_definition:
  task_execution_role: ecsTaskExecutionRole
  ecs_network_mode: awsvpc
  task_size:
    mem_limit: 2GB
    cpu_limit: 1024
  services:
    nginx-sidecar:
      healthcheck:
        test: curl -f http://localhost || exit 0
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s

    <service 2>:
      healthcheck:
        test: curl -f http://localhost:3023 || exit 0
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s

    <service 3>:
      healthcheck:
        test:  ["CMD", "curl", "-f", "http://localhost:3019/health"]
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s

Make sure that curl is available in your Docker image and that you can call it locally too.

My Dockerfile:

FROM node:14.17-alpine

RUN apk add --update curl

You can include either of these commands for healthcheck in ecs-params.yml:

 test: curl -f http://localhost || exit 0

 test:  ["CMD", "curl", "-f", "http://localhost"]

Both are valid in my usecase. Hope this helps since none of the other answers were working for me.
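One caveat about the two forms above: a container health check is judged by the command's exit code, where 0 means success and non-zero means failure. With `|| exit 0`, the shell form reports success even when curl fails, whereas the `["CMD", ...]` form passes curl's own exit code through. The sketch below only illustrates the shell behaviour; `health_exit_code` is a hypothetical helper, `false` and `true` stand in for a failing and a succeeding curl call, and it assumes a POSIX `sh` is available.

```python
import subprocess


def health_exit_code(shell_command):
    """Run a command via `sh -c` and return the exit code a health check would see."""
    return subprocess.run(["sh", "-c", shell_command]).returncode


# `false` stands in for a curl call that failed, `true` for one that succeeded.
print(health_exit_code("false || exit 1"))  # 1: failure is reported
print(health_exit_code("false || exit 0"))  # 0: failure is masked as success
print(health_exit_code("true || exit 1"))   # 0: success is reported
```

If the goal is for an unreachable service to be marked UNHEALTHY, `|| exit 1` is the variant that does so.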

Shaurya Dhadwal
2

I don't know why but changing http://localhost to http://127.0.0.1 (not just 127.0.0.1) fixes the problem.

I followed what was suggested here and it fixed my health check issues.

Ben
Garry Dias
0

From your task definition:

"healthCheck": null,

You need to define it there, not in the Dockerfile.
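A quick way to spot this before registering a revision is to scan the task definition JSON for containers whose `healthCheck` is missing or null. This is a minimal sketch: `containers_without_health_check` is a hypothetical helper, and the JSON is a trimmed-down stand-in for the task definition in the question.

```python
import json


def containers_without_health_check(task_definition):
    """Return the names of containers whose healthCheck is missing or null."""
    return [
        container.get("name", "<unnamed>")
        for container in task_definition.get("containerDefinitions", [])
        if not container.get("healthCheck")
    ]


# Trimmed-down version of the task definition from the question.
task_def = json.loads("""
{
  "family": "mall-health-check-task",
  "containerDefinitions": [
    {"name": "sample-app", "essential": true, "healthCheck": null}
  ]
}
""")

print(containers_without_health_check(task_def))  # ['sample-app']
```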

OJFord
0

I ran into a similar issue and the problem was in the docker image platform itself.

I was using Apple M1 to build a minimal Docker image based on Alpine Linux.

The AWS ELB health checks worked, but the container health check always reported UNKNOWN.

In my case, I solved the issue by building the Docker image for the linux/amd64 platform.

For future reference: docker buildx build --platform=linux/amd64 ...

I wonder if you are facing a similar problem.

user1129769