10

I'm trying to deploy an ECS service that uses only a UDP port. Support has been added for UDP load balancing via Network Load Balancers, so I've deployed my service allowing dynamic host port assignment for my tasks and set up the NLB with an appropriate listener and target group.

The problem I'm running into is that healthchecks are apparently mandatory for the NLB, and must be TCP based. For the healthcheck port, you can leave the default "target" port (which works fine for dynamic host port assignment) or you can specify a port. What I can't do is have a different port exposed for TCP than my load balancer target UDP port. I could have my container listen to both UDP for production and TCP for healthchecks on the same port, but the task definition seems to disallow that even though Docker supports it fine.
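To make the "same port for UDP and TCP" idea concrete, here is a minimal sketch of a single process that serves UDP traffic and accepts TCP connections on the same port number. The function name `serve_udp_and_tcp` and the echo behaviour are assumptions made for illustration only, not the actual service:

```python
import socket
import threading

def serve_udp_and_tcp(host="127.0.0.1", port=0):
    """Bind a TCP and a UDP socket to the same port number and serve both.

    Hypothetical helper: the UDP side echoes datagrams (stand-in for the real
    service), the TCP side only accepts and closes connections (enough for a
    TCP connect-style health check). Returns (port, stop_event).
    """
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    tcp.bind((host, port))          # port=0 lets the OS pick a free port
    tcp.listen()
    bound_port = tcp.getsockname()[1]

    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.bind((host, bound_port))    # same number, different protocol

    stop = threading.Event()
    tcp.settimeout(0.2)
    udp.settimeout(0.2)

    def tcp_loop():
        while not stop.is_set():
            try:
                conn, _ = tcp.accept()
            except socket.timeout:
                continue
            conn.close()            # a successful connect is the "healthy" signal
        tcp.close()

    def udp_loop():
        while not stop.is_set():
            try:
                data, addr = udp.recvfrom(2048)
            except socket.timeout:
                continue
            udp.sendto(data, addr)  # echo back, placeholder for real handling
        udp.close()

    threading.Thread(target=tcp_loop, daemon=True).start()
    threading.Thread(target=udp_loop, daemon=True).start()
    return bound_port, stop
```

The process side is straightforward; the obstacle described here is expressing the matching port mapping in the task definition.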

This would seem to make UDP NLBs useless for ECS services, unless there's something I'm missing? The only alternative I've come up with is to use statically configured host ports so I can expose a second port for TCP on a controlled host port and target that for the NLB healthcheck. The problem with that is we've now lost all of the scalability benefit of ECS by not being able to run more than one task on an instance.

tdimmig

4 Answers

2

It looks like there is some development on the above issue.

https://github.com/GetSimpl/cloudlift/pull/43

https://github.com/aws/containers-roadmap/issues/850

sadath pasha
1

What you can do is set up a sidecar container alongside your UDP container that supplies the TCP endpoint for health checks.

Here is a truncated example of the ECS Task Definition for the service that is running in our NLB target group:

{
    "containerDefinitions": [
        {
            "image": "[your-udp-image]",
            "essential": true,
            "portMappings": [
                {
                    "containerPort": 5008,
                    "protocol":"udp"
                }
            ]
        },
        {
            "image": "[your-tcp-health-check-image]",
            "essential": true,
            "portMappings": [
                {
                    "containerPort": 5006,
                    "protocol":"tcp"
                }
            ],
            "healthCheck": {
                "command": [ "CMD-SHELL", "curl -f http://localhost:5006 || exit 1" ],
                "interval": 10,
                "timeout": 5,
                "retries": 3,
                "startPeriod": 120
            }
        }
    ]
}

Then your target group's health check settings can just point to the port of your health check container (plus a path, if you use an HTTP health check rather than a plain TCP one).
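As a rough sketch of what the `[your-tcp-health-check-image]` container could run, here is a minimal health endpoint using only the Python standard library. The names `HealthHandler` and `make_health_server` are made up for this example, and port 5006 is taken from the task definition above. It always reports healthy and does not inspect the UDP container:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    """Answer every GET with 200 so the target group sees a healthy target."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging; health probes arrive every few seconds.
        pass

def make_health_server(port=5006, host="0.0.0.0"):
    """Create (but do not start) the health check server."""
    return HTTPServer((host, port), HealthHandler)
```

Calling `make_health_server().serve_forever()` as the container's entry point would keep it listening. A plain TCP health check only needs the connection to succeed, while an HTTP health check would also see the 200 response.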

Matt Fiocca
1

Following on from @Matt Fiocca's answer, here is a sidecar container definition I use to provide a TCP health check endpoint for my UDP ECS service. It's a tiny 1MB web server that always sends an HTTP 200 for every request to port 8080.

    {
        "name": "healthcheck",
        "image": "busybox:latest",
        "essential": true,
        "portMappings": [
            {
                "containerPort": 8080,
                "hostPort": 8080,
                "protocol": "tcp"
            }
        ],
        "entryPoint": ["sh", "-c"],
        "command": [
            "while true; do { echo -e 'HTTP/1.1 200 OK\r\n'; echo 'ok'; } | nc -l -p 8080; done"
        ]
    }
Justin Lewis
  • Can you please explain how the sidecar container solves the issue? From my understanding, I will still need to map the UDP port as the traffic port in the task definition, and I will still need to map an additional TCP port for the health check, regardless of the fact that the listener will be in a separate container. It also looks like this makes mapping these ports dynamically impossible. Thanks! – Eduard Grinberg Oct 24 '22 at 07:03
  • The healthcheck is mandatory - if you don't have one, the NLB will never register any of your targets as healthy. The healthcheck must be TCP based - so this sidecar container lets your targets become healthy from the POV of the NLB by "dummying" a healthy response. If you wanted, you could implement a more complex healthcheck sidecar which actually talked to your main container. – Justin Lewis Oct 26 '22 at 10:18
  • But it doesn't solve the problem stated in the question: "The problem with that is we've now lost all of the scalability benefit of ECS by not being able to run more than one task on an instance." You still have to map a static port, which makes running multiple instances of the task on a single EC2 host impossible. – Eduard Grinberg Oct 26 '22 at 13:03
  • Ah yes, good point. I’m using Fargate, so I don’t have that limitation. It seems I didn’t read the question closely enough. I don’t have a valid answer for the EC2 launch type. – Justin Lewis Oct 27 '22 at 21:45
0

For now, there is no way to host multiple instances of a UDP service ECS task on a single EC2 host (I confirmed this with the AWS Support team).

The reason is:

  1. You have to define a health check in the NLB target group (TG), and it can only be TCP or HTTP(S), not UDP.
  2. So you need to map both a TCP and a UDP port in the task definition: UDP for traffic and TCP for the health check.
  3. In order to host multiple instances, you have to make the port mappings dynamic (hostPort: 0) and set the health check port in the TG to "traffic-port".
  4. ECS can assign different host ports to UDP and TCP in this case, which causes the health check to always fail, so the service never stabilizes.
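The four steps above can be sketched as a small model. This is an illustration of the reasoning only, not ECS code: the helper names are invented, the container ports 5008 and 5006 are borrowed from the sidecar answer, and the host port numbers in the usage note are hypothetical.

```python
def dynamic_port_mappings(udp_container_port=5008, tcp_container_port=5006):
    """Steps 2 and 3: one UDP mapping for traffic, one TCP mapping for the
    health check, both with hostPort 0 so the host port is chosen dynamically."""
    return [
        {"containerPort": udp_container_port, "hostPort": 0, "protocol": "udp"},
        {"containerPort": tcp_container_port, "hostPort": 0, "protocol": "tcp"},
    ]

def traffic_port_check_passes(udp_host_port, tcp_host_port):
    """Step 4: with the health check port set to "traffic-port", the NLB probes
    TCP on the host port registered for UDP traffic. The probe only reaches the
    TCP listener if both mappings ended up on the same host port number."""
    return udp_host_port == tcp_host_port
```

For example, if the two mappings were given host ports 32768 (UDP) and 32769 (TCP), `traffic_port_check_passes(32768, 32769)` is `False`: the TCP probe goes to 32768, where nothing is listening on TCP.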

Indeed, the feature request (https://github.com/aws/containers-roadmap/issues/850) would seem to solve the issue, since you'd be able to map the same dynamic port for UDP traffic and the TCP health check. But it has been open for a long time already...

Eduard Grinberg