I hope this finds you well.

I could use some feedback on this task. I am trying to make a Redis container accessible from outside the ECS cluster, and to do so I need an ELB.

As suggested in this question, "Health checking redis container with ALB", the health check for an ALB will fail because the Redis container is not a web server. But why does it still fail with an NLB? The NLB operates at layer 4 (TCP), which the Redis container does respond to, so why are the health checks failing?

I have tried baking a lightweight web server into a custom Redis image, but to no avail.

This is my CDK stack configuration:

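import * as cdk from 'aws-cdk-lib'
import { Construct } from 'constructs'
import * as ec2 from 'aws-cdk-lib/aws-ec2'
import * as ecs from 'aws-cdk-lib/aws-ecs'
import * as ecs_patterns from 'aws-cdk-lib/aws-ecs-patterns'

interface QlashMainClusterStackProps extends cdk.StackProps {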
    vpc: ec2.IVpc,
    AWS_REGION: string
    AWS_ACCOUNT: string
    qlashMainInstanceSecurityGroupId: string
}


export class QlashMainClusterStack extends cdk.Stack {

    readonly cluster: ecs.ICluster

    constructor(scope: Construct, id: string, props: QlashMainClusterStackProps) {
        super(scope, id, props)


        /*  QLASH-MAIN CLUSTER  */

        const qlashMainCluster = new ecs.Cluster(this, 'qlashMainCluster', {
            vpc: props.vpc,
            clusterName: 'qlashMainCluster',
            enableFargateCapacityProviders: true,
            defaultCloudMapNamespace: {
                name: 'qlash_main',
                vpc: props.vpc,
                useForServiceConnect: true
            }
        })

        this.cluster = qlashMainCluster


        /*  SERVICES  */

        // Redis

        const qmmRedisTaskDefinition = new ecs.FargateTaskDefinition(this, 'qmm_redisTaskNLB', {
            cpu: 256,
            memoryLimitMiB: 512,
        })

        const qmmRedisContainer = qmmRedisTaskDefinition.addContainer('qmm_redis_NLB', {
            image: ecs.ContainerImage.fromRegistry('redis:6.0-alpine'),
            containerName: 'qmm_redis_NLB',
            portMappings: [{ containerPort: 6379, name: 'redis-port' }],
            healthCheck: {
                command: ["CMD", "redis-cli", "-h", "localhost", "-p", "6379", "ping"],
                interval: cdk.Duration.seconds(25),
                timeout: cdk.Duration.seconds(25),
                retries: 5
            },
            logging: ecs.LogDriver.awsLogs({streamPrefix: 'qmm_redis_NLB'}),
        })

        const qmmRedisServiceNLB = new ecs_patterns.NetworkLoadBalancedFargateService(this, 'qmmRedisServiceNLB', {
            serviceName: 'qmmRedisServiceNLB',
            cluster: qlashMainCluster, // props has no `cluster` field; use the cluster created above
            desiredCount: 1,
            taskDefinition: qmmRedisTaskDefinition,
            cloudMapOptions: {
                cloudMapNamespace: qlashMainCluster.defaultCloudMapNamespace,
                name: 'qmm_redis_NLB',
                containerPort: 6379
            },
            listenerPort: 6379
        })
   }
}

Hopefully you guys have an idea of what I might do.

Thank you in advance and have a nice day :)

  • Please include your ECS task definition JSON, as well as all NLB and Target Group settings in your question. – Mark B Apr 11 '23 at 13:23
  • @MarkB I posted below because it wouldn't fit here :) – Ettore Pelosato Apr 11 '23 at 13:50
  • You posted the updated details as an answer. That is going to get deleted because that isn't how this site is supposed to be used. Please click the "Edit" button on your question, and add the necessary details in the question itself. – Mark B Apr 11 '23 at 14:08
  • @MarkB Oh my gosh I'm sorry (facepalm) my bad – Ettore Pelosato Apr 11 '23 at 14:20
  • You are using high-level CDK constructs, which create lower-level CDK constructs, which get converted to a CloudFormation template, which get deployed by CloudFormation to generate AWS Infrastructure. It's extremely difficult to debug an issue like a health check failure by only seeing this higher-level code. I've tried looking at the defaults the CDK uses, for all the values you didn't specify explicitly, but it's nearly impossible to determine what exactly was deployed in your AWS account. – Mark B Apr 11 '23 at 15:50
  • For example I can't really tell what security group rules were applied to your ECS task, or what the actual health check settings are on the load balancer's target group. I also can't tell if your load balancer is in TCP pass-through mode, or if it is using a TLS listener or something. I suspect at a minimum the security group for your ECS task may need to have the port opened, and the load balancer is probably **not** in TCP pass-through mode. The load balancer would have to be in TCP pass-through mode in order to work with Redis. – Mark B Apr 11 '23 at 15:53
  • To get further help, and possibly have someone provide a concrete answer to your problem, I suggest you copy/paste the task definition JSON from the AWS console into your question, and also all the details of the security group, and the network load balancer's listener settings, and target group settings (obtained from the AWS console), into your question. – Mark B Apr 11 '23 at 15:55
  • I think you have given me a clue though!! I think it's missing the security group rule that allows the NLB to do its health checks (much like an EC2 instance needs one). Will look into that. Thank you :) – Ettore Pelosato Apr 11 '23 at 16:01

1 Answer

After literally days of torturing myself, I got it to work. I'll share it in the hope that it spares you an aneurysm. Granted, this is very poorly handled and explained by AWS (surprise, surprise).

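The idea is that you need to allow your NLB to send health checks to your containers, which means the task's security group must accept inbound traffic on the Redis port (6379) from the NLB's addresses inside the VPC. To do that, you can't use the built-in ecs_patterns.NetworkLoadBalancer... because you can't assign it a security group.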
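To make the failure mode concrete: a TCP health check boils down to a connection attempt on the target port. A security group without a matching inbound rule silently drops that attempt, so the check times out even though Redis itself is fine. Below is a minimal sketch of such a probe using Node's built-in net module; the helper name tcpProbe and its parameters are hypothetical, and this only approximates what the NLB does rather than reproducing it.

```typescript
import * as net from 'node:net'

// Hypothetical helper (not part of any AWS API): resolves true if a TCP
// connection to host:port is established within timeoutMs, and false on
// refusal, any socket error, or timeout.
function tcpProbe(host: string, port: number, timeoutMs: number): Promise<boolean> {
    return new Promise((resolve) => {
        const socket = net.connect({ host, port })
        let settled = false

        const finish = (healthy: boolean) => {
            if (settled) return
            settled = true
            socket.destroy() // we only care about the handshake, not the payload
            resolve(healthy)
        }

        socket.setTimeout(timeoutMs)
        socket.once('connect', () => finish(true))  // handshake completed: "healthy"
        socket.once('timeout', () => finish(false)) // packets dropped, e.g. by a security group
        socket.on('error', () => finish(false))     // refused or unreachable
    })
}
```

Running something like tcpProbe(taskPrivateIp, 6379, 5000) from another resource in the same VPC (taskPrivateIp being a placeholder for the task's private address) can help tell a blocked port, which typically shows up as a timeout, from a Redis process that is not listening, which typically shows up as an immediate refusal.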

This is what the solution will look like:


        const qmmRedisServiceSecurityGroup = new ec2.SecurityGroup(this, 'qmmRedisSecurityGroup', {
            vpc: props.vpc,
            securityGroupName: 'qmmRedisSecurityGroup'
        })

        qmmRedisServiceSecurityGroup.addIngressRule(
            ec2.Peer.ipv4(props.vpc.vpcCidrBlock),
            ec2.Port.tcp(6379),
            'Allow inbound traffic to qmm_redis from resources in qlashMainClusterVpc'
        )


        const qmmRedisTaskDefinition = new ecs.FargateTaskDefinition(this, 'qmm_redisTaskNLB', {
            cpu: 256,
            memoryLimitMiB: 512,
...
        })

        const qmmRedisContainer = qmmRedisTaskDefinition.addContainer('qmm_redis_NLB', {
            image: ecs.ContainerImage.fromRegistry('redis:6.0-alpine'),
...
        })

        /* PASS THE SECURITY GROUP HERE */

        const qmmRedisService = new ecs.FargateService(this, 'qmmRedisService', {
            serviceName: 'qmmRedisService',
            cluster: props.qlashMainCluster,
            desiredCount: 1,
            securityGroups: [qmmRedisServiceSecurityGroup],
            taskDefinition: qmmRedisTaskDefinition,
            cloudMapOptions: {
                container: qmmRedisContainer,
                name: 'qmm_redis',
                containerPort: 6379
            },
            serviceConnectConfiguration: {
                namespace: qlashMainClusterNamespace.namespaceName,
                services: [{ portMappingName: 'qmm_redis' }] // must match a named port mapping on the container
            }
        })

        const qmmRedisTargetGroup = new elbv2.NetworkTargetGroup(this, 'qmmRedisTargetGroup', {
            targetGroupName: 'qmmRedisTargetGroup',
            vpc: props.vpc,
            port: 6379,
            targets: [qmmRedisService]
        })
        
        const qmmRedisNLB = new elbv2.NetworkLoadBalancer(this, 'qmmRedisNLB', {
            loadBalancerName: 'qmmRedisNLB',
            vpc: props.vpc,
            vpcSubnets: {
                subnetType: ec2.SubnetType.PUBLIC,
            }
        })

        qmmRedisNLB.addListener('qmmRedisTargetGroupListener', {
            port: 6379,
            defaultTargetGroups: [qmmRedisTargetGroup]
        })
   }
}
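
A possible alternative, untested against this stack, is to keep the ecs_patterns construct and open the port on the security group it creates for the service. The sketch below assumes aws-cdk-lib v2; the stack, VPC, and construct IDs are placeholders for illustration, and the pattern's default task image options stand in for the custom task definition used above.

```typescript
import * as cdk from 'aws-cdk-lib'
import * as ec2 from 'aws-cdk-lib/aws-ec2'
import * as ecs from 'aws-cdk-lib/aws-ecs'
import * as ecs_patterns from 'aws-cdk-lib/aws-ecs-patterns'

// Placeholder app, stack, VPC and cluster, only here to keep the sketch self-contained.
const app = new cdk.App()
const stack = new cdk.Stack(app, 'RedisNlbSketch')
const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 2 })
const cluster = new ecs.Cluster(stack, 'Cluster', { vpc })

const redis = new ecs_patterns.NetworkLoadBalancedFargateService(stack, 'Redis', {
    cluster,
    desiredCount: 1,
    listenerPort: 6379,
    taskImageOptions: {
        image: ecs.ContainerImage.fromRegistry('redis:6.0-alpine'),
        containerPort: 6379,
    },
})

// The pattern exposes the underlying FargateService as `service`; its
// `connections` object manages the security group created for the tasks.
// Same rule as in the solution above: allow TCP 6379 from inside the VPC,
// which covers the NLB's health checks.
redis.service.connections.allowFrom(
    ec2.Peer.ipv4(vpc.vpcCidrBlock),
    ec2.Port.tcp(6379),
    'Allow NLB health checks and VPC traffic to Redis'
)
```

Whether this is enough depends on what the pattern deploys by default in your CDK version, so it is worth checking the resulting security group and target group settings in the console after deploying.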