29

I am suddenly getting the error "ResourceInitializationError: failed to validate logger args: : signal: killed" when starting an AWS ECS Fargate service. The same service was running fine a couple of days ago.

The log driver configuration in the related task definition is:

Log Configuration
Log driver: awslogs

Key                    Value
awslogs-group          /ecs/analytics
awslogs-region         us-east-1
awslogs-stream-prefix  ecs

Any idea or help?

Amit Kamboj
  • This question is not appropriate for Stack Overflow: it is not a programming-related question. You might try asking on Server Fault (serverfault.com) – Brian B Oct 05 '20 at 17:28
  • Did you ever find a solution? I'm having the exact same issue but haven't found any solution. It also seems like this post is the only one mentioning the issue. – davidgiga1993 Jan 15 '21 at 10:09

6 Answers

37

I finally found the root cause:

The error appears when the Fargate service cannot connect to the CloudWatch Logs API endpoint. This can happen if Fargate is running in a private subnet without internet access. You can either add a VPC endpoint for CloudWatch Logs to your private subnet or give the subnet internet connectivity.
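For the endpoint route, here is a minimal sketch of what the request might look like, assuming Python with boto3. The helper name and all the IDs are placeholders made up for illustration. It only builds the arguments for EC2's `create_vpc_endpoint` call; the call itself is left commented out because it needs AWS credentials.

```python
def logs_endpoint_params(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for boto3's ec2.create_vpc_endpoint().

    Hypothetical helper: the name and signature are illustrative only.
    """
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # CloudWatch Logs endpoint service name for the given region
        "ServiceName": f"com.amazonaws.{region}.logs",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS makes the regular CloudWatch Logs hostname resolve
        # to the endpoint from inside the VPC.
        "PrivateDnsEnabled": True,
    }


# Placeholder IDs (assumptions); substitute your own.
params = logs_endpoint_params(
    "us-east-1",
    "vpc-0123456789abcdef0",
    ["subnet-0123456789abcdef0"],
    ["sg-0123456789abcdef0"],
)
print(params["ServiceName"])

# With credentials configured, the call would look roughly like:
# import boto3
# boto3.client("ec2", region_name="us-east-1").create_vpc_endpoint(**params)
```

The security group attached to the endpoint still has to accept HTTPS from the tasks, otherwise the same error can come back.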

davidgiga1993
  • 10
    For people going the endpoint route, the endpoint service to use is `com.amazonaws.eu-west-1.logs` (substitute your own region). The complete list of endpoints is here: https://docs.aws.amazon.com/general/latest/gr/aws-service-information.html – Alexandre Hamon Sep 08 '21 at 14:58
  • 2
    I would like to comment that I was only able to get past this error by *deleting* an existing logs endpoint (in my case the `com.amazonaws.us-east-1.logs` service) from my VPC. After that the error went away. – jdel3 Jan 20 '22 at 21:47
  • I recently added the VPC endpoint for CloudWatch Logs to my setup, and this now prevents my existing ECS cluster from posting to CloudWatch Logs even though it has internet access. – Luke Feb 10 '22 at 17:12
  • @Luke when you add a private endpoint to your VPC it will override the DNS entries for that service. You'll need to make sure all ECS instances can reach the private endpoint – davidgiga1993 Feb 11 '22 at 07:57
  • Thanks @davidgiga1993, I have added an answer that works to this question – Luke Feb 11 '22 at 11:34
8

I recently spent hours on this same issue. It turns out that the log group and stream prefix specified in my container definition didn't exist.
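A quick way to rule this out is to check whether the group exists before deploying. This is only a sketch, assuming Python and a client that behaves like `boto3.client("logs")`. The function name and the fake client are made up so the example runs without AWS access.

```python
def log_group_exists(logs_client, group_name):
    """Return True if a log group named exactly `group_name` exists.

    `logs_client` is expected to behave like boto3.client("logs").
    """
    kwargs = {"logGroupNamePrefix": group_name}
    while True:
        page = logs_client.describe_log_groups(**kwargs)
        # The API matches by prefix, so compare the full name explicitly.
        if any(g["logGroupName"] == group_name for g in page.get("logGroups", [])):
            return True
        token = page.get("nextToken")
        if not token:
            return False
        kwargs["nextToken"] = token


class FakeLogsClient:
    """Stand-in for the real client so the sketch runs offline."""

    def __init__(self, names):
        self._names = names

    def describe_log_groups(self, logGroupNamePrefix, nextToken=None):
        matches = [n for n in self._names if n.startswith(logGroupNamePrefix)]
        return {"logGroups": [{"logGroupName": n} for n in matches]}


fake = FakeLogsClient(["/ecs/analytics-staging"])
# Only a prefix match exists here, so the exact-name check fails.
print(log_group_exists(fake, "/ecs/analytics"))  # False
```

Against a real account you would pass `boto3.client("logs", region_name=...)` instead of the fake.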

It would be wonderful if AWS could provide helpful error messages...

Drakee510
  • 1
    Thanks @Drakee510! In addition to this, I also had to add the "CloudWatchLogsFullAccess" policy to the Execution Role. Perhaps there is a better way, but I haven't found it yet. – redgeoff Feb 16 '22 at 23:16
6

I came across this issue today. The problem was that the log group I specified didn't exist yet. If you don't want to create it manually, add the awslogs-create-group option and set it to "true". You'll also have to grant your ECS task execution role the logs:CreateLogGroup permission.

  "logConfiguration": {
    "logDriver": "awslogs",
    "secretOptions": null,
    "options": {
      "awslogs-create-group": "true",
      "awslogs-group": "/ecs/app",
      "awslogs-region": "ap-southeast-2",
      "awslogs-stream-prefix": "ecs"
    }
  }

Reference: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html
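For the permission part, a narrower alternative to a full-access managed policy could look like the sketch below, assuming Python. The helper name, the account ID, and the role and policy names in the commented call are placeholders, and the resource scoping is one reasonable choice rather than the only correct one.

```python
import json


def awslogs_policy(region, account_id, log_group):
    """Build an IAM policy document for the awslogs driver with auto-create.

    Hypothetical helper; adjust the resource scoping to suit your account.
    """
    account_arn = f"arn:aws:logs:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed only because awslogs-create-group is "true"
                "Effect": "Allow",
                "Action": "logs:CreateLogGroup",
                "Resource": f"{account_arn}:*",
            },
            {
                # Needed by the awslogs driver to write log events
                "Effect": "Allow",
                "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": f"{account_arn}:log-group:{log_group}:*",
            },
        ],
    }


# Placeholder account ID (assumption); region and group match the snippet above.
policy = awslogs_policy("ap-southeast-2", "123456789012", "/ecs/app")
print(json.dumps(policy, indent=2))

# With credentials configured, it could be attached inline, for example:
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="ecsTaskExecutionRole",    # assumed role name
#     PolicyName="awslogs-create-group",  # assumed policy name
#     PolicyDocument=json.dumps(policy),
# )
```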

Renzo Sunico
  • Right, but nobody seems to know how to specify the `logs:CreateLogGroup` permission inline in the CloudFormation template. See my question https://stackoverflow.com/q/75455047 . Any ideas? – Garret Wilson Feb 15 '23 at 22:51
4

I just experienced this. I have ECS Fargate running, and I had just added a VPC endpoint for CloudWatch Logs (com.amazonaws.REGION.logs) in my account. As soon as I added the VPC endpoint, my logs stopped appearing.

In order to remedy this without deleting the VPC endpoint again, for my setup with Fargate running with internet access I had to ensure that:

  1. My ECS service had a security group rule that allows outbound HTTPS traffic

    {
       type: egress
       port_to: 443   
       port_from: 443
       protocol: TCP
    }
    
  2. My new VPC endpoint had a security group rule that allows inbound HTTPS traffic from my ECS security group

    {
       type: ingress
       port_to: 443   
       port_from: 443
       protocol: TCP
       source_security_group_id: [Your ECS SECURITY GROUP ID]
    }
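The two rules above are pseudo-config. As a rough translation, assuming Python with boto3, they could be expressed as the `IpPermissions` entries below. The function names and security group IDs are placeholders, and the `0.0.0.0/0` egress destination is my assumption since the rule above does not name one.

```python
def https_egress_permission():
    """IpPermissions entry allowing outbound HTTPS (TCP 443)."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        # Assumption: any IPv4 destination; narrow this if you can.
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }


def https_ingress_from_group(source_group_id):
    """IpPermissions entry allowing inbound HTTPS from another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


# Placeholder security group IDs (assumptions); substitute your own.
ECS_SG_ID = "sg-0aaaaaaaaaaaaaaaa"
ENDPOINT_SG_ID = "sg-0bbbbbbbbbbbbbbbb"

egress_rule = https_egress_permission()
ingress_rule = https_ingress_from_group(ECS_SG_ID)
print(egress_rule)
print(ingress_rule)

# With credentials configured, the rules would be applied roughly like this:
# import boto3
# ec2 = boto3.client("ec2")
# ec2.authorize_security_group_egress(GroupId=ECS_SG_ID, IpPermissions=[egress_rule])
# ec2.authorize_security_group_ingress(GroupId=ENDPOINT_SG_ID, IpPermissions=[ingress_rule])
```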
    
Luke
0

I got this error and checked my NAT and internet gateway; both were fine. I also found that the interface endpoint was set up as com.amazonaws.us-east-1.logs, so nothing seemed to need changing. Finally, I deleted the interface endpoint and the error went away, but I am still confused about what happened.

Richard D
0

Adding the CloudWatchLogsFullAccess policy to the ECS task execution role solved my problem (as suggested by @redgeoff).

mirror