
I am trying to make two services communicate over a service discovery endpoint in AWS ECS.

Example:

Service1: runs the Task Definition to run nginx and phpfpm

Service2: runs the Task Definition to run redis

Now I need the Service1 container to communicate with the Service2 container.

Following the documentation and other resources I found online, this is what I have done so far, but it still does not work:

  1. We need to turn on service discovery (Done)
  2. Set proper service name and namespace which will work as service discovery endpoint (Done)
  3. Create task definition and create service with above property set (done)
  4. AWS now generates SRV records in Route53 (OK)

Now, when I use the service discovery endpoint, which is generally in the format service_discovery_service_name.service_discovery_namespace, the error log shows that the name cannot be resolved.


Tara Prasad Gurung
  • You need to create DNS `A` records instead of `SRV` records in Route53, which assign an IP to each service task. You need `SRV` records only when your client supports SRV lookups, i.e. the client knows that it must perform an SRV lookup and then get the IP. – Imran Jul 08 '19 at 03:11
  • @Imran Yes, but AWS ECS has that feature built in, right? The A record is generated too, and it in turn points to the IP address of the instance. – Tara Prasad Gurung Jul 08 '19 at 04:05
  • Which Docker networking mode are you using in your task definitions? If you are not using `awsvpc`, it will create only `SRV` records, which then point to `A` records. When you run `nslookup myapp.local` you will not get anything, since the record is of type `SRV` and not `A`. When you run `nslookup -type=srv myapp.local` you will get the SRV list, and then `nslookup {taskid}.myapp.local` gives the IP of the container. Unless your client supports performing an SRV lookup followed by an IP lookup, you are better off creating only `A` records. Let me know if you need an example and I will post it as an answer. – Imran Jul 08 '19 at 14:59
  • My task definition's networking mode is bridge, and it is creating an SRV record that has the task ID and an A record pointing to the container IP. Please check the image uploaded in the edited question section @Imran – Tara Prasad Gurung Jul 08 '19 at 15:06
  • That's exactly what I am saying! Your client (Service1) needs to know that it must perform an `SRV` lookup of Service2 and then communicate using the details in the SRV result (port and hostname). For example, if your Service1 is nginx, then the [premium](http://nginx.org/en/docs/http/ngx_http_upstream_module.html#service) version of nginx [supports](https://stackoverflow.com/a/42115019/5030709) that. If your Service1 is `phpfpm`, I am not sure it supports SRV lookups. First [understand](https://anders.com/cms/263/Tutorial/SIP/DNS/SRV/djbdns) how `SRV` records differ from `A` records. – Imran Jul 08 '19 at 15:38
  • @Imran Thanks for making it super clear what my problem is. I just need my web server (service=nginx) to resolve SRV. It looks like that's not possible in free NGINX. What do you recommend next? Or please point me to some work you have done or anything I can reference. Thanks a lot – Tara Prasad Gurung Jul 08 '19 at 16:37

2 Answers


Update 03/2022

AWS now has ENI trunking, which increases how many ENIs can be attached to a given EC2 instance type in the VPC. This makes awsvpc mode a lot more flexible with DNS A records and makes Service Discovery easier to configure for ECS services.

Combining this with AWS App Mesh and AWS Cloud Map can make ECS Service Discovery a lot easier.

More info about ENI trunking and App Mesh examples:

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-instance-eni.html
https://github.com/aws/aws-app-mesh-examples/tree/main/walkthroughs/howto-ingress-gateway


Original Answer

As per our conversation, here is a short summary of what's happening.

  • If Service1 (nginx in your case) needs to interact with Service2 (redis) through AWS Service Discovery using SRV records, then Service1 needs to be aware that it must perform a DNS SRV lookup instead of a DNS A (address) lookup.

  • You have multiple options here. First, if you want to continue to use SRV records, your client nginx needs to proxy the redis upstream server using the service and resolve options, which are available only in the premium version of nginx. Check the sample nginx configuration at the bottom of this answer, which I have tested and which works.

  • Also make sure you create the AWS Service Discovery name with the prefix _http._tcp. Without the prefix, I had issues configuring the SRV resolve/service options in the nginx configuration.


  • The other option: if you do not want to rely on SRV records and prefer a standard A record lookup, you will have to use awsvpc mode for the containers and select the A option.


  • With the DNS A option, your query for service_discovery_service_name.service_discovery_namespace will work fine.
  • The DNS A option has some constraints. You cannot run many tasks on a given EC2 instance, because each awsvpc task needs its own ENI and the number of ENIs that can be attached depends on the EC2 instance family. See the 03/2022 update above.
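To make the difference between the two lookups concrete, here is a minimal sketch of what an SRV-aware client has to do: read the SRV answer, pick a record, then do a normal A lookup on its target. It assumes the answer lines have the `<priority> <weight> <port> <target>` shape that `dig +short SRV` prints. The task IDs and ports in the sample are made up, and `parse_srv_line`, `pick_record` and `resolve_target` are names invented for this illustration.

```python
import socket
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_line(line: str) -> SrvRecord:
    """Parse one answer line in the '<priority> <weight> <port> <target>' form."""
    priority, weight, port, target = line.split()
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def pick_record(records: list[SrvRecord]) -> SrvRecord:
    """Simplified choice: lowest priority first, then highest weight.

    RFC 2782 asks for a weighted random pick within a priority level;
    this sketch stays deterministic to keep it short.
    """
    return min(records, key=lambda r: (r.priority, -r.weight))


def resolve_target(record: SrvRecord) -> tuple[str, int]:
    """Second step: an ordinary A lookup on the SRV target.

    This only works from inside the VPC, where the private namespace resolves.
    """
    return socket.gethostbyname(record.target), record.port


# Hypothetical answer lines for redisservice.local (task IDs and ports are made up).
answers = [
    "1 1 32768 0a1b2c3d.redisservice.local.",
    "1 1 32769 4e5f6a7b.redisservice.local.",
]
chosen = pick_record([parse_srv_line(a) for a in answers])
print(chosen.target, chosen.port)
```

A client that only does A lookups skips the first step entirely, which is why the plain service name does not resolve when only SRV records exist for it.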

Sample nginx DNS SRV Options configuration:

stream {
    resolver 172.31.0.2;
    upstream redis {
        zone tcp_servers 64k;
        server redisservice.local service=_http._tcp resolve;
    }
    server {
        listen 12345;
        status_zone tcp_server;
        proxy_pass redis;
    }
}
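A note on the `resolver 172.31.0.2;` line: the Amazon-provided DNS server in a VPC sits at the base of the VPC's primary CIDR range plus two. The sample value matches that if the VPC uses the default 172.31.0.0/16 range, which is my assumption and is not stated above. A small sketch to derive the address for another CIDR (`vpc_dns_resolver` is a helper name invented here):

```python
import ipaddress


def vpc_dns_resolver(vpc_cidr: str) -> str:
    """Return the VPC base address plus two for the given primary CIDR."""
    network = ipaddress.ip_network(vpc_cidr)
    return str(network.network_address + 2)


# Assumed default VPC CIDR; replace with the primary CIDR of your own VPC.
print(vpc_dns_resolver("172.31.0.0/16"))
```

If your VPC uses a different primary CIDR, the `resolver` value in the nginx configuration would need to change accordingly.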

Some references:

https://aws.amazon.com/blogs/aws/amazon-ecs-service-discovery/
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-service-discovery.html

Imran
  • If I am not using nginx-plus, I think I can use front-end service discovery (Elastic Load Balancer) to solve the issue. @Imran – Tara Prasad Gurung Jul 10 '19 at 10:50
  • @TaraPrasadGurung The other option I mentioned above doesn't use nginx-plus either, but it has its caveats. Yep, if the volume is not that high, then ELB is a good choice instead of nginx-plus. PS: it's always nice to upvote an answer when you accept it :). – Imran Jul 10 '19 at 12:56

I would like to elaborate on @Imran's detailed answer a bit more, since most of it talks about the SRV DNS record type and shows an Nginx example only for the premium version of Nginx (with SRV).

If you work with ECS Fargate and have configured an A DNS record, the most important thing is to configure a proper resolver.

From the docs:

Configures name servers used to resolve names of upstream servers into addresses, for example:

resolver 127.0.0.1 [::1]:5353;

The address can be specified as a domain name or IP address, with an optional port. If port is not specified, the port 53 is used. Name servers are queried in a round-robin fashion.

That being said, the resolver must be able to resolve the private DNS names, so we need to use the NS DNS record. Using 8.8.8.8 as a resolver won't work, since that DNS server can't resolve the private DNS.

NS stands for ‘name server’ and this record indicates which DNS server is authoritative for that domain (which server contains the actual DNS records). A domain will often have multiple NS records which can indicate primary and backup name servers for that domain.

To get the DNS resolver, run the following command:

aws route53 list-resource-record-sets --hosted-zone-id %HOSTED_ZONE_ID% --query "ResourceRecordSets[?Type == 'NS']"

Pick one of the resource records and place it into the Nginx resolver (including the trailing .).
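As a rough sketch of that step, the snippet below takes the JSON printed by the command above and turns one of the NS values into a `resolver` line. The name server values in `sample_output` are placeholders and not real output. `ns_values` and `resolver_directive` are helper names made up for this example.

```python
import json


def ns_values(cli_output: str) -> list[str]:
    """Collect the NS record values from the filtered CLI output."""
    record_sets = json.loads(cli_output)
    return [
        record["Value"]
        for record_set in record_sets
        if record_set.get("Type") == "NS"
        for record in record_set.get("ResourceRecords", [])
    ]


def resolver_directive(name_server: str, valid: str = "10s") -> str:
    """Build the nginx resolver line, keeping the trailing dot."""
    if not name_server.endswith("."):
        name_server += "."
    return f"resolver {name_server} valid={valid};"


# Placeholder output shaped like the result of the --query filter above.
sample_output = """
[
  {
    "Name": "local.",
    "Type": "NS",
    "TTL": 172800,
    "ResourceRecords": [
      {"Value": "ns-1536.awsdns-00.co.uk."},
      {"Value": "ns-0.awsdns-00.com."}
    ]
  }
]
"""
servers = ns_values(sample_output)
print(resolver_directive(servers[0]))
```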

Nginx basic template:

events {
  worker_connections 768;
}

http {
  # DNS Resolver
  resolver ns-###.awsdns-####.com. valid=10s;
  gzip on;
  gzip_proxied any;
  gzip_types text/plain application/json;
  gzip_min_length 1000;
  fastcgi_buffers 16 16k; 
  fastcgi_buffer_size 32k;

  server {

    listen 80;
    
    location / {
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header Host $host;
          proxy_redirect   off;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          # This is the important part
          proxy_pass http://ecs-fargate-svc.local:8080;
    }

    location = /health-check {
      return 200 'all good';
    }

  }
}

A few points to consider:

  • Don't forget to add the mapped port (8080 in my example).
  • Make sure the security group allows traffic within the VPC.
  • Since Fargate gives us limited logs, consider creating an EC2 instance in the same VPC as the ECS Fargate tasks and try to curl or ping the URL / DNS record from there.
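If curl is not handy on the test instance, the same check can be sketched in a few lines of Python: first resolve the name, then try a TCP connection to the mapped port. This is only an illustration, and `check_endpoint` is a name I made up. It separates "DNS does not resolve" from "DNS resolves but the port is blocked", which usually points at the resolver or at the security group respectively.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 3.0) -> dict:
    """Resolve host, then attempt a TCP connection to host:port."""
    result = {"host": host, "port": port, "addresses": [], "reachable": False, "error": None}
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        # Each entry ends with the socket address; its first field is the IP.
        result["addresses"] = sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        result["error"] = f"DNS lookup failed: {exc}"
        return result
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["reachable"] = True
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"
    return result
```

From an instance inside the VPC you would call something like `check_endpoint("ecs-fargate-svc.local", 8080)` and look at the `addresses` and `error` fields.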

My service discovery:


Documentation:

Nginx resolver

The name server (NS) record

Imran
Amit Baranes