
I recently took over an architecture from a third party to help a client. I'm new to AWS, so this is probably simple, but I couldn't find it in the docs or on Stack Overflow. They had an existing EC2 instance with both a Node app and a React app deployed from different repos. Each was deployed using its own pipeline. The source, build, and deploy steps were working for both, and I verified the artifacts were being generated and stored in S3. The load balancer had a target group that pointed at a single machine in one subnet. The app was running just fine until this morning, and I'm trying to figure out if it's something I did.

My goal this morning was to spin up a new EC2 instance (for which I have the keys, so I can connect directly), a new load balancer that pointed to my machine, and space in S3 for new pipelines I created to store artifacts. I created an AMI from their EC2 instance with the running app and used it to provision my own on the same subnet as their instance. I used the existing security group for my machine. I created a target group to target my machine for use with my load balancer. I created a load balancer to route traffic to this new machine. I then created two pipelines, similar to theirs, but with different artifact locations in S3, and a source of my own repo where I have a copy of the code. I got deployments through the pipeline to work. Everything was great until I was about to test my system, when I was informed their app was down.

I tried hitting it and got a 502 Bad Gateway. I checked the load balancer: it sees traffic coming in but returns a 502 for every request. I checked the target group, and it now shows their EC2 instance as unhealthy. I tried rebooting the machine, but it's still unhealthy. Then I created another copy of their machine in another subnet and made sure it was registered in the target group, but the new instance showed up as unhealthy as well. I can't SSH into the machine because I don't have the key used to create the EC2 instance. If anyone knows where I should look to bring it back online, I'd be forever in your debt.

I undid everything I created this morning, stopping my EC2 instance, and deleting my load balancer, but their app is still returning a 502, showing the instance as unhealthy in their target group.

autoboxer

1 Answer


These are some things to help you debug:

  • You first need to access the EC2 instance directly, not through the load balancer, and check that the application is running. If the instance is in a private subnet, you can start another EC2 instance with a public IP and use it as a bastion host.
  • You will need to have SSH access to the EC2 machine at some point, so that you can look at the logs. This question has answers on how to replace the key pair.
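
As a rough illustration of the first point, here is a minimal Python sketch that requests the app directly, bypassing the load balancer, and reports what came back. It uses only the standard library. The function name `probe` and the address, port, and path in the usage comment are hypothetical placeholders, not values from the question; substitute the instance's private IP and whatever port and path the target group's health check is configured to use.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> str:
    """Request `url` directly and describe the outcome in one line."""
    # Ignore any proxy settings so the request goes straight to the instance.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # Something is listening and answered, but with a 4xx/5xx status.
        return f"HTTP {exc.code}"
    except OSError as exc:
        # Nothing answered: connection refused, timeout, unreachable host.
        # (urllib.error.URLError is a subclass of OSError.)
        reason = getattr(exc, "reason", exc)
        return f"no response ({reason})"


# Hypothetical usage, run from the bastion host inside the VPC:
#   print(probe("http://10.0.1.23:3000/"))
```

If this prints `no response (...)`, the application is probably not listening on that port (or a security group is blocking the bastion), which would be consistent with the load balancer returning 502. If it prints an HTTP status, the app is up and the next thing to compare is that status against what the target group's health check expects.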
kgiannakakis
  • Thank you. I'm going to work on getting into the EC2 instance to see the logs. Do you know where the logs would be located? I'm not sure what the default is, or where to look to see if a custom location was specified. Also, the EC2 instance is private, and it's in a private subnet. Can I just give it a public IP address, or should I go through the bastion host setup? – autoboxer Oct 29 '21 at 19:14
  • You need to find where the logs of the Node application are. Also, do you have an Apache or nginx server? You can't just give it a public IP. You need to set up the bastion host as explained in the link. If you don't already have one, you will need to create a public subnet in the VPC. The load balancer is probably already in a public subnet. Also, what kind of load balancer do you have (application, network, classic)? Do you have a custom domain and/or certificates for SSL? – kgiannakakis Oct 29 '21 at 19:23