
Once or twice a day, files are uploaded to an S3 bucket. On every upload, I want each server to refresh its in-memory data from the uploaded files. There are multiple servers running, and I want all of them to hold the same data. The servers also scale with traffic: new instances start up and older ones go down, so the set of server instances is not always the same.

Basically, I want to keep up-to-date data in the cache.

I want to build an architecture that supports auto-scaling of the servers. I came across the AWS fan-out architecture, which uses SNS and multiple SQS queues that different servers can poll.

How can we scale the queues along with the servers? Or is there another way to handle this scenario?

PS: I'm totally new to the AWS environment. Any references would be a great help.

1 Answer


To me there are a few things that you need to have to make this work. These are opinions and, as with most architectural designs, there is certainly more than one way to handle this.

I start with the assumption that you've got an application running on an EC2 of some sort (Elastic Beanstalk, Fargate, Raw EC2s with auto scaling, etc.) and that you've solved for having the application installed and configured when a scale-up event occurs.

Conceptually I'd have this diagram: (image not included)

The setup involves having the S3 bucket publish events, most likely s3:ObjectCreated events, to the SNS topic. These events are published when an object in the bucket is created or updated.
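As a rough sketch of that wiring in Python with boto3: the helper names, bucket name, and ARNs below are placeholders I made up, and the `s3` and `sns` arguments are assumed to be `boto3.client("s3")` and `boto3.client("sns")`. The topic also needs an access policy that lets S3 publish to it, which the second helper builds.

```python
import json


def topic_notification_config(topic_arn):
    """Notification configuration sending all object-created events to the topic."""
    return {
        "TopicConfigurations": [
            {"TopicArn": topic_arn, "Events": ["s3:ObjectCreated:*"]}
        ]
    }


def topic_policy(topic_arn, bucket):
    """SNS access policy allowing the given bucket to publish to the topic."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sns:Publish",
            "Resource": topic_arn,
            "Condition": {
                "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{bucket}"}
            },
        }],
    })


def enable_bucket_notifications(s3, sns, bucket, topic_arn):
    """Allow S3 to publish to the topic, then point the bucket at it."""
    sns.set_topic_attributes(
        TopicArn=topic_arn,
        AttributeName="Policy",
        AttributeValue=topic_policy(topic_arn, bucket),
    )
    # Note: this call replaces any notification configuration already on the bucket.
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=topic_notification_config(topic_arn),
    )
```

This is one-time setup, so it could just as well be done in the console or in whatever infrastructure tooling you already use.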

Next:

  1. During startup your application will pull the current data from S3.
  2. As part of application startup, create a queue named after the instance ID of the EC2 instance (see here for some examples). The queue would need to subscribe to the SNS topic. If the queue already exists then that's not an error.
  3. Your application would have a background thread or process that polls the SQS queue for messages.
  4. If you get a message on the queue then that needs to tell the application to refresh the cache from S3.
  5. When an instance is shut down there is an event from at least Elastic Beanstalk and the load balancers that your instance will be shut down. Remove the SQS queue tied to the instance at that time.
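The steps above could look roughly like this in Python with boto3. Treat it as a sketch under assumptions: `sqs` and `sns` are `boto3.client("sqs")` and `boto3.client("sns")`, the `cache-refresh-` queue name prefix and the `refresh` callback are hypothetical, and the subscription uses the default (non-raw) delivery, so each SQS body is an SNS envelope whose `Message` field holds the S3 event JSON.

```python
import json
from urllib.parse import unquote_plus


def queue_policy(queue_arn, topic_arn):
    """Access policy letting only the given SNS topic send to the queue."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "sns.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
            "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
        }],
    })


def create_instance_queue(sqs, sns, instance_id, topic_arn):
    """Step 2: create the per-instance queue and subscribe it to the topic."""
    name = f"cache-refresh-{instance_id}"  # hypothetical naming scheme
    url = sqs.create_queue(QueueName=name)["QueueUrl"]
    attrs = sqs.get_queue_attributes(QueueUrl=url, AttributeNames=["QueueArn"])
    arn = attrs["Attributes"]["QueueArn"]
    sqs.set_queue_attributes(
        QueueUrl=url, Attributes={"Policy": queue_policy(arn, topic_arn)}
    )
    sub = sns.subscribe(
        TopicArn=topic_arn, Protocol="sqs", Endpoint=arn,
        ReturnSubscriptionArn=True,
    )
    return url, sub["SubscriptionArn"]


def keys_from_message(body):
    """Extract S3 object keys from an SNS-wrapped S3 event notification."""
    event = json.loads(json.loads(body)["Message"])
    # S3 test events carry no "Records"; keys arrive URL-encoded.
    return [unquote_plus(r["s3"]["object"]["key"])
            for r in event.get("Records", [])]


def poll_once(sqs, queue_url, refresh):
    """Steps 3 and 4: long-poll once and refresh the cache if anything changed."""
    resp = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    messages = resp.get("Messages", [])
    changed = [k for m in messages for k in keys_from_message(m["Body"])]
    if changed:
        refresh(changed)  # hypothetical callback that reloads from S3
    # Delete only after a successful refresh so failures are redelivered.
    for m in messages:
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=m["ReceiptHandle"])
    return changed


def remove_instance_queue(sqs, sns, queue_url, subscription_arn):
    """Step 5: on shutdown, drop the subscription and then the queue."""
    sns.unsubscribe(SubscriptionArn=subscription_arn)
    sqs.delete_queue(QueueUrl=queue_url)
```

In the application, `poll_once` would run in a loop on the background thread, and `remove_instance_queue` would be called from whatever shutdown hook your platform gives you.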

The only issue might be that a hard crash of an environment would leave orphan queues. It may be advisable to either manually clean these up or have a periodic task clean them up.
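A periodic cleanup could be sketched like this, again in Python with boto3. It assumes the queues are named with a hypothetical `cache-refresh-` prefix followed by the instance ID, and that you can get the set of live instance IDs from somewhere (for example from the EC2 or auto scaling APIs); both of those are my assumptions, not something the setup above dictates.

```python
PREFIX = "cache-refresh-"  # hypothetical queue name prefix


def orphaned_queues(queue_urls, live_instance_ids, prefix=PREFIX):
    """Return URLs of prefixed queues whose instance is no longer alive."""
    orphans = []
    for url in queue_urls:
        name = url.rsplit("/", 1)[-1]  # the queue name is the last path segment
        if name.startswith(prefix) and name[len(prefix):] not in live_instance_ids:
            orphans.append(url)
    return orphans


def delete_orphans(sqs, live_instance_ids):
    """Delete every orphaned per-instance queue and return the deleted URLs."""
    urls = []
    # Paginate in case there are more queues than one list call returns.
    for page in sqs.get_paginator("list_queues").paginate(QueueNamePrefix=PREFIX):
        urls.extend(page.get("QueueUrls", []))
    orphans = orphaned_queues(urls, live_instance_ids)
    for url in orphans:
        sqs.delete_queue(QueueUrl=url)
    return orphans
```

This sketch only removes the queues; any SNS subscriptions that pointed at them would need to be cleaned up separately.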

stdunbar