15

I have data in an AWS RDS database, and I would like to pipe it over to an AWS ES instance, preferably updating about once an hour.

On my local machine, with a local MySQL database and a local Elasticsearch instance, it was easy to set this up using Logstash.

Is there a "native" AWS way to do the same thing? Or do I need to set up an EC2 server and install Logstash on it myself?

Sam Fen

2 Answers

20

You can achieve the same thing with your local Logstash: simply point your jdbc input at your RDS database and the elasticsearch output at your AWS ES instance. If you need to run this regularly, then yes, you'd need to set up a small instance to run Logstash on.

A more "native" AWS solution to achieve the same thing would include the use of Amazon Kinesis and AWS Lambda.

Here's a good article explaining how to connect it all together, namely:

  • how to stream RDS data into a Kinesis Stream
  • how to configure a Lambda function to handle the stream
  • how to push the data to your AWS ES instance
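
As a rough illustration of the last two steps, here is a minimal sketch of a Python Lambda handler that decodes Kinesis records and builds a body for the Elasticsearch `_bulk` API. It is not taken from the article. It assumes each record is a JSON-encoded row, that the domain runs Elasticsearch 7 or later (so no mapping type is needed), and that the domain's access policy accepts unsigned requests from the function. `ES_ENDPOINT`, `ES_INDEX` and the `id` field are hypothetical names.

```python
import base64
import json
import os
import urllib.request

# Hypothetical settings: replace with your own domain endpoint and index name.
ES_ENDPOINT = os.environ.get("ES_ENDPOINT", "https://example-domain.es.amazonaws.com")
ES_INDEX = os.environ.get("ES_INDEX", "rds-rows")


def decode_records(event):
    """Decode the base64 payload of each Kinesis record into a dict."""
    rows = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        rows.append(json.loads(payload))
    return rows


def build_bulk_body(rows, index=ES_INDEX, id_field="id"):
    """Build a newline-delimited JSON body: one action line, then one document line."""
    lines = []
    for row in rows:
        action = {"index": {"_index": index}}
        if id_field in row:
            # Reusing the row's key as _id makes re-sent rows overwrite, not duplicate.
            action["index"]["_id"] = str(row[id_field])
        lines.append(json.dumps(action))
        lines.append(json.dumps(row))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"  # the bulk API expects a trailing newline


def post_bulk(body):
    """POST the body to _bulk. Assumes unsigned requests are allowed (see above)."""
    request = urllib.request.Request(
        ES_ENDPOINT + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def handler(event, context, send=post_bulk):
    """Lambda entry point; `send` is injectable so the logic can be tried offline."""
    rows = decode_records(event)
    if not rows:
        return {"indexed": 0}
    send(build_bulk_body(rows))
    return {"indexed": len(rows)}
```

If the domain requires signed requests, `post_bulk` would need to be replaced with a client that signs them; that part is deliberately left out of this sketch.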
Val
  • 3
    How would AWS Database Migration Service compare with Kinesis for this? (now that AWS DMS supports AWS ElasticSearch as a target) – James Daily Jun 04 '19 at 15:50
  • 1
    The first method of pointing to the AWS ES instance as the output can be a bit tricky. You may need to use an SSH tunnel, disable SSL checking, and disable ILM, since AWS doesn't support X-Pack (I had to change the Logstash source to get that last part working) – Joel Cahalan May 27 '20 at 18:58
  • @JoelCahalan Are there any resources you can point me to for how to achieve this? I'm completely new to logstash and wading into the source sounds like a nightmare... – jsindos Nov 04 '21 at 23:47
1

Take a look at AWS DMS (Database Migration Service). It's usually used for DB migrations; however, it also supports continuous data replication. This might simplify the process and be cost-effective.

You can use AWS Database Migration Service to perform continuous data replication. Continuous data replication has a multitude of use cases including Disaster Recovery instance synchronization, geographic database distribution and Dev/Test environment synchronization. You can use DMS for both homogeneous and heterogeneous data replications for all supported database engines. The source or destination databases can be located in your own premises outside of AWS, running on an Amazon EC2 instance, or it can be an Amazon RDS database. You can replicate data from a single database to one or more target databases or data from multiple source databases can be consolidated and replicated to one or more target databases.

https://aws.amazon.com/dms/
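
To make the continuous-replication idea a little more concrete, here is a small sketch in Python that builds the two task settings that matter most for this use case: a table-mapping document that selects which tables to replicate, and a migration type that combines an initial load with ongoing change capture. The schema name `mydb` and the helper name `build_table_mappings` are hypothetical, and the endpoints and replication instance still have to be created separately.

```python
import json


def build_table_mappings(schema, table="%"):
    """Return a DMS table-mapping document (as JSON text) with one selection rule.

    "%" is the DMS wildcard, so the default includes every table in the schema.
    """
    rule = {
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "1",
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": "include",
    }
    return json.dumps({"rules": [rule]})


# Hypothetical task settings: "mydb" stands in for your own schema name.
task_settings = {
    # Full initial copy followed by ongoing change data capture.
    "MigrationType": "full-load-and-cdc",
    "TableMappings": build_table_mappings("mydb"),
}
```

These values correspond to the `MigrationType` and `TableMappings` parameters of a DMS replication task, for example when creating one with boto3's `create_replication_task`, together with the ARNs of your source endpoint, target endpoint and replication instance.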

Ben Yitzhaki