I am working on an application that needs to store a huge volume of transactions (2 million per day) and needs full-text search on them. I need to retain at least 10 years of data. Keeping performance and data integrity in mind, can I use AWS Elasticsearch as the database for my project?

Venkat Papana
  • This answer should help: https://stackoverflow.com/questions/49629830/using-elasticseach-as-primary-source-for-part-of-my-db/49630227#49630227 – Val Jun 14 '18 at 03:18
  • I see the main issue is Elasticsearch going down or getting corrupted, but if I go with the AWS service, can I rely on their uptime and daily backup policies? – Venkat Papana Jun 14 '18 at 06:41
  • Personally, I'd go with [Elastic Cloud](https://www.elastic.co/cloud) which is much more flexible and configurable by the end user, backed by AWS and herded/monitored by the folks who created ES. – Val Jun 14 '18 at 06:45
  • Thanks @Val. So, do you suggest using a cloud-based Elastic solution (AWS / Elastic Cloud) as a primary database? – Venkat Papana Jun 14 '18 at 06:50
  • Your question is too broad to be answered actually. But I usually never suggest using ES as a primary database, especially if you need to guarantee data integrity during 10 years, but that really depends on your use case. – Val Jun 14 '18 at 07:02
  • Probably you mean huge volume. What if you need to reindex or change your search fields? – techuser soma Jun 15 '18 at 16:21
  • Hello @techuser, I'm new to ES. Can you explain the problems with reindexing such a huge volume? Regarding search, we always do full-text search; do you see any problems with that? – Venkat Papana Jun 18 '18 at 12:02

1 Answer

As always, it depends. It depends on your requirements for the data store.

  1. Are these transactions coming out of one of your own systems that already stores the data, so that the data can easily be replayed if your index gets corrupted or if you want to reindex or repurpose the data? Of course there are backups you can leverage, but with this much data coming in you would still lose data (whatever arrived since the last backup) if the index did get corrupted.
  2. What else do you want to do with the data? Is it just there to be searched? Do you want to aggregate it, join it with other data, or run reports on it?
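
To make point 1 concrete, here is a minimal sketch of the "replayable" setup, assuming you keep a separate system of record and treat the search index as disposable. `PrimaryStore`, `SearchIndex`, `record_transaction` and `rebuild_index` are hypothetical in-memory stand-ins made up for illustration, not a real Elasticsearch or AWS API.

```python
from dataclasses import dataclass, field


@dataclass
class PrimaryStore:
    """Hypothetical stand-in for the durable system of record."""
    rows: dict = field(default_factory=dict)

    def save(self, txn_id, body):
        self.rows[txn_id] = body


@dataclass
class SearchIndex:
    """Hypothetical stand-in for the search index; assumed to be rebuildable."""
    docs: dict = field(default_factory=dict)

    def index(self, txn_id, body):
        self.docs[txn_id] = body

    def search(self, term):
        # Naive case-insensitive substring match, just for the sketch.
        term = term.lower()
        return sorted(i for i, b in self.docs.items() if term in b.lower())


def record_transaction(store, index, txn_id, body):
    # Write to the system of record first, then to the search index.
    store.save(txn_id, body)
    index.index(txn_id, body)


def rebuild_index(store):
    # Replay every stored transaction into a fresh index.
    fresh = SearchIndex()
    for txn_id, body in store.rows.items():
        fresh.index(txn_id, body)
    return fresh


store, index = PrimaryStore(), SearchIndex()
record_transaction(store, index, 1, "Payment to ACME Ltd")
record_transaction(store, index, 2, "Refund from Globex")

index.docs.clear()            # simulate a corrupted or lost index
index = rebuild_index(store)  # replay from the system of record
print(index.search("acme"))   # [1]
```

If the index is the only copy, there is nothing to replay from, which is the risk the question in point 1 is getting at.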

I agree with @Val (in the comments) and would not recommend Elasticsearch as your primary datastore; have a read of this for some more good advice. But in the end it depends: Elasticsearch is a great place to put log data, for example. Have a read of this and this for more advice on what Elasticsearch is useful for.

I'm interested to know which direction you went with this (given your question was asked 3 months ago) and whether you stand by or regret your final decision. Do you have any lessons learned to share with us?

Damo