I want to set up Postgres and Elasticsearch. But before throwing data into Elasticsearch, I want to prevent data loss when the network or a server goes down. After reading on this topic (https://gocardless.com/blog/syncing-postgres-to-elasticsearch-lessons-learned/), I came up with 3 solutions:
1. Create a database table, i.e. `store`, and add any new/updated data to it.
   - During queries: insert data into `store`.
   - Select new data: `SELECT data FROM store WHERE modified > (:last modified time from elasticsearch)`
   - Send the "new" data over to Elasticsearch (see the first sketch right after this list).
2. Use Redis to `pub/sub` requests, and make Elasticsearch listen/subscribe for incoming data. If Elasticsearch breaks, the data will still be in the queue.
3. Catch any errors during the transaction to Elasticsearch and save the data into a safe place (i.e. the `store` table mentioned above). Then have a cron job push this data back.
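
To check I'm thinking about option 1 correctly, here is a rough sketch of the polling worker I have in mind, assuming a `store` table with `id`, `data` and `modified` columns, psycopg2, and the 7.x elasticsearch-py client (newer clients take `document=` instead of `body=`). The index name `myindex` and the DSN are just placeholders:

```python
from datetime import datetime

import psycopg2
from elasticsearch import Elasticsearch

def sync_new_rows(pg_dsn, es, last_synced):
    """Push every row modified after `last_synced` into Elasticsearch.

    Returns the newest `modified` value seen, i.e. the watermark for
    the next run.
    """
    conn = psycopg2.connect(pg_dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT id, data, modified FROM store "
                "WHERE modified > %s ORDER BY modified",
                (last_synced,),
            )
            newest = last_synced
            for row_id, data, modified in cur.fetchall():
                # Indexing by primary key makes re-runs idempotent: replaying
                # the same rows after a crash just overwrites the same docs.
                es.index(index="myindex", id=row_id,
                         body={"data": data, "modified": modified.isoformat()})
                newest = max(newest, modified)
            return newest
    finally:
        conn.close()

# Run from cron or a loop; the watermark would normally be stored somewhere
# durable (or read back from Elasticsearch as the max of `modified`).
es = Elasticsearch("http://localhost:9200")
last = sync_new_rows("dbname=mydb user=me", es, datetime(1970, 1, 1))
```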
Of course, the easiest thing would be to insert data into Elasticsearch straight away. But doing so means there is no safe copy of the data if something gets corrupted. Option 1 is too slow in my opinion, unlike option 2. And option 3 requires maintaining error handling code.
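
By "error handling code" I mean roughly the following sketch. Here `es` is an elasticsearch-py client and `conn` a psycopg2 connection; the `store` columns, the index name, and the broad `except` are my assumptions, and the drain function would run from cron:

```python
import json

def index_or_stash(es, conn, doc_id, doc):
    """Try to index right away; on failure, stash the document in Postgres."""
    try:
        es.index(index="myindex", id=doc_id, body=doc)
    except Exception:
        # In practice I'd catch the client's connection/transport errors
        # specifically rather than a bare Exception.
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO store (id, data, modified) VALUES (%s, %s, now()) "
                "ON CONFLICT (id) DO UPDATE "
                "SET data = EXCLUDED.data, modified = now()",
                (doc_id, json.dumps(doc)),
            )
        conn.commit()

def drain_store(es, conn):
    """Run from cron: push stashed rows back into Elasticsearch, then delete them."""
    with conn.cursor() as cur:
        cur.execute("SELECT id, data FROM store")
        for doc_id, data in cur.fetchall():
            es.index(index="myindex", id=doc_id, body=json.loads(data))
            cur.execute("DELETE FROM store WHERE id = %s", (doc_id,))
    # If indexing fails part-way, the exception aborts before commit, so the
    # deletes roll back and the rows are retried on the next cron run.
    conn.commit()
```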
For now, option 2 is my choice.
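
Concretely, for option 2 I'm imagining something like the sketch below. One thing I noticed while writing it: plain Redis pub/sub drops messages published while the subscriber is down, so the sketch uses a Redis list as the queue instead (`LPUSH` on the write path, blocking `BRPOP` in the worker), which is what actually keeps the data around if Elasticsearch breaks. Queue and index names are placeholders:

```python
import json

import redis
from elasticsearch import Elasticsearch

r = redis.Redis(host="localhost", port=6379)

def enqueue_change(doc_id, doc):
    """Write path: called right after the row is saved in Postgres."""
    r.lpush("es-sync", json.dumps({"id": doc_id, "doc": doc}))

def run_indexer():
    """Long-running worker: pop queued documents and index them."""
    es = Elasticsearch("http://localhost:9200")
    while True:
        _, raw = r.brpop("es-sync")  # blocks until something is queued
        payload = json.loads(raw)
        es.index(index="myindex", id=payload["id"], body=payload["doc"])
```

If `es.index` fails here, the popped item is lost, so in practice I'd push it back onto the list or combine this with the `store` fallback from option 3.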
Are there better ways to do this? I'd like to hear your opinions and new suggestions :D