Aim: sync Elasticsearch with a Postgres database
Why: sometimes the network or the cluster/server breaks, so updates made while it is down should be recorded and synced later
This article https://qafoo.com/blog/086_how_to_synchronize_a_database_with_elastic_search.html suggests that I should create a separate table, updates, whose ids are kept in sync with Elasticsearch, allowing me to select new data (from the database) since the last record (in Elasticsearch). So I thought: what if I could record Elasticsearch's failed and successful connections? If the client pongs back successfully (the returned promise resolves), I could launch a function to sync records with my database.
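As a rough illustration of the updates-table idea (my reading of it, not code from the article): if every change gets a monotonically increasing id and Elasticsearch remembers the last id it has seen, the pending work is everything with a larger id. The sketch below uses an in-memory array in place of the real table; the row shape (`id`, `docId`, `action`) and the function name `pendingUpdates` are hypothetical.

```javascript
// Hypothetical in-memory stand-in for rows of an `updates` table.
// `id` is assumed to be a monotonically increasing sequence number.
const updates = [
  { id: 1, docId: 'a', action: 'index' },
  { id: 2, docId: 'b', action: 'index' },
  { id: 3, docId: 'a', action: 'delete' },
];

// Return the changes that Elasticsearch has not seen yet, oldest first.
function pendingUpdates(rows, lastSyncedId) {
  return rows
    .filter(row => row.id > lastSyncedId)
    .sort((a, b) => a.id - b.id);
}

// Assume Elasticsearch reports that it last applied update 1.
const pending = pendingUpdates(updates, 1);
console.log(pending.map(row => row.id)); // [ 2, 3 ]
```

Against a real database the filter would presumably become a `WHERE id > $1 ORDER BY id` query, with the last synced id read from Elasticsearch first.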
Here's my elasticConnect.js:

    import elasticsearch from 'elasticsearch'
    import syncProcess from './sync'

    const client = new elasticsearch.Client({
      host: 'localhost:9200',
      log: 'trace'
    });

    client.ping({
      requestTimeout: Infinity,
      hello: "elasticsearch!"
    })
      .then(() => syncProcess()) // successful connection: start syncing
      .catch(err => console.error(err))

    export default client
This way, I don't even need to worry about running a cron job (if the answer to question 1 is yes), since I know that the cluster is running.
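To make the "no requests while syncing" requirement concrete, here is a minimal sketch of one possible pattern: hand the client out through a promise that settles only after the ping and the sync have both finished. It is not necessarily how the module above behaves. `connectAndSync`, `fakeClient` and `fakeSync` are hypothetical names, and the fakes stand in for the real client and `syncProcess` so the ordering can be observed without a cluster.

```javascript
// Resolve with the client only after ping and sync have both completed.
function connectAndSync(client, syncProcess) {
  return client.ping({ requestTimeout: 3000 })
    .then(() => syncProcess(client)) // call the function, not just reference it
    .then(() => client);             // hand the client out only after syncing
}

// Hypothetical stand-ins that record the order in which things happen.
const events = [];
const fakeClient = {
  ping: () => { events.push('ping'); return Promise.resolve(true); },
};
const fakeSync = () => new Promise(resolve => {
  setTimeout(() => { events.push('sync done'); resolve(); }, 10);
});

const ready = connectAndSync(fakeClient, fakeSync);
ready.then(() => {
  events.push('client ready');
  console.log(events.join(' -> ')); // ping -> sync done -> client ready
});
```

Under this assumption, other modules would wait on `ready` before sending requests, instead of importing a client that may still be mid-sync.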
Questions
1. Will syncProcess run before export default client? I don't want any requests coming in while syncing...
2. syncProcess should run only once (since it's cached/not exported), no matter how many times I import elasticConnect.js. Correct?
3. Are there any advantages to using the method with the updates table, instead of just selecting data from the parent/source table?
4. The article's comments say "don't use timestamp to compare new data!". Ehhh... why? It should be OK since the database is blocking, right?