11

We use JAVA Spring (JPA) and in application we write, modify and delete data in PostgreSQL.

How to synchronize data between PostgreSQL and Elasticsearch? So, they will have same data.

Azzabi Haythem
  • 2,318
  • 7
  • 26
  • 32
iman
  • 303
  • 1
  • 11
  • 27
  • You could use https://github.com/kohesive/elasticsearch-data-import-handler – Simon Martinelli Feb 27 '18 at 13:56
  • Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it. – Mickael Mar 08 '18 at 14:59
  • 1
    This provides some directions: https://stackoverflow.com/questions/34477095/elasticsearch-replication-of-other-system-data/34477639#34477639 – Val Mar 08 '18 at 15:04

2 Answers2

13

I assume that you are using PosgreSQL as a primary storage to store your data and Elasticsearch as a secondary storage to index the data from PosgreSQL for searching.

I would suggest Logstash is the best option. You will have create a configuration file that connects PosgreSQL with elasticsearch. Use JDBC input plugin and elasticsearch output plugin and run Logstash with that configuration file. Every document will be synced from there onward.

You also have some other options:

Batch Sync: Write an application/script (perl/python) to pull the data from PosgreSQL to index into Elasticsearch. You can schedule this job to be executed at a specific time you wish.

Real-time Sync: Right after you insert/update a record into PostgreSQL successfully, you send a request in the same API to index/update this record to Elasticsearch.

Vijayakumar
  • 318
  • 3
  • 12
2

I agree with Vijay regarding the general approach.

Essentially you can use Logstash to log your PostgreSQL data by using the pgjdbc driver (simple .jar) in Logstash to connect to PostgreSQL directly and use logstash.conf to write an SQL query to fetch the data from PostgreSQL.

I recently wrote a comprehensive guide about Spring Boot + PostgreSQL and the ELK stack and how to set things up (incl. a GitHub repo link to a sample project) in Towards Data Science, which you can find here: https://towardsdatascience.com/a-hitchhikers-guide-to-spring-boot-elasticsearch-logstash-kibana-postgresql-and-docker-5602feaa9fd3

I hope it helps everyone using this stack (even if you do not use Spring Boot but just Spring)

And here is a direct link to GitHub if you do not want to read the article: https://github.com/tech4242/spring-elastic-genie

tech4242
  • 2,348
  • 2
  • 23
  • 33