1

I have a scheduler that polls data every 4 hours and inserts it into a table based on certain logic. The method is annotated with @Transactional, and before each insert I check whether the record already exists in the table; it is inserted only if it does not. When I run multiple instances of the Spring Boot application, each instance runs the scheduler, and some (not all) of the data gets duplicated, so the table ends up containing duplicate records. The table I insert into is an existing table of the application, and a few of its columns have not been defined with unique constraints. Please suggest how I can keep the records in the table unique even when the scheduler runs on multiple instances. I am using PostgreSQL and Spring Boot.

PythonLearner

1 Answer

3

The direct answer to your question is to have a unique identifier for each record in your table. A unique id from the external API would be a perfect match. If you don't have one, you can compute it yourself. Consider an example:

@Entity @Table
public class Person {
    // Primary key (@Id) omitted for brevity
    private String fieldOne;
    private String fieldTwo;
    @Column(unique = true)
    private String uniqueId;

    // In case you have to generate uniqueId manually
    public static Person fromExternalApi(String fieldOne, String fieldTwo) {
        Person person = new Person();
        person.fieldOne = fieldOne;
        person.fieldTwo = fieldTwo;
        // The separator keeps ("ab", "c") distinct from ("a", "bc")
        person.uniqueId = fieldOne + "|" + fieldTwo;
        return person;
    }
}

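You will then have a unique index on the DB side based on the uniqueId field, and the DB will prevent duplicates. Note that unique=true only takes effect when Hibernate generates the schema; since your table already exists, you have to add the unique constraint to it yourself.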
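If the identifying fields are long or numerous, plain concatenation can get unwieldy. A minimal sketch of one alternative is to hash the fields into a fixed-length key. The class name UniqueKeys, the method uniqueIdOf and the choice of SHA-256 are illustrative assumptions, not part of the original answer:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, shown only to illustrate deriving a stable key.
class UniqueKeys {
    // Unit separator (U+001F); assumed never to occur inside the field values.
    private static final char SEPARATOR = '\u001F';

    /** Derives a deterministic 64-character hex key from the identifying fields. */
    static String uniqueIdOf(String... fields) {
        StringBuilder joined = new StringBuilder();
        for (String field : fields) {
            // Joining with a separator keeps ("ab", "c") distinct from ("a", "bc").
            joined.append(field == null ? "" : field).append(SEPARATOR);
        }
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] hash = sha256.digest(joined.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }
}
```

Because the same input fields always produce the same key, two instances that try to insert the same record end up with the same uniqueId, and the unique index rejects the second insert.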

Important - you can't use

@GeneratedValue(strategy = GenerationType.IDENTITY)

because it generates a new id each time you save an object to the DB. In your case, the same object can be saved multiple times from different instances.

But in your case I would suggest another solution: run the scheduled task only once across all instances. This is already answered here
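One possible way to do this with PostgreSQL alone, which is not necessarily the approach taken in the linked answer, is a session-level advisory lock: each instance tries to take the lock before polling and skips the run if another instance holds it. The sketch below is an assumption-laden illustration; the class SingleRunGuard, the method runIfLockFree and the lock key value are hypothetical, while pg_try_advisory_lock and pg_advisory_unlock are PostgreSQL functions:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, shown only to illustrate the locking idea.
class SingleRunGuard {
    // Hypothetical lock key; any constant shared by all instances would do.
    static final long POLL_JOB_LOCK_KEY = 7_001L;

    /**
     * Runs the job only if this connection obtains the advisory lock.
     * Returns false when another session already holds it.
     */
    static boolean runIfLockFree(Connection conn, Runnable job) {
        try {
            if (!tryLock(conn)) {
                return false; // another instance is polling right now
            }
            try {
                job.run();
                return true;
            } finally {
                unlock(conn); // session-level locks must be released explicitly
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Advisory lock handling failed", e);
        }
    }

    private static boolean tryLock(Connection conn) throws SQLException {
        // pg_try_advisory_lock returns immediately instead of waiting.
        try (PreparedStatement ps = conn.prepareStatement("SELECT pg_try_advisory_lock(?)")) {
            ps.setLong(1, POLL_JOB_LOCK_KEY);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() && rs.getBoolean(1);
            }
        }
    }

    private static void unlock(Connection conn) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement("SELECT pg_advisory_unlock(?)")) {
            ps.setLong(1, POLL_JOB_LOCK_KEY);
            ps.execute();
        }
    }
}
```

The lock and the unlock have to happen on the same connection, since the lock belongs to the database session. This only prevents overlapping runs; if the instances fire at different times, the existence check or a unique constraint is still what keeps the second run from inserting duplicates.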

Bohdan Petrenko