I am under Spring Data JPA 2.5.5 / Hibernate 5.4.32 / PostgreSQL 12.7.0.
I have a database table 'A', and I have mainly 2 use-cases:
- Add a large number of new data (new in terms of datetime) into 'A' (at least 1 million per minute)
- Fill empty intervals of old data (old in terms of datetime), again potentially 1 million of records per minute: it means insert if not exists, there will be already existing data in the input data
I have the following schema:
@Embeddable
public class TableId implements Serializable {
@Column(name = "datetime", nullable = false)
private OffsetDateTime datetime;
@Column(name = "data")
private String data;
(getters/setters...)
}
@Entity
@Table(name = "\"A\"")
public class AEntity implements Persistable<TableId> {
@EmbeddedId
private TableId id;
@Column(name = "value")
private Double value;
@Override
public boolean isNew() {
return true;
}
(getters/setters...)
}
the isNew()
method allows to fix the first case, which means not doing any select while inserting. Is this the best way ?
Now for the 2nd point, I do not know what is the best way to do it in terms of performance, any proposition ?
I found this but it is as expected damn slow:
@Query(nativeQuery = true, value = "select * from A a where CONCAT(cast(a.datetime as VARCHAR), '-', a.data) in (:columns)")
List<AEntity> findByIdIn(@Param("columns") List<String> columns);
Note that I cannot add an extra column in the database, and that the table contains billions of records.