I am new to Java and I do not know the differences between the Java collection implementations.
I have to process up to 100K records of imported data. There might be duplicates in that list. I have to put all of it into the DB. Before the import I clear the database table, so there are no duplicates in the DB at the beginning.
I am batch inserting the data with Hibernate. I want to do something like this:
SomeCollectionClass<Integer> alreadyInsertedRecords;
//...
if (!alreadyInsertedRecords.contains(currentRecord.hashCode())) {
    save_to_database(currentRecord);
    alreadyInsertedRecords.add(currentRecord.hashCode());
} else {
    logger.log("Record no 1234 is a duplicate, skipping");
}
Which collection class should I use to check whether a record has already been inserted into the DB?
As I said, there might be more than 100 000 records, so the collection should be fast to search, fast to insert into, and have a small memory footprint.
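To make the idea concrete, here is a minimal runnable sketch of the check I have in mind. It assumes `java.util.HashSet` as the collection, which is only my guess and may not be the best choice. The class `DedupSketch`, the method `insertUnique`, and the `saveToDatabase` stand-in are hypothetical names for illustration; records are plain strings here, and the real Hibernate call is replaced by appending to a list. The sketch stores the records themselves rather than their hash codes, since two different records can share the same `hashCode()` and one of them would then be skipped as a false duplicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupSketch {

    // Hypothetical stand-in for the real Hibernate save; it only collects records.
    static void saveToDatabase(String record, List<String> sink) {
        sink.add(record);
    }

    // Returns the records that were "saved", in their original order, without duplicates.
    static List<String> insertUnique(List<String> records) {
        // Initial capacity chosen so that about 100K entries fit without
        // rehashing at the default load factor of 0.75.
        Set<String> alreadyInserted = new HashSet<>(150_000);
        List<String> saved = new ArrayList<>();
        for (String record : records) {
            // Set.add returns false when an equal element is already present,
            // so a separate contains() call is not needed.
            if (alreadyInserted.add(record)) {
                saveToDatabase(record, saved);
            } else {
                System.out.println("Record " + record + " is a duplicate, skipping");
            }
        }
        return saved;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(insertUnique(input));
    }
}
```

This relies on the record type implementing `equals` and `hashCode` consistently; for my own record class I would have to override both.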