I need to synchronize a large XML file (containing 6 million records sorted by ID) against a SAP MaxDB database table.
This is my present strategy (a rough code sketch of the loop follows the list):
- read XML object from file and convert to Java bean
- load object with the same primary key from database (see below)
- if object does not exist, insert object
- if object exists and is changed, update object
- at the end of the process, scan the database table for objects that are not contained in the file and treat them as deleted
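Simplified, the per-record part of this strategy looks like the sketch below; syncRecord, MyEntity, insert and update are placeholders and not the real names, and loadFromDatabase() stands for the cached "load database object" method described next:

    // Simplified sketch of the per-record decision; all names are placeholders.
    void syncRecord(MyEntity currentObject) {
        // loadFromDatabase() stands for the cached "load database object" method shown below
        MyEntity prevObject = loadFromDatabase(currentObject.getId());
        if (prevObject == null) {
            insert(currentObject);                                   // new object
        } else {
            List<String> changes = ObjectUtil.getChangedProperties(prevObject, currentObject);
            if (hasImportantChanges(changes)) {
                update(currentObject);                               // changed object
            }
            // otherwise the object is unchanged and nothing is written
        }
    }
    // Deleted objects are handled afterwards by scanning the table for IDs
    // that do not occur in the XML file.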
For efficiency reasons, the "load database object" method keeps a cache of the next 1,000 records and, once that cache is used up, issues a query to load the next bunch of objects. It works like this:
List<?> objects = (List<?>) transactionTemplate.execute(new TransactionCallback() {
    public Object doInTransaction(TransactionStatus status) {
        // limit the result to one bunch of rows
        ht.setMaxResults(BUNCH_SIZE);
        ht.setFetchSize(BUNCH_SIZE);
        // load the next bunch, starting at the requested primary key
        List<?> objects = ht.find("FROM " + simpleName + " WHERE "
                + primaryKeyName + " >= ? ORDER BY " + primaryKeyName, primaryValue);
        return objects;
    }
});
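The cache in front of that query behaves roughly as follows; this is a simplified sketch, and fetchBunch() (standing for the transactionTemplate call above) and primaryKeyOf() are placeholders, not the real code:

    // Rough sketch of the cache kept by the "load database object" method
    // (simplified; the real code also handles the end of the table).
    private final SortedMap<Comparable, Object> cache = new TreeMap<Comparable, Object>();

    Object loadFromDatabase(Comparable primaryValue) {
        // the XML file is sorted by ID, so requested keys only ever move forward;
        // re-fill the cache when the key lies beyond the cached range
        if (cache.isEmpty() || primaryValue.compareTo(cache.lastKey()) > 0) {
            cache.clear();
            for (Object o : fetchBunch(primaryValue)) {   // the query shown above
                cache.put(primaryKeyOf(o), o);
            }
        }
        return cache.get(primaryValue);   // null means the object does not exist in the table
    }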
Unfortunately, for some values of BUNCH_SIZE (for example 10,000) I get an SQL exception: "result table space exhausted".
- How can I better optimize the process?
- How can I avoid the SQL exception / the bunch size problem?
The code that saves new and changed objects is the following:
if (prevObject == null) {
    // the object is not yet in the database: insert it
    ht.execute(new HibernateCallback() {
        public Object doInHibernate(Session session)
                throws HibernateException, SQLException {
            session.save(saveObject);
            session.flush();
            session.evict(saveObject);   // detach the saved object from the session
            return null;
        }
    });
    newObjects++;
} else {
    // the object already exists: write it back only if relevant properties changed
    List<String> changes = ObjectUtil.getChangedProperties(prevObject, currentObject);
    if (hasImportantChanges(changes)) {
        changedObjects++;
        ht.merge(currentObject);
    } else {
        unchangedObjects++;
    }
}
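ObjectUtil.getChangedProperties essentially compares the two beans and returns the names of the properties whose values differ. A minimal reflection-based sketch of such a comparison (not the actual implementation, which may differ) looks like this:

    import java.beans.IntrospectionException;
    import java.beans.Introspector;
    import java.beans.PropertyDescriptor;
    import java.lang.reflect.InvocationTargetException;
    import java.lang.reflect.Method;
    import java.util.ArrayList;
    import java.util.List;

    public final class ObjectUtil {

        // Compare two beans of the same class and return the names of the
        // properties whose values differ (simplified sketch, not the real code).
        public static List<String> getChangedProperties(Object prev, Object current)
                throws IntrospectionException, IllegalAccessException, InvocationTargetException {
            List<String> changed = new ArrayList<String>();
            PropertyDescriptor[] properties =
                    Introspector.getBeanInfo(prev.getClass(), Object.class).getPropertyDescriptors();
            for (PropertyDescriptor property : properties) {
                Method getter = property.getReadMethod();
                if (getter == null) {
                    continue;                // write-only property, nothing to compare
                }
                Object oldValue = getter.invoke(prev);
                Object newValue = getter.invoke(current);
                boolean differs = (oldValue == null) ? newValue != null : !oldValue.equals(newValue);
                if (differs) {
                    changed.add(property.getName());
                }
            }
            return changed;
        }
    }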
While the save code above works in principle, it produces masses of database log entries (we are talking about more than 50 GB of log backups) when the source file contains a lot of new or changed objects.
- Can I improve this code by using a lower transaction isolation level?
- Can I reduce the amount of database log data written?
- Maybe there is a problem with the database configuration?
I am very grateful for any help. Thanks a lot, Matthias