We use Xodus in a remote probe project to store temporary data before sending it to the centralized database. We therefore have several stores that can grow or shrink depending on the environment (traffic, network connection, etc.). We expected the garbage collector to reduce the database file size, but so far it has only increased.

We tried several garbage collector configurations to trigger it as frequently as possible. For example:

    conf.setGcFileMinAge(1);
    conf.setGcFilesInterval(1);
    conf.setGcMinUtilization(1);

None of this had any visible effect.

After the store has been emptied, we expected the .xd files to shrink or be deleted, but the database keeps growing.

EDIT: I tried to observe the GC's effect with simpler code, shown below:

        Environment exodus = Environments.newInstance(dbPath);

        final Transaction xtxn = exodus.beginExclusiveTransaction();
        Store store = exodus.openStore("testStore", StoreConfig.WITHOUT_DUPLICATES, xtxn);
        xtxn.commit();

        Thread.sleep(10 * 1000); // Wait to do actions after first  background cleaning cycle

        // Fill store, then clear it
        exodus.executeInExclusiveTransaction(tx -> {
            for(int i = 1; i <= 1000000; i++) {
                store.putRight(tx, LongBinding.longToEntry(i), StringBinding.stringToEntry(dbPath));
            }
        });
        clearStore(exodus, store);

        exodus.gc();
        Thread.sleep(5 * 60 * 1000); // Wait to see GC doing the work

    boolean clearStore(final Environment exodus, final Store store) {
        Transaction tx = exodus.beginExclusiveTransaction();
        try(Cursor cursor = store.openCursor(tx)) {
            boolean success = true;
            while(cursor.getNext() && success) {
                success &= cursor.deleteCurrent();
            }
            if(success) {
                tx.commit();
                return true;
            } else {
                log.warn("failed to delete entry {}", cursor.getKey());
                tx.abort();
                return false;
            }


        } catch(Exception e) {
            tx.abort();
            return false;
        }
    }

If I remove the first "sleep", the garbage collector does its work: the database file size is reduced as expected and everything is fine. But if I keep the first "sleep", the garbage collector never seems to run. It is as if the first background cleaning cycle works, but the following ones do not. I keep the default configuration in this example.

ATU

1 Answer

There is the Environment.gc() method. The javadoc for the method is as follows:

Says environment to quicken background database garbage collector activity. Invocation of this method doesn't have immediate consequences like freeing disk space, deleting particular files, etc.

I wouldn't recommend modifying the default GC settings. EnvironmentConfig.setGcMinUtilization() can be used to keep the database more compact than it would be by default, or to decrease GC load (e.g., while batch updates are running). Basically, a higher required minimum utilization (less admissible free space) results in a higher GC load.

GC cleans the database file by file, selecting the files with the lowest utilization first. A cleaned file is not deleted immediately; two conditions must be satisfied first:

  1. The delay configured by EnvironmentConfig.getGcFilesDeletionDelay() must have passed. It's 5 seconds by default.
  2. Every transaction (even a read-only one) created before the moment the file was cleaned must be finished (committed or aborted).
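
To make the two conditions concrete, here is a minimal toy model in plain Java. It is not Xodus code and does not reflect how Xodus implements this internally. The class `CleanedFileDeletionModel`, the `Txn` type and the `canDelete` method are hypothetical names introduced only for illustration, and time is represented as plain millisecond values.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: all names here are hypothetical and are NOT part of the Xodus API.
class CleanedFileDeletionModel {

    /** A transaction in the model: its start time and whether it was committed or aborted. */
    static final class Txn {
        final long startedAtMillis;
        boolean finished;

        Txn(long startedAtMillis) {
            this.startedAtMillis = startedAtMillis;
        }
    }

    /** Returns true if a file cleaned at cleanedAtMillis may be deleted at nowMillis. */
    static boolean canDelete(long cleanedAtMillis, long nowMillis,
                             long deletionDelayMillis, List<Txn> txns) {
        // Condition 1: the deletion delay must have passed since the file was cleaned.
        if (nowMillis - cleanedAtMillis < deletionDelayMillis) {
            return false;
        }
        // Condition 2: every transaction created before the cleaning moment must be finished.
        for (Txn t : txns) {
            if (t.startedAtMillis < cleanedAtMillis && !t.finished) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Txn> txns = new ArrayList<>();
        Txn oldReader = new Txn(500); // started before the file was cleaned at t=1000
        txns.add(oldReader);

        // Delay (5000 ms) has not passed yet.
        System.out.println(canDelete(1000, 3000, 5000, txns));
        // Delay has passed, but the old transaction is still open.
        System.out.println(canDelete(1000, 7000, 5000, txns));
        // Once the old transaction is finished, the file may be deleted.
        oldReader.finished = true;
        System.out.println(canDelete(1000, 7000, 5000, txns));
    }
}
```

In this model, a single long-lived transaction that was opened before the file was cleaned is enough to block the deletion, no matter how much time has passed.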
Vyacheslav Lukianov