The following code
public class Main {
public static void main(String[] args) throws IOException {
File tmp = File.createTempFile("deleteme", "dat");
tmp.deleteOnExit();
RandomAccessFile raf = new RandomAccessFile(tmp, "rw");
for (int t = 0; t < 10; t++) {
long start = System.nanoTime();
int count = 5000;
for (int i = 1; i < count; i++)
raf.setLength((i + t * count) * 4096);
long time = System.nanoTime() - start;
System.out.println("Average call time " + time / count / 1000 + " us.");
}
}
}
On Java 8, this runs fine (the file is on tmpfs so you would expect it to be trivial)
Average call time 1 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
On Java 10, this get increasing slower as the file gets larger
Average call time 311 us.
Average call time 856 us.
Average call time 1423 us.
Average call time 1975 us.
Average call time 2530 us.
Average call time 3045 us.
Average call time 3599 us.
Average call time 4034 us.
Average call time 4523 us.
Average call time 5129 us.
Is there a way to diagnose this kind of problem?
Is there any solution or alternative which works efficiently on Java 10?
NOTE: We could write to the end of the file, however this would require locking it which we want to avoid doing.
For comparison, On Windows 10, Java 8 (not tmpfs)
Average call time 542 us.
Average call time 487 us.
Average call time 480 us.
Average call time 490 us.
Average call time 507 us.
Average call time 559 us.
Average call time 498 us.
Average call time 526 us.
Average call time 489 us.
Average call time 504 us.
Windows 10, Java 10.0.1
Average call time 586 us.
Average call time 508 us.
Average call time 615 us.
Average call time 599 us.
Average call time 580 us.
Average call time 577 us.
Average call time 557 us.
Average call time 572 us.
Average call time 578 us.
Average call time 554 us.
UPDATE It appears that the choice of system call has changed between Java 8 and 10. This can be seen by prepending strace -f
to the start of the command line
In Java 8, the following calls are repeated in the inner loop
[pid 49027] ftruncate(23, 53248) = 0
[pid 49027] lseek(23, 0, SEEK_SET) = 0
[pid 49027] lseek(23, 0, SEEK_CUR) = 0
In Java 10, the following calls are repeated
[pid 444] fstat(8, {st_mode=S_IFREG|0664, st_size=126976, ...}) = 0
[pid 444] fallocate(8, 0, 0, 131072) = 0
[pid 444] lseek(8, 0, SEEK_SET) = 0
[pid 444] lseek(8, 0, SEEK_CUR) = 0
In particular, fallocate
does a lot more work than ftruncate
and the time taken appears to be proportional to the length of the file, not the length added to the file.
One work around is to;
- use reflection to the
fd
file descriptor - use JNA or FFI to call ftruncate.
This seems like a hacky solution. Is there a better alternative in Java 10?