Platform: OpenBSD, compiler: gcc, javac (OpenJDK version 17)
I have made a few sorting benchmarks in various languages, and I'm rather surprised by the performance of the Java program over the C program.
I have programmed the exact same sorting algorithms in both languages, and the Java program finishes almost twice as fast, all other languages are slower than the C implementation except the Java one.
The benchmarks involve running the sorting algorithm on a random array of numbers a set number of times.
I am compiling the program with -O3
and -Ofast
, so I cannot apply any more compiler optimizations.
The exact code can be found here, but here is an excerpt from it:
Java:
public static void benchmark(SortingFunction func, int arraySize, int numTimes, String name, BufferedOutputStream bo) throws IOException {
int[][] arrs = new int[numTimes][arraySize];
for (int i = 0; i < numTimes; i ++) {
arrs[i] = genRandArray(arraySize);
}
long start = System.nanoTime();
for (int i = 0; i < numTimes; i ++) {
func.sort(arrs[i]);
}
long end = System.nanoTime();
double time = (double)(end - start) / 1e9;
System.out.println("It took " + time + " seconds to do " + name + " sort " +
numTimes + " times on arrays of size " + arraySize
);
String out = name+","+numTimes+","+arraySize+","+time;
for (char c : out.toCharArray()) {
bo.write(c);
}
bo.write('\n');
}
public static void insertionSort(int[] array) {
for (int i = 1; i < array.length; i ++) {
int temp = array[i];
int j;
for (j = i - 1; j >= 0 && array[j] > temp; j --) {
array[j+1] = array[j];
}
array[j+1] = temp;
}
}
C:
void benchmark(void (*f)(int *, int), int arr_size, int num_times, char *name,
FILE *fp) {
int *arrs[num_times];
struct timeval start, end;
double t;
for (int i = 0; i < num_times; i++) {
arrs[i] = gen_rand_arr(arr_size);
}
gettimeofday(&start, NULL);
for (int i = 0; i < num_times; i++) {
f(arrs[i], arr_size);
}
gettimeofday(&end, NULL);
for (int i = 0; i < num_times; i++) {
free(arrs[i]);
}
t = ((double)(end.tv_sec * 1000000 + end.tv_usec -
(start.tv_sec * 1000000 + start.tv_usec))) /
1000000;
printf("It took %f seconds to do %s sort %d times on arrays of size %d\n", t,
name, num_times, arr_size);
if (fp != NULL) {
fprintf(fp, "%s,%d,%d,%f\n", name, num_times, arr_size, t);
}
}
void insertion_sort(int *arr, int arr_size) {
for (int i = 1; i < arr_size; i++) {
int temp = arr[i];
int j;
for (j = i - 1; j >= 0 && *(arr + j) > temp; j--) {
arr[j + 1] = arr[j];
}
arr[j + 1] = temp;
}
return;
}
Are there some optimizations that Java is making to run faster that somehow change the algorithm? What is going on here?
Any explanations would be appreciated.
Here is a table of results that might help explain the difference:
Java:
name | rep | size | time |
---|---|---|---|
Insertion | 10000 | 1200 | 1.033 |
Insertion | 10000 | 5000 | 15.677 |
Insertion | 10000 | 12000 | 88.190 |
Selection | 10000 | 1200 | 3.118 |
Selection | 10000 | 5000 | 48.377 |
Selection | 10000 | 12000 | 268.608 |
Radix | 10000 | 1200 | 0.385 |
Radix | 10000 | 5000 | 1.491 |
Radix | 10000 | 12000 | 3.563 |
Bogo | 1 | 11 | 1.330 |
Bogo | 1 | 12 | 0.572 |
Bogo | 1 | 13 | 122.777 |
C:
name | rep | size | time |
---|---|---|---|
Insertion | 10000 | 1200 | 1.766 |
Insertion | 10000 | 5000 | 26.106 |
Insertion | 10000 | 12000 | 140.582 |
Selection | 10000 | 1200 | 4.011 |
Selection | 10000 | 5000 | 60.442 |
Selection | 10000 | 12000 | 340.608 |
Radix | 10000 | 1200 | 0.430 |
Radix | 10000 | 5000 | 1.788 |
Radix | 10000 | 12000 | 4.295 |
Bogo | 1 | 11 | 1.378 |
Bogo | 1 | 12 | 2.296 |
Bogo | 1 | 13 | 1586.73 |
Edit:
I modified the benchmarking function to generate the arrays beforehand, in Java it overflows the heap, and in C it makes it not much faster (around 25%, but Java is still faster).
fwiw I also changed how I indexed things in C from *(arr + i)
to arr[i]
.
Here's the implementation for the random array generator in Java and C:
Java:
public static int[] genRandArray(int arraySize) {
int[] ret = new int[arraySize];
Random rand = new Random();
for (int i = 0; i < ret.length; i ++) {
ret[i] = rand.nextInt(100);
}
return ret;
}
C:
int *gen_rand_arr(int arr_size) {
int *arr;
if ((arr = malloc(arr_size * sizeof(int))) == NULL) {
exit(1);
}
for (int i = 0; i < arr_size; i++) {
arr[i] = arc4random_uniform(100);
}
return arr;
}