I am trying to measure direct I/O (`O_DIRECT`) performance. As I understand it, direct I/O bypasses the page cache and goes to the underlying device to fetch the data. So if we read the same file over and over again, direct I/O should be slower than buffered reads, because after the first pass the buffered reads are served from the page cache. Here is my code:
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>     // malloc, free
    #include <string.h>     // strerror
    #include <errno.h>
    #include <fcntl.h>      // open, O_DIRECT
    #include <unistd.h>     // read, lseek, ftruncate, close
    #include <sys/types.h>  // mode_t
    #include <time.h>       // clock

    char *DIRECT_FILE_PATH = "direct.dat";
    char *NON_DIRECT_FILE_PATH = "no_direct.dat";
    int FILE_SIZE_MB = 100;
    int NUM_ITER = 100;

    // Create the test file and extend it to FILE_SIZE_MB with ftruncate
    // (the size is set, but no data is written).
    void lay_file(int direct_flag) {
        // Note: O_DIRECT is set here regardless of direct_flag.
        int flag = O_RDWR | O_CREAT | O_APPEND | O_DIRECT;
        mode_t mode = 0644;
        int fd;
        if (direct_flag) {
            fd = open(DIRECT_FILE_PATH, flag, mode);
        } else {
            fd = open(NON_DIRECT_FILE_PATH, flag, mode);
        }
        if (fd == -1) {
            printf("Failed to open file. Error: \t%s\n", strerror(errno));
            return;
        }
        ftruncate(fd, FILE_SIZE_MB*1024*1024);
        close(fd);
    }

    // Read the whole file NUM_ITER times in a single read() call per
    // iteration, rewinding to the start after each read.
    void read_file(int direct_flag) {
        mode_t mode = 0644;
        void *buf = malloc(FILE_SIZE_MB*1024*1024);
        int fd, flag;
        if (direct_flag) {
            flag = O_RDONLY | O_DIRECT;
            fd = open(DIRECT_FILE_PATH, flag, mode);
        } else {
            flag = O_RDONLY;
            fd = open(NON_DIRECT_FILE_PATH, flag, mode);
        }
        for (int i=0; i<NUM_ITER; i++) {
            // The return values of read() and lseek() are not checked here.
            read(fd, buf, FILE_SIZE_MB*1024*1024);
            lseek(fd,0,SEEK_SET);
        }
        close(fd);
        free(buf);
    }

    int main() {
        lay_file(0);
        lay_file(1);

        // Time the direct I/O reads with clock().
        clock_t t;
        t = clock();
        read_file(1);
        t = clock() - t;
        double time_taken = ((double)t)/CLOCKS_PER_SEC; // in seconds
        printf("DIRECT I/O read took %f seconds to execute \n", time_taken);

        // Time the buffered (page cache) reads the same way.
        t = clock();
        read_file(0);
        t = clock() - t;
        time_taken = ((double)t)/CLOCKS_PER_SEC; // in seconds
        printf("NON DIRECT I/O read took %f seconds to execute \n", time_taken);
        return 0;
    }
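One possible confound in the timings: `clock()` reports processor time used by the program, so time the process spends blocked waiting for the device is not counted. A minimal sketch of wall-clock timing with `clock_gettime(CLOCK_MONOTONIC, ...)` follows. The helper names `elapsed_seconds` and `time_wall` are hypothetical and not part of the code above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: seconds between two CLOCK_MONOTONIC timestamps. */
double elapsed_seconds(struct timespec start, struct timespec end) {
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Hypothetical helper: run fn(arg) once and return the wall-clock
 * seconds it took, including time spent blocked on I/O. */
double time_wall(void (*fn)(int), int arg) {
    struct timespec start, end;
    int rc = clock_gettime(CLOCK_MONOTONIC, &start);
    assert(rc == 0);
    fn(arg);
    rc = clock_gettime(CLOCK_MONOTONIC, &end);
    assert(rc == 0);
    (void)rc; /* silence unused warning when asserts are compiled out */
    return elapsed_seconds(start, end);
}
```

With this, `time_wall(read_file, 1)` and `time_wall(read_file, 0)` could replace the two `clock()` measurements in `main`.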
Measuring with the above code tells me that direct I/O is faster than regular I/O involving the page cache. This is the output:
    DIRECT I/O read took 0.824861 seconds to execute
    NON DIRECT I/O read took 1.643310 seconds to execute
Please let me know if I am missing something. My storage device is an NVMe SSD, and I wonder whether it is too fast to really show the performance difference between using and bypassing the page cache.
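Before attributing the result to device speed, it may be worth ruling out that the test files are sparse. `lay_file` only calls `ftruncate`, which extends the file without writing data. On filesystems that support holes this typically leaves no blocks allocated, and reads of a hole can return zeros without touching the device. A rough sketch of a check, assuming Linux, where `st_blocks` is counted in 512-byte units. The helper names `allocated_fraction` and `print_allocation` are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fraction of the file's apparent size that is
 * backed by allocated blocks (assumes 512-byte st_blocks units).
 * Returns -1.0 if fstat fails or the file is empty. */
double allocated_fraction(int fd) {
    assert(fd >= 0); /* caller must pass an open descriptor */
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0)
        return -1.0;
    return ((double)st.st_blocks * 512.0) / (double)st.st_size;
}

/* Hypothetical helper: open path read-only and print its allocation.
 * Returns 0 on success, -1 if the file cannot be opened. */
int print_allocation(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    printf("%s: allocated fraction %.2f\n", path, allocated_fraction(fd));
    close(fd);
    return 0;
}
```

A value near 0 for `direct.dat` or `no_direct.dat` would suggest the file is mostly holes. In that case, writing real data into the files before the read loop would make the comparison more meaningful.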
UPDATE:
Changing the buffer size to 4 KB shows that direct I/O is slower. The large buffer was probably resulting in large sequential reads from the underlying device, which helps direct I/O, but I would still like some insights.
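If only the size passed to `read` was changed, then with a 4 KB buffer each iteration reads just the first 4 KB of the file before seeking back, so the two experiments transfer very different amounts of data. The sketch below reads the whole file in fixed-size chunks and checks every return value. It is a sketch under stated assumptions: the helper name `read_in_chunks` is hypothetical, and the 4096-byte alignment is an assumed value, since the actual `O_DIRECT` alignment requirement depends on the device and filesystem.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment; the real O_DIRECT requirement varies by
 * device and filesystem. */
#define ASSUMED_ALIGNMENT 4096

/* Hypothetical helper: read fd from its current offset to EOF using
 * chunk-sized reads into an aligned buffer. Returns the total number
 * of bytes read, or -1 on allocation or read failure. */
ssize_t read_in_chunks(int fd, size_t chunk) {
    assert(chunk > 0 && chunk % ASSUMED_ALIGNMENT == 0);

    void *buf = NULL;
    if (posix_memalign(&buf, ASSUMED_ALIGNMENT, chunk) != 0)
        return -1;

    ssize_t total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, chunk);
        if (n < 0) {        /* e.g. EINVAL if O_DIRECT alignment rules are violated */
            total = -1;
            break;
        }
        if (n == 0)         /* end of file */
            break;
        total += n;
    }
    free(buf);
    return total;
}
```

The descriptor can be opened with `O_RDONLY | O_DIRECT` or plain `O_RDONLY`. Calling `read_in_chunks(fd, 4096)` and then `lseek(fd, 0, SEEK_SET)` on each iteration keeps the amount of data read the same across buffer sizes, and a return value of -1 makes a failing `read` visible instead of silently timing an error path.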
    DIRECT I/O read took 0.000209 seconds to execute
    NON DIRECT I/O read took 0.000151 seconds to execute