Why is execution of the same program a lot faster after the first time?

Question

I'm working on a C program (Ubuntu 14.04) that basically does the following:

Opens a 1GB file
Reads it in 1MB buffers
Looks for some objects in each buffer
Computes the MD5 signature of each object found

My program takes 10 seconds to do this the first time, and then only 1 second on subsequent runs (even if I work on a second copy of the initial file).

I know that this has something to do with caching. Does my program work on cached data after the first run, or does it directly show cached results without doing any computation?

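#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define BUFFER_SIZE (1024 * 1024)  /* 1MB, as described above */

/* Defined elsewhere in the program; these signatures are assumed from the calls below. */
void find_object(unsigned char *buffer, int *start, int *end);
void md5_compute(unsigned char *data, int length);

int main(int argc, char** argv) {
    static unsigned char buffer[BUFFER_SIZE];  /* static keeps the 1MB buffer off the stack */
    size_t i, number;                          /* fread returns size_t */
    int count = 0;
    int start = 0, end = 0;
    FILE *file;

    file = fopen("/dump/ram.lime", "rb");
    if (file != NULL) {
        while ((number = fread(buffer, 1, BUFFER_SIZE, file)) > 0) {
            for (i = 0; i < number; i++) {
                find_object(buffer, &start, &end);
                md5_compute(&buffer[start], end - start);
            }
        }
        fclose(file);
    } else {
        printf("errno %d \n", errno);
    }
    printf("count = %d \n", count);
    return (EXIT_SUCCESS);
}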

It might take almost 10 seconds to read a 1GB file from disk. When in memory, the OS would keep it there for a while as someone might want to read it again. — Bo Persson, Oct 07 '15 at 08:33
The opened file will be kept in the disk cache (in RAM, if you like). You cannot control the caching; you can only flush it. — Ôrel, Oct 07 '15 at 08:42
@LPs I've added the main function of my code and my Linux version — Yacine Hebbal, Oct 07 '15 at 08:47
I checked my RAM (with the system monitor) after the first execution, but I don't see any significant RAM usage — Yacine Hebbal, Oct 07 '15 at 08:49
Possible duplicate of [What can cause a program to run much faster the second time?](http://stackoverflow.com/questions/7561362/what-can-cause-a-program-to-run-much-faster-the-second-time) — davmac, Oct 07 '15 at 08:55

Basile Starynkevitch · Accepted Answer · 2015-10-07T10:47:46.363

Because the second time, most of your program code and most of the file data are already sitting in the page cache, so the kernel doesn't need any disk I/O to get them into RAM. Your program still does all of its computation on every run; only the disk reads are avoided.

You will likely observe a similar speedup if, before running your own code, you run any other program (such as cat or wc) that reads that big file sequentially.
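To see the effect in isolation, without the MD5 work, you could time a plain sequential read. The following is only a minimal sketch: timed_read is a hypothetical helper name, and the 1MB chunk size is simply taken from the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: reads `path` sequentially in 1MB chunks and returns
 * the elapsed wall-clock seconds, or -1.0 if the file cannot be read.
 * The number of bytes read is stored in *bytes_out when it is non-NULL. */
double timed_read(const char *path, long long *bytes_out) {
    static unsigned char buffer[1024 * 1024];
    struct timespec t0, t1;
    long long total = 0;
    size_t n;
    int failed;

    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while ((n = fread(buffer, 1, sizeof buffer, file)) > 0)
        total += (long long)n;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    failed = ferror(file);
    fclose(file);
    if (failed)
        return -1.0;
    if (bytes_out != NULL)
        *bytes_out = total;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling timed_read twice in a row on the same large file and printing both results should show the gap, provided the file fits in free RAM: the first call has to wait for the disk, while the second is typically served from the page cache.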

See also posix_fadvise(2), sync(2), the Linux-specific readahead(2), and http://www.linuxatemyram.com/. Use the free(1) command before, during, and after running your program to measure memory. Read also about proc(5), since /proc/ contains many useful pseudo-files describing the kernel state of your machine or your process.
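If you want to reproduce the slow "first run" on demand, one possible approach is to ask the kernel to discard the cached pages of that one file with posix_fadvise(2). This is a sketch under assumptions: drop_file_cache is a hypothetical helper name, and the call is only advice, so the kernel may keep some pages anyway.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: asks the kernel to discard cached pages of `path`.
 * Returns 0 on success, or an errno-style error number on failure. */
int drop_file_cache(const char *path) {
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
        return errno;
    /* Offset 0 with length 0 covers the whole file. posix_fadvise returns
     * the error number directly instead of setting errno. */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    close(fd);
    return rc;
}
```

Dirty pages that have not been written back yet are not discarded, so for a file that was just written (a fresh copy, for instance) you would probably want to run sync(1) or call fsync(2) first.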

Also use time(1), perhaps as /usr/bin/time -v, to benchmark your program several times. See also time(7) and getrusage(2).
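From inside the program, getrusage(2) can give a rough idea of whether a run actually had to wait for the disk. The sketch below is illustrative only: io_counters and sample_io_counters are hypothetical names, and the counters are the ones Linux reports in struct rusage.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical snapshot of the I/O-related counters of this process. */
typedef struct {
    long block_inputs;  /* ru_inblock: times the filesystem had to perform input */
    long major_faults;  /* ru_majflt: page faults that required I/O */
    long minor_faults;  /* ru_minflt: page faults serviced without I/O */
} io_counters;

/* Fills *out with the current counters; returns 0 on success, -1 on error. */
int sample_io_counters(io_counters *out) {
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;
    out->block_inputs = usage.ru_inblock;
    out->major_faults = usage.ru_majflt;
    out->minor_faults = usage.ru_minflt;
    return 0;
}
```

Taking one sample before the read loop and one after it, then subtracting, shows how much real input happened. On a run served from the page cache, the block_inputs difference should be close to zero.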

In fact, I can see an additional 1GB of used RAM with the free command, but not with the system monitor application. Thanks :) — Yacine Hebbal, Oct 07 '15 at 09:06
