Something I've noticed when testing code I write is that long-running operations tend to take much longer the first time a program runs than on subsequent runs, sometimes by a factor of 10 or more. Obviously there's some sort of cold-cache/warm-cache effect here, but I can't figure out what it is.
It's not the CPU cache, since these long-running operations tend to be loops that I feed a lot of data to, and the loop code should be fully loaded into the cache after the first iteration. (Plus, unloading and reloading the program should clear the CPU cache anyway.)
Also, it's not the disc cache. I've ruled that out by loading all the data from disc up-front and processing it afterwards, and it's the CPU-bound processing itself that's running slowly.
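Here's roughly the pattern I've been using to separate the two; `ProcessData` is just a stand-in for the real processing loop, and the timing is done with `TStopwatch`:

```pascal
program TimeProcessing;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Diagnostics;

procedure ProcessData(const Data: TStringList);
begin
  // Stand-in for the real CPU-bound processing loop
end;

var
  Data: TStringList;
  Watch: TStopwatch;
begin
  Data := TStringList.Create;
  try
    // All disc I/O happens here, before the timer starts
    Data.LoadFromFile(ParamStr(1));

    // Only the CPU-bound work is timed
    Watch := TStopwatch.StartNew;
    ProcessData(Data);
    Watch.Stop;

    Writeln(Format('Processing took %d ms', [Watch.ElapsedMilliseconds]));
  finally
    Data.Free;
  end;
end.
```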
So what can cause my program to run slowly the first time I run it, yet dramatically faster when I close it and run it again? I've seen this in several different programs that do very different things, so it seems to be a general issue.
EDIT: For clarification, I'm writing in Delphi, though I don't really think this is a Delphi-specific issue. But that means that whatever the problem is, it's not related to JIT issues, garbage collection issues, or any of the other baggage that managed code brings with it. And I'm not dealing with network connections. This is pure CPU-bound processing.
One example: a script compiler. It runs like this:
- Load entire file into memory from disc
- Lex the entire file into a queue of tokens
- Parse the queue into a tree
- Run codegen on the tree to produce bytecode
If I feed it an enormous script file (~100k lines), then after loading the entire thing from disc into memory, the lex step takes about 15 seconds the first time I run it, and about 2 seconds on subsequent runs. (And yes, I know that's still a long time. I'm working on that...) I'd like to know where that slowdown is coming from and what I can do about it.
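For reference, those timings come from wrapping each stage like this. The `Lex`/`Parse`/`CodeGen` routines and the two types are simplified stand-ins for the real compiler code; the stopwatch scaffolding around them is the point:

```pascal
program TimeStages;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils, System.Diagnostics;

type
  // Stand-ins for the real compiler's types
  TTokenQueue = TArray<string>;
  TSyntaxTree = TObject;

// Stand-ins for the real lex / parse / codegen routines
function Lex(const Source: string): TTokenQueue;
begin
  Result := nil;
end;

function Parse(const Tokens: TTokenQueue): TSyntaxTree;
begin
  Result := nil;
end;

procedure CodeGen(const Tree: TSyntaxTree);
begin
end;

var
  Source: string;
  Tokens: TTokenQueue;
  Tree: TSyntaxTree;
  Watch: TStopwatch;
begin
  // Load the entire file into memory first (not timed)
  Source := TFile.ReadAllText(ParamStr(1));

  // Lex -- the step that takes ~15 s on the first run, ~2 s afterwards
  Watch := TStopwatch.StartNew;
  Tokens := Lex(Source);
  Writeln(Format('Lex:     %d ms', [Watch.ElapsedMilliseconds]));

  // Parse the token queue into a tree
  Watch := TStopwatch.StartNew;
  Tree := Parse(Tokens);
  Writeln(Format('Parse:   %d ms', [Watch.ElapsedMilliseconds]));

  // Codegen on the tree to produce bytecode
  Watch := TStopwatch.StartNew;
  CodeGen(Tree);
  Writeln(Format('Codegen: %d ms', [Watch.ElapsedMilliseconds]));
end.
```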