I use Java in this question, but this really applies to all modern app development. Our "environment pipeline", like many of them, looks like this:
- Developer sandbox
- Continuous integration & testing
- QA/Staging
- Production
The hardware (available RAM and CPU) differs in each of these environments: my laptop is a 2 GB dual-core Windows machine, testing runs on a 4 GB machine, and production is two load-balanced 8 GB quad-core servers.
Obviously the same code will perform differently when it runs on these different machines (environments).
I was thinking about writing automated performance tests for some of my classes that would be of the form:
import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class SomethingPerfTest {
    // Upper bound, in milliseconds, before the test is considered a failure
    private static final long MAX_TIME = 8000;

    @Test
    public final void perfTestSomething() {
        long start = System.currentTimeMillis();
        // Run the code under test
        long end = System.currentTimeMillis();
        assertTrue("Took longer than " + MAX_TIME + " ms", (end - start) < MAX_TIME);
    }
}
Thus the automated performance test fails if the test takes more than, say, 8 seconds to run.
But then this realization dawned on me: the code will run differently in different environments, and differently again depending on the state of the JVM and the garbage collector. I could run the same test 1,000 times on my own machine and get wildly different results.
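To make that variance concrete, here is a rough, self-contained sketch (doSomething() is just a stand-in workload, not code from my project) that times the same work 1,000 times in a single JVM and prints the spread; warm-up, JIT compilation, and GC activity all show up in the min/median/max:

import java.util.Arrays;

public class TimingSpread {

    // Stand-in workload; replace with the code you actually want to measure
    private static long doSomething() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        final int runs = 1000;
        long[] nanos = new long[runs];
        long sink = 0; // keeps the JIT from optimizing the workload away
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink += doSomething();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        System.out.printf("min=%d us, median=%d us, max=%d us (sink=%d)%n",
                nanos[0] / 1_000, nanos[runs / 2] / 1_000, nanos[runs - 1] / 1_000, sink);
    }
}

Even within one JVM on one machine, the spread between min and max can be large, which is exactly what makes the single-shot assertion above feel unreliable.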
So I ask: how does one accurately and reliably define and gauge automated performance tests as code is promoted from one environment to the next?
Thanks in advance!