EDIT: a fairly large edit to improve the wording and add more detail throughout.
A few thoughts:
HotSpot kicks in when a piece of code is executed significantly more often than other pieces (it is the hot spot of the program). Once compiled, that piece of code runs significantly faster (on the normal path) from that point forward. How often it is called after the HotSpot compilation does not matter, so I don't think this is causing the effect you describe.
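One way to check whether HotSpot is behind the change is to time identical batches of calls. If JIT compilation is the cause, I would expect the first batch or two to be slow and the later ones to settle at a steady time, however far apart the batches are run. A minimal sketch, where `WarmupDemo`, `work` and the batch sizes are made up for illustration:

```java
// Hypothetical demo class; the names and batch sizes are illustrative assumptions.
class WarmupDemo {
    // Results are accumulated here so the JIT cannot discard the work as dead code.
    static long sink;

    // Some CPU-bound work that is cheap enough to call many times.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (i * 31L) ^ (sum >>> 3);
        }
        return sum;
    }

    // Runs `batches` batches of `callsPerBatch` calls and returns the
    // elapsed nanoseconds of each batch, in order.
    static long[] timeBatches(int batches, int callsPerBatch) {
        long[] elapsed = new long[batches];
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerBatch; c++) {
                sink += work(1_000);
            }
            elapsed[b] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = timeBatches(10, 20_000);
        for (int b = 0; b < elapsed.length; b++) {
            System.out.printf("batch %d: %.2f ms%n", b, elapsed[b] / 1e6);
        }
    }
}
```

Running it with the HotSpot flag `-XX:+PrintCompilation` should show roughly when `work` gets compiled, which you can line up against the batch timings.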
Is the effect real? It's very easy to trick yourself with statistics. I'm not saying you are, but be sure that all your runs are included in the result, and that all other factors (such as other programs, system activity, and your monitoring program) are the same in all cases. More than once I have had my monitoring program, such as top, cause a difference in behaviour. On one occasion, the performance of the application went up appreciably once the caches on the database had warmed up; there was memory pressure from other applications on the same DB instance.
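To make the "include all your runs" point concrete, here is a small sketch that summarises every sample rather than only the good ones. The class name `RunStats` and the latencies are invented for the example; a single slow first run moves the mean a long way while the median barely notices, which is exactly the kind of thing that is easy to hide by accident.

```java
import java.util.Arrays;

// Hypothetical helper; the sample data below is made up.
class RunStats {
    // Arithmetic mean of all samples (expects at least one sample).
    static double mean(double[] samples) {
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // Median of all samples, computed on a sorted copy so the input is untouched.
    static double median(double[] samples) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        // Latencies in ms; the first run is a slow outlier that must not be dropped.
        double[] runs = {250.0, 12.0, 11.0, 13.0, 12.0};
        System.out.printf("n=%d mean=%.1f median=%.1f%n",
                runs.length, mean(runs), median(runs));
        // prints: n=5 mean=59.6 median=12.0
    }
}
```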
The Operating System and/or CPU may well be involved. The OS and CPU both actively and passively do things to improve the responsiveness of the main program as it moves from being mainly running to being mainly waiting for I/O and vice versa, including:
- OS paging memory to disk while it's not being used, and back to RAM when the program is running
- OS will cache frequently used disk blocks, which again may improve the application performance
- CPU instruction and memory caches fill with the active program's instructions and data
Java applications are particularly sensitive to memory paging effects because:
- A typical Java application server will pre-allocate almost all free memory to Java. The large memory makes the application inherently more sensitive to memory effects
- The generational garbage collector used to manage Java memory ends up creating new objects across a lot of pages, so each request to the application touches more pages than it would in other languages. (This is true principally for 'new' objects that have not been through many garbage collections. Objects promoted to the tenured (old) generation are actually stored very compactly.)
- As most available physical memory is allocated on the system, there is always pressure on memory, and the largest, least recently run application is a perfect candidate to be paged out.
With these considerations, page misses (and therefore a performance hit) are much more likely than in environments with smaller memory requirements. They will be particularly noticeable after Java has been idle for some time.
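A rough way to look for this from inside the JVM is to walk a large array, reading about one byte per page, then let the process sit idle and walk it again. This is only a sketch under assumptions: `PageTouch` is a made-up name, the 4096-byte page size and 60-second idle time are guesses, and you will only see a difference if the OS actually paged the heap out while the program slept.

```java
// Hypothetical probe; page size and idle time are assumptions, not measured values.
class PageTouch {
    // Assumed page size; 4096 bytes is common but not universal.
    static final int PAGE_SIZE = 4096;

    // Written after each walk so the JIT cannot remove the reads.
    static volatile long sink;

    // Reads one byte every `pageSize` bytes, so roughly every page backing
    // the array has to be resident. Returns the number of reads performed.
    static int touchPages(byte[] data, int pageSize) {
        int pages = 0;
        long checksum = 0;
        for (int i = 0; i < data.length; i += pageSize) {
            checksum += data[i];
            pages++;
        }
        sink = checksum;
        return pages;
    }

    // Times one full walk over the array, in nanoseconds.
    static long timeWalkNanos(byte[] data) {
        long start = System.nanoTime();
        touchPages(data, PAGE_SIZE);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        // Idle time in seconds; the default of 60 is an arbitrary assumption.
        long idleSeconds = args.length > 0 ? Long.parseLong(args[0]) : 60;
        byte[] data = new byte[64 * 1024 * 1024]; // 64 MB, zero-filled by the JVM

        System.out.printf("first walk:      %d us%n", timeWalkNanos(data) / 1_000);
        System.out.printf("second walk:     %d us%n", timeWalkNanos(data) / 1_000);
        Thread.sleep(idleSeconds * 1_000);
        System.out.printf("walk after idle: %d us%n", timeWalkNanos(data) / 1_000);
    }
}
```

If the walk after the idle period is much slower than the second walk, and a walk straight after that is fast again, paging is a plausible suspect. Confirm it with the OS tools rather than trusting this alone.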
If you use Solaris or Mac, the excellent DTrace can trace memory and disk paging specific to an application. The JVM has numerous DTrace hooks (probes) that can be used as triggers to start and stop page monitoring.
On Solaris, you can use large memory pages (even over 1GB in size) and pin them to RAM so they will never be paged out. This should eliminate the memory page problem stated above. Remember to leave a good chunk of free memory for disk caching and for other system/maintenance/backup/management apps. I am sure that other OSes support similar features.
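If you go down this route, it is worth confirming that the JVM was really started with the options you think it was. The sketch below lists the JVM's startup arguments through the standard `RuntimeMXBean` and looks for `-XX:+UseLargePages`, the HotSpot option that requests large pages. The class name `LargePageCheck` is made up, and note the assumption: the flag only asks for large pages; whether they are granted and pinned depends on the OS and its configuration.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical check; it only inspects startup flags, not what the OS granted.
class LargePageCheck {
    // True if the exact flag string appears among the JVM arguments.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to main), e.g. -Xmx or -XX options.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("large pages requested: "
                + hasFlag(jvmArgs, "-XX:+UseLargePages"));
    }
}
```

For example, `java -XX:+UseLargePages LargePageCheck.java` should report `true` on a HotSpot JVM, even if the OS ends up falling back to ordinary pages.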
TL;DR: On modern operating systems, the currently running program will appear to run faster after a few seconds as the OS brings the program and data pages back from disk and places frequently used disk blocks in the disk cache, and as the CPU instruction and data caches get "warmer" for the main program. This effect is not unique to the JVM, but it is more visible there because of the memory requirements of typical Java applications and the garbage collection memory model.