I used function mmax
to calculate moving max of a 10-million-length integer vector. I ran it 10 times to calculate the total execution time. The running time for window size 132 (15,025 milliseconds) is 6 times longer than for window size 22 (2,425 milliseconds). It seems the complexity of mmax
is O(nw) rather than O(n), where w is the length of the sliding window).
To check if this is true for other similar products, I tried the same experiment on DolphinDB, a time series database with built-in analytics features (https://www.dolphindb.com/downloads.html ). In contrast, DolphinDB’s mmax
has linear complexity O(n), regardless of the window size: 1,277 milliseconds (window size 132) and 1,233 milliseconds (window size 22).
The hardware being used for this experiment:
Server: Dell PowerEdge R630
Architecure: x86_64
CPU Model Name: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Total logical CPU cores: 48
Total memory: 256G
Experiment setup
I used KDB+ 4.0 64 bit version and DolphinDB_Linux_V2.00.7(DolphinDB community version: 2 cores and 8GB memory). Both experiments are conducted using 2 cores of CPU.
KDB implementation
// Start the server
rlwrap -r taskset -c 0,1 ./l64/q -p 5002 -s 2
// The code
a:10000000?10000i
\t do[10; 22 mmax a]
2425
\t do[10; 132 mmax a]
15025
DolphinDB implementation
// Start the server
rlwrap -r ./dolphindb -localSite localhost:5002:local5002 -localExecutors 1
// The code
a=rand(10000,10000000)
timer(10) mmax(a,22);
1232.83 ms
timer(10) mmax(a,132);
1276.53 ms
Can any kdb expert confirm the complexity of function mmax? If the built-in function mmax does have the complexity of O(nw), any third-party plugin for kdb to improve the performance?