
Given a vector of numbers, e.g.

std::vector<double> v{6.4, 5.2, 5.7, 1.8, 8.4, 8.3, 8.0, 9.1, 4.7, 7.6};

I'd like to compute a second vector containing the maximum of each window of a given size (a rolling maximum). The naive method is, of course:

std::vector<double> maximums;
unsigned int window = 3;
for (unsigned int i = window - 1; i < v.size(); i++)
{
    double max = 0; // assume positive domain only
    for (unsigned int j = 0; j < window; j++)
    {
        max = std::max(max, v[i-j]);
    }
    maximums.push_back(max);
}

I coded the above without testing; please forgive typos and basic errors—I'm just trying to communicate the gist of the naive method.

Is there a more efficient method?

At first I thought, "Sure, just keep a sorted container, hold on to the iterator returned by each insertion, and remove() the element that falls out of the window each time you push() a new value." But that would involve work for each insertion (which I believe is O(log n) for a tree-based container), and perhaps work for the removal, which I'm not sure about. Would that actually be more efficient than the naive method? (Certainly not more readable. I admit this would all be premature optimization and a folly to implement in my case, but I'm just curious. Imagine an array of millions of values and a rolling maximum over 1,000 of them at a time; wouldn't a more efficient algorithm be desirable in that case?)
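
For reference, here is a rough sketch of the monotonic-deque ("sliding-window maximum") technique that the comments below refer to as the deque-based approach. The function name rollingMax and the overall structure are illustrative, not taken from any linked answer:

#include <cstddef>
#include <deque>
#include <iostream>
#include <vector>

// Sliding-window maximum with a monotonic deque. The deque stores indices
// whose values are in decreasing order, so the front is always the maximum
// of the current window. Every index is pushed and popped at most once,
// which makes the whole pass O(n) regardless of the window size.
std::vector<double> rollingMax(const std::vector<double>& v, std::size_t window)
{
    std::vector<double> maximums;
    std::deque<std::size_t> candidates; // indices into v, values decreasing

    for (std::size_t i = 0; i < v.size(); ++i)
    {
        // Drop the index that just slid out of the window, if any.
        if (!candidates.empty() && candidates.front() + window <= i)
            candidates.pop_front();

        // Values no larger than v[i] can never be a window maximum again.
        while (!candidates.empty() && v[candidates.back()] <= v[i])
            candidates.pop_back();

        candidates.push_back(i);

        if (i + 1 >= window)
            maximums.push_back(v[candidates.front()]);
    }
    return maximums;
}

int main()
{
    std::vector<double> v{6.4, 5.2, 5.7, 1.8, 8.4, 8.3, 8.0, 9.1, 4.7, 7.6};
    for (double m : rollingMax(v, 3))
        std::cout << m << ' ';   // prints 6.4 5.7 8.4 8.4 8.4 9.1 9.1 9.1
    std::cout << '\n';
}

Each element enters and leaves candidates at most once, so the amortized cost per new value is O(1), although a single step can occasionally perform up to window pops.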

Andrew Cheong
  • I prefer non-boost solutions, as well as maximal use of `std` (can `std::priority_queue` be leveraged, somehow?), but all solutions are welcome, since they may help future readers. – Andrew Cheong Sep 13 '14 at 05:00
  • @Brian: I take it you don't believe the problem can be solved in less than O(n)? – jxh Sep 13 '14 at 05:04
  • @jxh This problem clearly takes Ω(N) time to solve. You must examine every element in the input in order to get the correct result. – Brian Bi Sep 13 '14 at 05:09
  • @Brian: Not if the data structure maintains largest (at the cost of O(log n) insertions). – jxh Sep 13 '14 at 05:10
  • @jxh Sorry but I don't understand you. That gives you an O(n log n) algorithm since it's O(log n) at each step. The deque-based approach in the duplicate takes O(n) time in total. It's clearly impossible to beat that. – Brian Bi Sep 13 '14 at 05:18
  • @Brian: It depends on the use case. If the values come in sporadically, but the max / min is queried with a high frequency, then O(log n) insertion with O(1) query beats an O(n) query on a data structure with O(1) insertion. – jxh Sep 13 '14 at 05:23
  • The eventual application is real-time processing, where there can be anywhere between 10 and 1,000 values per second (not that machines can't handle this easily even with the naive algorithm), so if that's what you mean by sporadic, yes, it's sporadic. But of course existing values will never be modified nor inserted amongst, so if that's what you meant, no, it's not sporadic. – Andrew Cheong Sep 13 '14 at 05:26
  • Will queries for the max come in more often than 10 times a second? – jxh Sep 13 '14 at 05:31
  • I just need to query the latest maximum each time a new value comes in. Never more than once. – Andrew Cheong Sep 13 '14 at 05:32
  • Yeah, I think some kind of priority queue will have better big-O (O(log n) per query), but a linear search is lightning fast, so it is O(n) with a very small constant. You will need to profile the approaches to see which one wins out for you. – jxh Sep 13 '14 at 05:34
  • @jxh: The algorithm in the answer to the duplicate question is `O(1)` for the CurrentMax operation (or min, as written), and amortized `O(1)` for `insertion`. And the insertion inner loop has a smaller constant than a priority queue. So I don't see any reason to not use it. – rici Sep 13 '14 at 19:40
  • @rici: It assumes it is okay to throw away values, which may not be true for an event recorder, for example. Amortized O(1) may not be acceptable for an application that requires predictable latency (like a real time application). – jxh Sep 13 '14 at 20:49
  • @jxh: it's not necessary to throw away values. You can instead just remember the position of the current minimum. The insertion cost is O(T) in the worst case (with a very small constant: <1 with modern hardware because you can do several comparisons in parallel with SIMD), so only with very large T and only in very rare cases will it be worse than the priority queue. Even then, you can do it a little at a time if you need to, only finishing it off when you are asked for the minimum. (Note that it can only be O(T) every T'th time. Frequent calls don't incur high costs.) – rici Sep 13 '14 at 20:58
  • If you don't throw away the values, insertion becomes O(T) every time, on average, since assuming random input, the expected position to insert is T/2. You are arguing that the computation is more or less bounded by T, but that still doesn't mean a bounded log(T) implementation wouldn't be better in certain circumstances. I already indicated in a comment to the OP that profiling is needed to determine what is acceptable. I think with T of 1000, O(T) is probably okay. – jxh Sep 14 '14 at 00:21
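
For reference, here is a rough sketch of the ordered-container alternative discussed in the comments above; the name rollingMaxMultiset and the details are illustrative. std::priority_queue cannot erase an arbitrary element, so std::multiset is used instead: one insert and one erase per incoming value, each O(log window), with the current maximum read from the largest element:

#include <cstddef>
#include <set>
#include <vector>

// Rolling maximum with an ordered multiset. Each step inserts the new value
// and erases exactly one copy of the value that just left the window, so
// every step costs O(log window); the current maximum is the largest element.
std::vector<double> rollingMaxMultiset(const std::vector<double>& v, std::size_t window)
{
    std::vector<double> maximums;
    std::multiset<double> inWindow;

    for (std::size_t i = 0; i < v.size(); ++i)
    {
        inWindow.insert(v[i]);
        if (inWindow.size() > window)
            inWindow.erase(inWindow.find(v[i - window])); // erase one copy, not all
        if (inWindow.size() == window)
            maximums.push_back(*inWindow.rbegin());
    }
    return maximums;
}

A priority queue can still be made to work via lazy deletion (leave stale entries in the heap and discard them when they reach the top), but for a fixed-size window the multiset bookkeeping is simpler.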

0 Answers