
I want to calculate the maximum number between two indexes in an array in an efficient way. I will be given a very large number of queries; in each query I will be given indexes l and r, and I need to find the maximum number between these two indexes.

When I tried to solve that problem, my solution had a time complexity of O((r-l)*q), where q is the number of queries:

#include <bits/stdc++.h>
using namespace std;

int main()
{
    int number = 1e6;
    vector<int> v(number);
    // the user should enter the elements of the vector
    for (auto &i : v) cin >> i;
    int number_of_queries = 1e6;
    for (int i = 0; i < number_of_queries; ++i)
    {
        int left_index, right_index, max_number = -1;
        // the user should enter the left and the right index (zero-based)
        cin >> left_index >> right_index;
        // searching between the indexes for the maximum number between them
        for (int j = left_index; j <= right_index; ++j) max_number = max(max_number, v[j]);
        // printing the number
        cout << max_number << "\n";
    }
}

That's what I came up with.

I want to know if there is a way to decrease the time complexity, maybe by doing some preprocessing on the array before answering the queries.

Note that the array contains only positive numbers, and the same number can appear more than once.

Thank you in advance.

  • Not quite clear. If you can do anything before processing the queries, then you can set up a 2-D lookup table containing the maximum element for every start and end position. Then for each query you merely have to look up the result. – 463035818_is_not_an_ai Aug 19 '22 at 14:36
  • Side note: The combination of `#include <bits/stdc++.h>` and `using namespace std;` is often an unintended boot to the head. Including the entire C++ Standard Library and then putting the thousands of identifiers in it into potential conflict with the global namespace turns your code into a minefield. – user4581301 Aug 19 '22 at 14:37
  • Using `<bits/stdc++.h>` _at all_ is a bad idea, encouraged by bad examples and worse learning resources. Don't do it. It's not portable, or standard, or really at all helpful unless saving characters for code golf. – Useless Aug 19 '22 at 14:40
  • Why do you think it can be done with better complexity? – 463035818_is_not_an_ai Aug 19 '22 at 14:41
  • Even if you divide the input list into chunks, you can save a lot of time by precomputing the max in each chunk, so that you're comparing chunks more often than individual values. The kicker is not computing everything every time. If the user asks for range [m,n] and then asks for [m,n+1], why recompute everything? – user4581301 Aug 19 '22 at 14:41
  • [Why should I not #include <bits/stdc++.h>?](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h) – Jesper Juhl Aug 19 '22 at 14:44
  • Pre-computing lots of chunks sounds expensive, but you could definitely memoize the query ranges (or fixed-size subranges) as you go, for effectively zero overhead and a saving whenever they overlap. – Useless Aug 19 '22 at 14:45
  • Thank you all for your help. I will not use `#include <bits/stdc++.h>` again. – Ahmed Mohammed Aug 19 '22 at 14:45
  • Oh, and consider using [`std::max_element`](https://en.cppreference.com/w/cpp/algorithm/max_element) instead of writing that loop yourself. It's already there, you might as well (see the short sketch after these comments). – Useless Aug 19 '22 at 14:46
  • I think there might be some kind of preprocessing that I can apply to the array that could improve the complexity, but I am not sure if that's possible. – Ahmed Mohammed Aug 19 '22 at 14:48
  • @AhmedMohammed -- The way to decrease the time complexity is to identify what the problem is behind the scenes. This is a [range minimum query](https://en.wikipedia.org/wiki/Range_minimum_query) problem, and there are many C++ solutions to it. – PaulMcKenzie Aug 19 '22 at 15:25
  • @AhmedMohammed -- And note -- the questions at these competitive coding sites will *never* mention the proper algorithm, data structure, etc. to use to solve the problem. Instead they mask all of this information by creating cute stories about Joe, Jim, Jane, etc. baking cookies or something like that. It is then your job to figure out what they are trying to hide from you, and then implement it. As you can see, range minimum/maximum query is what you're looking for (I believe). – PaulMcKenzie Aug 19 '22 at 15:32
  • [Me like baking cookie, but it take soooooo looooong!](https://www.youtube.com/watch?v=_OKGUAbpj5k) – user4581301 Aug 19 '22 at 15:56
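
For reference, the loop in the question could be replaced by the `std::max_element` call suggested above; a minimal sketch using the question's variable names:

    // maximum of v[left_index..right_index] via the standard algorithm (needs <algorithm>)
    int max_number = *std::max_element(v.begin() + left_index,
                                       v.begin() + right_index + 1);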

2 Answers


If the array being searched does not change, it can be efficient to create a look-up table giving the maximal element for each range. Since the table is 2-D, it takes O(n²) time and memory to populate, but keeping a running maximum per starting index fills each entry in O(1):

vector<vector<int>> lookUp(v.size(), vector<int>(v.size()));
for (size_t i = 0; i < v.size(); ++i) {
   int max = v[i];
   for (size_t j = i; j < v.size(); ++j) {
      if (max < v[j])
         max = v[j];
      // lookUp[i][j] now holds the maximum of v[i..j]
      lookUp[i][j] = max;
   }
}
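
Each query is then a single O(1) table access (a short sketch using the question's variable names):

    cout << lookUp[left_index][right_index] << "\n";

Note, though, that for the 10^6-element array in the question a full 2-D table would need on the order of 10^12 entries, so this only fits much smaller arrays.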
absolutezero

When I tried to solve that problem, my solution had a time complexity of O((r-l)*q), where q is the number of queries.

The only real way to reduce this is if the queries overlap. Then we need a way to store intermediate results (the max element of a given sub-range) so we can reuse them.

The general approach to reusing intermediate results is called dynamic programming, and the specific technique is to memoize a naive function, caching its result for later reuse.

Note that pre-computing maxima for fixed-size partitions adds a fixed (and possibly very large) overhead, but offers no guarantee about the hit rate - most of that effort could be wasted if we get a billion queries for the same small range.

In order to memoize results in this case, we need to chunk them. Only whole chunks can be easily reused.

So, your solution is:

  1. choose a partition size - say we notionally divide our million elements into 100-element chunks

    • a small power of 2 like 16, 32 or 64 is probably optimal in practice, but I'm using round numbers for the example
  2. write the naive max_in_chunk function (which, as mentioned in the comments, is really just a call to std::max_element; see the sketch after this list)

    • now that this is a fixed-size chunk, there's a chance the compiler can vectorize it, which is always nice, but it's a secondary consideration to the memoization below
  3. memoize that function with a wrapper, say memo_max_in_chunk, like

    // index must be a multiple of 100 or whatever chunk size we selected
    // (needs <unordered_map>; max_in_chunk is the naive function from step 2)
    unsigned memo_max_in_chunk(size_t index)
    {
      // cache of chunk maxima computed so far, kept across calls
      static std::unordered_map<size_t, unsigned> memo;
    
      auto lookup = memo.insert({index, 0});
      auto &value = lookup.first->second; // value stored in map
      if (lookup.second)
      {
        // insertion succeeded => we don't have an existing result
        value = max_in_chunk(index);
      }
      return value;
    }
    

    It's a little hairy, so read the documentation until you understand it.

  4. For your top-level loop, you now need to split the range [left_index, right_index] into three sections:

    1. the prefix, from left_index to the lowest multiple of 100 between left_index and right_index
    2. the suffix, from the highest multiple of 100 between left_index and right_index, to right_index
    3. a series of 100-element chunks in between them

    Now you can calculate the prefix_max and suffix_max naively, and call memo_max_in_chunk for each of the chunks between. Note that any of the prefix, suffix and chunk sections may be empty.

  5. Finally, just take the max of prefix_max, suffix_max and all the chunk maxima, as in the sketch below.
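
To make the splitting concrete, here is a minimal sketch of steps 2, 4 and 5 under the assumptions above: chunk size 100, a global `v` filled from input as in the question, and `v.size()` a multiple of the chunk size. `range_max` is a hypothetical name for the top-level per-query function, not something from the question.

    #include <algorithm>
    #include <cstddef>
    #include <unordered_map>
    #include <vector>

    constexpr std::size_t chunk_size = 100;
    std::vector<unsigned> v; // filled from input, as in the question

    // step 2: naive max over one whole chunk starting at index
    // (assumes v.size() is a multiple of chunk_size)
    unsigned max_in_chunk(std::size_t index)
    {
        return *std::max_element(v.begin() + index,
                                 v.begin() + index + chunk_size);
    }

    // step 3: the memoized wrapper from the snippet above
    unsigned memo_max_in_chunk(std::size_t index)
    {
        static std::unordered_map<std::size_t, unsigned> memo;
        auto lookup = memo.insert({index, 0});
        auto &value = lookup.first->second;
        if (lookup.second)
            value = max_in_chunk(index); // first time: compute and cache
        return value;
    }

    // steps 4 and 5: split [left, right] into prefix + whole chunks + suffix
    unsigned range_max(std::size_t left, std::size_t right)
    {
        // first chunk boundary at or after left...
        std::size_t first_chunk = (left + chunk_size - 1) / chunk_size * chunk_size;
        // ...and last chunk boundary at or before right + 1
        std::size_t last_chunk = (right + 1) / chunk_size * chunk_size;

        // no whole chunk inside the range: fall back to the naive scan
        if (first_chunk >= last_chunk)
            return *std::max_element(v.begin() + left, v.begin() + right + 1);

        unsigned result = 0;
        // prefix [left, first_chunk), computed naively
        if (left < first_chunk)
            result = *std::max_element(v.begin() + left, v.begin() + first_chunk);
        // whole chunks [first_chunk, last_chunk), served from the memo
        for (std::size_t c = first_chunk; c < last_chunk; c += chunk_size)
            result = std::max(result, memo_max_in_chunk(c));
        // suffix [last_chunk, right], computed naively
        if (last_chunk <= right)
            result = std::max(result, *std::max_element(v.begin() + last_chunk,
                                                        v.begin() + right + 1));
        return result;
    }

Calling range_max(left_index, right_index) once per query then replaces the inner loop of the original program.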

Note that there are various optimizations that may improve the actual speed of this (like the vectorization mentioned, or, with some effort, using async evaluation of chunk maxima), but none of them change the complexity.

Even memoization won't improve the complexity if the queries are all smaller than our selected chunk size or never overlap, and there's not a lot we can do about that.

Useless