10

Find the sum of maximum difference possible from contiguous subset of a given array.

We are given an array arr[] of n non-negative integers (repeated elements allowed), find out the sum of maximum difference possible from contiguous subsets of the given array.

Suppose max(s) represents the maximum value in any subset ‘s’ whereas min(s) represents the minimum value in the set ‘s’. We need to find the sum of max(s)-min(s) for all possible subsets.

Input : arr[] = {1, 2, 3}
Output : result = 4

Explanation :

All possible subset and for each subset s,
max(s)-min(s) are as :
SUBSET    |  max(s) | min(s) | max(s)-min(s)
{1, 2}    |  2      |  1     |   1
{2, 3}    |  3      |  2     |   1
{1, 2, 3} |  3      |  1     |   2
Total Difference sum = 4
Note : max(s) - min(s) for all subset with 
single element must be zero.

Constraints:

Array size can be from 1 to 10 power 5, also each element in array can be from 1 to 10 power 5.

This is the code taken from here, but this code checks all possible subsets instead of contiguous subsets:

public static int MOD = 1000000007;
      
    // function for sum of max min difference 
    public static long maxMin (int arr[], int n) 
    {
        // sort all numbers
        Arrays.sort(arr);
          
        // iterate over array and with help of 
        // horner's rule calc max_sum and min_sum
        long min_sum = 0, max_sum = 0;
        for (int i = 0; i < n; i++)
        {
            max_sum = 2 * max_sum + arr[n - 1 - i];
            max_sum %= MOD;
            min_sum = 2 * min_sum + arr[i];
            min_sum %= MOD;
        }
      
        return (max_sum - min_sum + MOD)%MOD;
    }

So how to get only contiguous subsets and solve this with less time complexity.

RBarryYoung
  • 55,398
  • 14
  • 96
  • 137
learner
  • 6,062
  • 14
  • 79
  • 139
  • What does the "contiguous" bit actually mean? If you just pick "the whole array" as a contiguous subset of the array, that contains a min and a max, and the sum of those is the answer? – Andy Turner Dec 09 '21 at 17:36
  • @AndyTurner, here contiguous means elements which are next to each other. For example in array [1,2,3] the subset [1,3] is not valid in my case as they are not adjacent. The valid subsets are [1,2],[2,3],[1,2,3] – learner Dec 09 '21 at 18:47
  • right. So, the entire array is a contiguous (non-strict) subset of the entire array, and its max and min elements have the maximum difference. – Andy Turner Dec 09 '21 at 18:51
  • I need sum of all max min difference as mentioned in my post. As per your comment I understand that for array [1,2.3] max is 3, min is 1 so max - min = 3-1=2. But in my case it should be 4 as mentioned in my post. – learner Dec 09 '21 at 18:54
  • Out of curiosity, is there a particular real world application of this problem? Or is this an academic exercise? – eliangius Dec 15 '21 at 15:12
  • what is the time complexity you are looking for? – user1984 Dec 15 '21 at 18:45

4 Answers4

13

You can do this in O(n) time and space.

The technique is to use the algorithm for all nearest smaller values. First, break the problem into two parts:

  1. Find the sum of all subarray maximums
  2. Find the sum of all subarray minimums, and subtract this from the first sum.

The solution for both problems is identical apart from exchanging all occurrences of 'less than' with 'greater than', so I'll describe the minimums case only.

For each element A[i] of the array, you can ask: 'How many subarrays have A[i] as their minimum element?' To deal with duplicates, assume we always take the rightmost occurrence of a minimum element within a subarray as the 'representative' element.

The question transforms to finding how far to the left of A[i] we can go before seeing an element strictly smaller than A[i], and how far to the right of A[i] we can go before seeing an element as small as A[i]. Multiply these two distances to get all possible choices of left and right endpoints among subarrays that have A[i] as their minimum element. We can find both of these directly with the 'all nearest smaller values' algorithm, and solve the rest of the problem like so (pseudocode):

 1. For each position i in the array A, let previous_smaller[i]
    be the largest index j such that A[j] < A[i] and 0 <= j < i,
    or -1 if there is no such index.

 2. For each position i in the array A, let next_smaller_or_equal[i]
    be the smallest index j such that A[j] <= A[i] and i < j < n,
    or n if there is no such index.

 3. Return the sum over all i, 0 <= i < n, of 
    (A[i] * 
    (next_smaller_or_equal[i] - i) * 
    (i - previous_smaller[i]))

There are several implementations of all nearest smaller values in the answers to this question, for example, and pseudocode in the Wikipedia article. To find 'next smaller values' instead of 'previous smaller values', simply run the algorithm on a reversed array A (or just traverse A in reverse order, from A[n-1] down to A[0]).

Sample implementation of the whole algorithm (in Python):

def max_difference_sum(A: List[int]) -> int:
    """Given an array of integers A, compute the 
    sum over all subarrays B of max(B) - min(B)
    by using nearest smaller values"""
    
    n = len(A)

    # Convention to take the rightmost min or max in each subarray.
    previous_smaller = list(range(n))
    next_smaller_or_equal = list(range(n))

    previous_larger = list(range(n))
    next_larger_or_equal = list(range(n))

    # Compute the previous larger and smaller in a single loop.
    for i in range(n):
        j = i - 1
        while j >= 0 and A[j] >= A[i]:
            j = previous_smaller[j]
        previous_smaller[i] = j

        j = i - 1
        while j >= 0 and A[j] <= A[i]:
            j = previous_larger[j]
        previous_larger[i] = j

    for i in reversed(range(n)):
        j = i + 1
        while j < n and A[j] > A[i]:
            j = next_smaller_or_equal[j]
        next_smaller_or_equal[i] = j

        j = i + 1
        while j < n and A[j] < A[i]:
            j = next_larger_or_equal[j]
        next_larger_or_equal[i] = j

    max_sums = sum(A[i]
                   * (next_larger_or_equal[i] - i)
                   * (i - previous_larger[i])
                   for i in range(n))

    min_sums = sum(A[i]
                   * (next_smaller_or_equal[i] - i)
                   * (i - previous_smaller[i])
                   for i in range(n))
    
    return max_sums - min_sums
kcsquared
  • 5,244
  • 1
  • 11
  • 36
  • Hi @kcsquared, could you, please, what values should be used at the step 3 in case if the index is out of bounds (-1 or n)? – Andriy Slobodyanyk Dec 21 '21 at 11:03
  • @AndriySlobodyanyk You would use -1 or n in those respective cases. I've added some Python code for the whole algorithm to hopefully make it clearer. – kcsquared Dec 21 '21 at 11:21
  • @kcsquared, this is really useful and very clearly explained. Question about initializing the arrays such as ```previous_smaller```: the initial values aren't used, so would it be more efficient (and/or clearer) to set to the value ```None``` -- e.g. ```previous_smaller = [None] * n```? – mherzog May 08 '22 at 13:43
  • 1
    @mherzog I think that would work here. Initializing to None might be ever so slightly faster. I don't know whether it would be clearer; that's a personal preference. There's arguments for using [None] * n, [-1] * n or list(range(n)), but using None has the advantage of making it an error to use uninitialized values. – kcsquared May 08 '22 at 21:52
2

Let's use the induction method.

  1. Assume we solved the problem somehow for the array of size N and know the desired sum.
  2. Let's find the solution if the element A[n+1] is added.
  3. We need to calculate the sums only for all sequences that include A[n+1].
    • A[0], A[1], A[2], ..., A[n+1]
    • A[1], A[2], ..., A[n+1]
    • ...
    • A[n], A[n+1]
    All other contiguous subsets were calculated at the previous step somehow.
  4. In order to calculate their minimums and maximums let's examine them in the reverse order.
    • A[n+1], A[n]
    • A[n+1], A[n], A[n-1]
    • ...
    • A[n+1], A[n], ..., A[0]
    This gives us an ability to iterate them and find their extremes in a single loop.
  5. So the code is
    int a[] = {1, 2, 3};
    
    long sum = 0;
    for (int i = 1; i < a.length; i++) {
        int min = a[i];
        int max = a[i];
    
        for (int j = i - 1; j >= 0; j--) {
            int current = a[j];
            if (current < min) min = current;
            if (current > max) max = current;
            sum += max - min;
        }
     }
     
     System.out.println("Sum = " + sum);
    
  6. The solution complexity is O(n^2) as there are two nested loops.
Andriy Slobodyanyk
  • 1,965
  • 14
  • 15
  • I got lost somehow on the path from the Induction method to the final code, somewhere between step 3 and 5. Nevertheless, the code works, so the reason must be my limited capabilities … – tquadrat Dec 21 '21 at 11:57
1

You can achieve this using stream:

public static int difference(int[] arr) {
    int size = arr.length;
    
    return IntStream.range(0, size)
            .flatMap(i -> IntStream.range(i + 1, size)
                    .mapToObj(j -> Arrays.stream(arr, i, j + 1).summaryStatistics())
                    .mapToInt(stat -> stat.getMax() - stat.getMin()))
            .sum();
}

Alternatively, as noticed by @kcsquared, you can use 2 stream, one for max sum and the other for min sum, and subtract them. This approach also avoids unnecessary boxing and unboxing.

public static int difference2(int[] arr) {
    int size = arr.length;
            
    int max = IntStream.range(0, size)
            .flatMap(i -> IntStream.range(i + 1, size)
                    .map(j -> Arrays.stream(arr, i, j + 1).max().getAsInt()))
            .sum();
    
    int min = IntStream.range(0, size)
            .flatMap(i -> IntStream.range(i + 1, size)
                    .map(j -> Arrays.stream(arr, i, j + 1).min().getAsInt()))
            .sum();
    return max - min;
}
Oboe
  • 2,643
  • 2
  • 9
  • 17
0

Since the proposed partial solution is already paying the sort cost, an initial time optimization could pre transform the input arr to a list of (i, arr[i]), then sort by the arr[i]values & in the for loop skip over sorted_tuple_arr with non consecutive i values.

eliangius
  • 326
  • 3
  • 10