Efficient way to iterate over sublists of a huge list in python

Question

So I need to find an efficient way to iterate over the big list in python.

Given: array of integers and number(length of a sublist)

Constraints: array up to 100K elements, elements in range(1,2**31)

Task: For every sublist find difference between max and min number. Print out the biggest difference.

Ex: [4,6,3,4,8,1,9], number = 3
As far as I understand I have to go through every sublist:

[4,6,3]  max - min = 6 - 3 = 3
[6,3,4]  3
[3,4,8]  5
[4,8,1]  7
[8,1,9]  8

final max = 8

So my solution is:

import time

def difference(arr, number): 
    maxDiff = 0
    i = 0
    while i+number != len(arr)+1:
        diff = max(arr[i:i+number]) - min(arr[i:i+number])

        if diff > maxDiff:
            maxDiff = diff

        i += 1

    print maxDiff


length = 2**31
arr = random.sample(xrange(length),100000)   #array wasn't given. My sample
t0 = time.clock()
difference(arr,3)
print 'It took :',time.clock() - t0

Answer:

2147101251
It took : 5.174262

I also did the same with for loops which gives worse time:

def difference(arr,d):
    maxDiff = 0
    if len(arr) == 0:
        maxDiff = 0
    elif len(arr) == 1:
        maxDiff = arr[0]
    else:
        i = 0
        while i + d != len(arr)+1:

            array = []
            for j in xrange(d):
                array.append(arr[i + j])

            diff = max(array) - min(array)

            if diff > maxDiff:
                maxDiff = diff

            i += 1
    print maxDiff

length = 2**31
arr = random.sample(xrange(length),100000)      #array wasn't given. My sample
t0 = time.clock()
difference(arr,1000)
print 'It took :',time.clock() - t0

Answer:

2147331163
It took : 14.104639

My challenge was to reduce time to 2 sec.

What would be the most efficient way to do this???

Based on answer and comment of @rchang and @gknicker I was able to get improvement. I'm wondering if there is something else I can do?

def difference(arr,d):
    window = arr[:d]
    arrayLength = len(arr)
    maxArrayDiff = max(arr) - min(arr)

    maxDiff = 0

    while d < arrayLength:

        localMax = max(window)
        if localMax > maxDiff:
            diff = localMax - min(window)

            if diff == maxArrayDiff:
                return diff
                break
            elif diff > maxDiff:
                maxDiff = diff

        window.pop(0)
        window.append(arr[d])

        d += 1

    return maxDiff


#arr = [3,4,6,15,7,2,14,8,1,6,1,2,3,10,1]
length = 2**31
arr = random.sample(xrange(length),100000)
t0 = time.clock()
print difference(arr,1000)
print 'It took :',time.clock() - t0

Answer:

2147274599
It took : 2.54171

Not bad. Any other suggestions?

Two shortcut optimization suggestions: 1) don't evaluate min() if max() <= maxDiff. 2) stop evaluating altogether if maxDiff has reached the threshold difference for length. — gknicker, Dec 13 '14 at 08:13
perhaps you should consider using `numpy` which is a module specialized to handle (large) numerical data. It is usually quite a bit faster that pure python. — PeterE, Dec 13 '14 at 12:48
That is an excellent point about `numpy` @Peter. I've not yet been able to get a `numpy` implementation to outperform the answer I posted - it's possible this body of data is not sufficiently large to reap the benefits. Or there could be something jacked up about how I'm doing this with `numpy`. I'll explore later today if I get free time. — rchang, Dec 13 '14 at 14:14

PeterE · Accepted Answer · 2014-12-14T15:15:23.960

Here is my attempt at solving this.

I have experimented and measured quite a bit and have come to the following conclusions:

The subset_length has significant influence on the performance.
numpy min/max is much faster than the build in functions, but only for large arrays, below, lets say 50, the buildins are faster.
This has the effect that for subset_length of
- below 10 your latest version is the fastest
- between 10 and 50 a version of my algorithm without numpy (not posted (yet)) is fastest
- above 50 my algorithm is the fastest
- at 1000 this algorithm outperforms yours by a factor of 100

be aware that array has to be a numpy.array() and subset_length must be 3 or more.

def difference_np(array, subset_length):
    assert subset_length > 2, "subset_length must be larger than 2"
    length = array.size
    total_diff = array.max()-array.min()

    current_min = array[:subset_length].min()
    current_max = array[:subset_length].max()
    max_diff = current_max - current_min
    max_diff_index = 0
    index = subset_length
    while index < length:
        i_new = index
        i_old = index-number
        index += 1     
        new = array[i_new]            
        old = array[i_old]

        # the idea here is to avoid calculating the
        #   min/max over the entire subset as much as possible,
        #   so we treat every edge case separately.
        if new < current_min:
            current_min = new
            if old == current_max:
                current_max = array[i_old+1:i_new-1].max()
        elif new > current_max:
            current_max = new
            if old == current_min:
                current_min = array[i_old+1:i_new-1].min()
        elif old == current_min:
            current_min = array[i_old+1:i_new].min()
        elif old == current_max:
            current_max = array[i_old+1:i_new].max()
        else:
            continue

        current_diff = current_max-current_min
        if current_diff > max_diff:
            max_diff = current_diff
            max_diff_index = i_old

        # shortcut-condition
        if max_diff == total_diff:
            print('shortcut at', (index-1)/(length-subset_length), '%' )
            break

    return max_diff, max_diff_index

I'm not certain if the shortcut-condition is all that effective, as it rarely applies and costs two full iterations of the input array.

EDIT

An other margin for improvement exists if the algorithm uses list.pop(0). As list is optimized for right hand side operations, list.pop(0) is relatively expensive. With collections.deque there exists an alternative that provides a fast left hand side pop: deque.popleft(). Is brings quite a bit of improvement to the overall speed.

Here the non-numpy collections.deque based version of my algorithm:

def difference_deque(array, subset_length):
    assert subset_length > 1, "subset_length must be larger than 1"
    length = len(array)
    total_diff = max(array)-min(array)

    current_slice = collections.deque(array[:subset_length])
    current_min = min(current_slice)
    current_max = max(current_slice)
    max_diff = current_max - current_min
    max_diff_index = 0

    index = subset_length
    while index < length:
        i_new = index
        i_old = index-number
        index += 1     
        new = array[i_new]            
        old = current_slice.popleft()

        if new < current_min:
            current_min = new
            if old == current_max:
                current_max = max(current_slice)
            current_slice.append(new)
        elif new > current_max:
            current_max = new
            if old == current_min:
                current_min = min(current_slice)
            current_slice.append(new)
        elif old == current_min:
            current_slice.append(new)
            current_min = min(current_slice)
        elif old == current_max:
            current_slice.append(new)
            current_max = max(current_slice)
        else:
            current_slice.append(new)
            continue

        current_diff = current_max-current_min
        if current_diff > max_diff:
            max_diff = current_diff
            max_diff_index = i_old+1

        # shortcut-condition
        if max_diff == total_diff:
            print('shortcut at', (index-1)/(length-number), '%' )
            break

    return max_diff, max_diff_index

It skews the runtime rankings a bit: - up to 10 your algorithm (with deque) is best - up to 100 my algorithm (with deque) is best - above 100 my algorithm (with numpy) is best

score 0 · Answer 2 · answered Dec 13 '14 at 12:40

I came up with this optimization that might shave some time off your first implementation. Instead of using slices to isolate the numbers to consider for each iteration, I used a slice one time to initialize the "window". On each iteration, the "rightmost" element gets added to the window and the "leftmost" element gets evicted.

import time
import random

def difference(arr, number):
  thisSlice = arr[:number-1]
  arrSize = len(arr)
  maxDiff = -1000

  while number < arrSize:

    # Put the new element onto the window's tail
    thisSlice.append(arr[number])

    thisDiff = max(thisSlice) - min(thisSlice)
    if thisDiff > maxDiff: maxDiff = thisDiff
    number += 1

    # Get rid of the "leftmost" element, we won't need it for next iteration
    thisSlice.pop(0)

  print maxDiff

if __name__ == '__main__':
    length = 2**31
    arr = random.sample(xrange(length),100000)
    t0 = time.clock()
    difference(arr, 1000)
    print 'It took :', time.clock() - t0

At least on my laptop, this doesn't come down to below 2 seconds, but I did see some gains compared to the first implementation you posted. On average, your first solution ran on my laptop between 4.2 to 4.3 seconds. This piecemeal window construction version ran on average between 3.5 and 3.6 seconds.

Hope it helps.

score 0 · Answer 3 · edited May 23 '17 at 10:27

I think you can use one of the various numpy rolling window functions using as_strided magic -- say the one I've just stolen from here:

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

Using your original difference but with return instead of print, and with arr being a numpy array:

>>> w = 3
>>> %timeit old_d = difference(arr, w)
1 loops, best of 3: 718 ms per loop
>>> %timeit q = rolling_window(arr, w); ma=q.max(1);mi=q.min(1); new_d=(ma-mi).max()
100 loops, best of 3: 5.68 ms per loop

and

>>> w = 1000
>>> %timeit old_d = difference(arr, w)
1 loops, best of 3: 25.1 s per loop
>>> %timeit q = rolling_window(arr, w); ma=q.max(1);mi=q.min(1); new_d=(ma-mi).max()
1 loops, best of 3: 326 ms per loop

Efficient way to iterate over sublists of a huge list in python

3 Answers3