
I recently failed a coding challenge that deals with time complexity. I've been tinkering with it in my spare time but still can't get it to work quickly for large lists. I initially over-thought the problem, then refactored it, made some incremental improvements, and tried using pandas (which turned out to be much slower).

I am interested in learning what methods I might use to improve the execution speed of this code.

Input: a list with maximum size 10**6 containing unsorted integers in range(1,10**5).

The task is to compute the "total price" of such a list and return the "total price" along with an ordered list of the indices of those items that were not discounted.

An item at index i has its price discounted by the next smaller item, i.e. the first value in items[i+1:] that is less than items[i]. If there are no smaller values in items[i+1:], the item's price is not discounted (or you can consider it discounted by 0).

Example Input: items = [5, 3, 4, 1, 5]

Example Output: 13, [3, 4]

Here, items[0] is discounted by items[1], items[1] is discounted by items[3], items[2] is discounted by items[3], items[3] and items[4] are not discounted.

So the total price is 13, given by (5-3) + (3-1) + (4-1) + (1-0) + (5-0)
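Stated directly as code, the rule above amounts to a naive quadratic scan (a reference sketch of the spec; the name `total_price_naive` is mine, not part of the challenge):

```python
def total_price_naive(items):
    """Naive O(n^2) reference for the spec: each item is discounted
    by the first smaller value to its right (0 if none exists)."""
    total = 0
    full_price = []  # indices of items that keep their full price
    for i, price in enumerate(items):
        discount = next((v for v in items[i + 1:] if v < price), 0)
        if discount == 0:  # 0 is a safe sentinel: all prices are >= 1
            full_price.append(i)
        total += price - discount
    return total, full_price
```

On the example, `total_price_naive([5, 3, 4, 1, 5])` returns `(13, [3, 4])`.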

I have a function that solves this pretty quickly for most cases, but as we start to approach the maximum size of the list, it's taking much longer. For example, a list of length 50000 is processed in < 1 second. A list of length 100K is processed in <3 seconds. A list of length 200K takes <10 seconds, and 400K takes about 50 seconds. Running against a million items takes ~1000+ seconds.

For testing, I create a large list like so and then I pass it (or slices of it) to the functions, like:

import numpy as np

data = list(np.array(np.random.randint(1, 10**5, 10**6), dtype='int64'))
total, full_price, _ = get_total(data[:100000])

Here is the faster, non-pandas function:

def get_total(data):
    init_total = sum(data)
    items = data[:] 
    size = len(items)
    discount = [get_discount(items.pop(0),items) for i in range(size)]
    full = [i for (i,v) in enumerate(discount) if v == 0]
    total = init_total - sum(discount)
    return total, full, None

def get_discount(this, _items):
    next_lowest_index, discount = next(((x,val) for x, val in enumerate(_items) if val < this), (np.NaN, 0))
    return discount

I mentioned I had tried pandas as well, but this code is much slower even on small lists (n=1000). I tried sorting it by value:

def frame_total(data):
    if type(data) == list:
        data = pd.DataFrame(data)
    data = data[:].sort_values(0, 'index')
    df = pd.DataFrame({ 'val':data[0],
                        'discount': [0] * data.shape[0]
                        }, dtype='int')
    df.discount = [next(iter(df.loc[(df.index > i) & (df.val < row.val)].sort_index().val),0) 
                   for i,row in df.iterrows()]
    total = data.sum() - df.discount.sum()
    full_indices = list(df[df.discount == 0].sort_index().index)
    return total, full_indices, None

And another that does not sort the input data which is not perceptibly faster:

def frame2(data):
    if type(data) == list:
        data = pd.DataFrame(data)
    data = data[:]
    df = pd.DataFrame({ 'val':data[0],
                        'discount': [0] * data.shape[0]
                        }, dtype='int')
    df.discount = [next(iter(df.val[i+1:].loc[df.val < row.val]),0) for i,row in df.iterrows()]
    total = data.sum() - df.discount.sum()
    full_indices = list(df[df.discount == 0].index)
    return total, full_indices, None

Note that the full-price items are more likely to exist towards the end of the list (as i increases, probability that any value < items[i] exists in items[i+1:] decreases). I feel like this is important, but I can't grok how to make use of that.

Solved, thanks to @DarrylG and the explanation here

def get_next_smallest(data,default=0):
    """
        returns the discounted value for all items in a list
        discounted value is the next smaller item in the list, e.g.:
        for any n, the next smallest item is the first item in data[n+1:] < data[n]
        provides O(n) complexity solution.
    """
    discounts=[default for i in data] # stores the corresponding next smaller value
    stack = [] # initialize our empty stack
    for i, this in enumerate(data):
        while len(stack) > 0 and this < data[stack[-1]]:
            discounts[stack.pop()] = this
        stack.append(i)
    return discounts

def get_total(data):
    init_total = sum(data)
    default = 0  # should be a value that will NOT be present in the data, like 0 or -1
    discounts = get_next_smallest(data, default)
    full = [i for i,v in enumerate(discounts) if v == default]
    total = init_total - sum(discounts)
    return total, full
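As a sanity check, the stack pass can be cross-validated against a brute-force scan on random data. The snippet below restates both passes so it runs standalone (the names `next_smaller` and `next_smaller_naive` are mine, not from the thread):

```python
import random

def next_smaller(data, default=0):
    # Stack-based next-smaller-element pass (same idea as get_next_smallest).
    res = [default] * len(data)
    stack = []  # indices still waiting for their next smaller value
    for i, v in enumerate(data):
        while stack and v < data[stack[-1]]:
            res[stack.pop()] = v
        stack.append(i)
    return res

def next_smaller_naive(data, default=0):
    # O(n^2) brute force, used only as a cross-check.
    return [next((v for v in data[i + 1:] if v < x), default)
            for i, x in enumerate(data)]

random.seed(0)
sample = [random.randint(1, 100) for _ in range(500)]
assert next_smaller(sample) == next_smaller_naive(sample)
```

On the worked example, `next_smaller([5, 3, 4, 1, 5])` yields `[3, 1, 1, 0, 0]`, matching the discounts in the question.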
David Zemens
  • This question would fit better on [Code Review](https://codereview.stackexchange.com). – mkrieger1 Sep 23 '19 at 20:22
  • @mkrieger1. Technically, the code does not perform according to requirements, so it's suitable for SO. OP has done enough research that this is not a question of aesthetics. Also, this looks like a fun problem :) – Mad Physicist Sep 23 '19 at 20:26
  • You need to figure out an algorithm that doesn't require searching the entire rest of the list for each element, because that's `O(n**2)`. Consider the worst case example `[2, 2, 2, 2, 2, ..., 1]`. I suspect it involves finding local maxima and minima. – Barmar Sep 23 '19 at 20:31
  • @Barmar my understanding is that the `next` generator in the `get_discount` function should avoid that pitfall. I tested that method on a list [2,2,2...,1] with size 10**6 and the return value is *immediately* given in console. – David Zemens Sep 23 '19 at 20:37
  • There's an algorithm that finds the "next smaller element" of each element in an array in O(n) (an example implementation is at https://stackoverflow.com/questions/9493853/given-an-array-find-out-the-next-smaller-element-for-each-element). Seems this could easily be used to find total_cost in O(n). – DarrylG Sep 23 '19 at 20:40
  • Just taking a quick look at your code, `items.pop(0)` is pretty expensive if it's performed many times. – iz_ Sep 23 '19 at 20:43
  • A hint: you can also calculate it as `5 - (1 * 3) + 3 + 4 - (2 * 1) + 1 + 5`. – Klaus D. Sep 23 '19 at 20:50
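Klaus D.'s hint regroups the subtraction: each discounting value is counted once per item it discounts. A small sketch of that regrouping on the example (the `discounts` list is taken from the worked explanation above, not computed here):

```python
from collections import Counter

items = [5, 3, 4, 1, 5]
discounts = [3, 1, 1, 0, 0]  # next smaller value per item; 0 = full price
mult = Counter(d for d in discounts if d)  # {3: 1, 1: 2}
total = sum(items) - sum(value * count for value, count in mult.items())
print(total)  # 13
```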

3 Answers


Here's an algorithm that's O(n), using the approach from Given an array, find out the next smaller element for each element to find each element's next smaller element:

def find_next_smaller_elements(xs):
    """ finds next smallest element in O(n) """
    ys = [-1 for x in xs]
    stack = []
    for i, x in enumerate(xs):
        while len(stack) > 0 and x < xs[stack[-1]]:
            ys[stack.pop()] = x
        stack.append(i)
    return ys

def get_total(data):
    """ Computes desired cost function """
    next_smaller = find_next_smaller_elements(data)
    return sum(x if s == -1 else x - s for x, s in zip(data, next_smaller))

Test (small list)

data = [5, 3, 4, 1, 5]
print(get_total(data)) # 13

Timing Test

import time
import numpy as np

for k in [1000, 10000, 100000, 1000000]:
    data = list(np.random.randint(1, 10**5, k, dtype='int64'))
    t0 = time.time()
    ans = get_total(data)
    print(k, time.time()-t0)

Results:

  • No.Items => Time (seconds)
  • 1000 => 0.0029
  • 10000 => 0.0369
  • 100000 => 0.2059
  • 1000000 => 1.964

Thus a million items in only 2 seconds.

DarrylG

Here's a hint: you can compute the ordered indices in a single pass. The trick is to step along the list backwards:

def find_undiscounted(data):
    skipped = [len(data) - 1]
    current = data[-1]
    for i in range(len(data) - 2, -1, -1):
        if current >= data[i]:
            skipped.append(i)
            current = data[i]
    return skipped[::-1]

A comprehensive solution will require a stack, but can clearly be done in a single pass. Don't forget to use collections.deque if you decide to implement it that way.
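Applied to the worked example, the backward pass above recovers exactly the full-price indices (function restated verbatim so the snippet runs standalone):

```python
def find_undiscounted(data):
    # Backward pass: an item keeps full price iff nothing smaller follows it,
    # i.e. it is a running minimum when scanning from the right.
    skipped = [len(data) - 1]
    current = data[-1]
    for i in range(len(data) - 2, -1, -1):
        if current >= data[i]:
            skipped.append(i)
            current = data[i]
    return skipped[::-1]

print(find_undiscounted([5, 3, 4, 1, 5]))  # [3, 4]
```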

Mad Physicist

By iterating on your data backwards, as suggested by @Mad Physicist, you can get an algorithm needing much less memory, and also being faster:

from collections import deque

def get_total(data):
    tot = sum(data)
    smallest_tail = deque()
    no_discount = []
    i = len(data) - 1 # manually handle the index
    for x in reversed(data):
        while smallest_tail:
            s = smallest_tail[-1]
            if s >= x: # s won't be next smaller for anyone because of x
                smallest_tail.pop()
            else:
                tot -= s
                break
        if not smallest_tail:
            no_discount.append(i)
        smallest_tail.append(x)
        i -= 1
    return tot, list(reversed(no_discount))

comparing with your current solution (on my machine):

data = list(np.array(np.random.randint(1, 10**5, 10**6, dtype='int64')))
get_total_dz(data) == get_total(data)  # -> True
%timeit r = get_total_dz(data)  # yours, replacing 'len(stack) > 0' with 'stack'
672 ms ± 6.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit r = get_total(data)  # mine
435 ms ± 2.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
One Lyner
  • Can you elaborate on why you chose to use `deque` instead of `list`? – David Zemens Sep 25 '19 at 17:36
  • I used `deque` instead of `list` because it's slightly faster (and once again @Mad Physicist advertised it), try it at home ;) With `list` I get around 480ms on my machine. For some explanations, you can look e.g. at https://stackoverflow.com/questions/23487307/python-deque-vs-list-performance-comparison – One Lyner Sep 26 '19 at 07:54