How can I optimize my Python code to run faster?

Question

Define an element of a list of items to be a dominator if every element to its right (not just the one element that is immediately to its right) is strictly smaller than that element. Note how by this definition, the last item of the list is automatically a dominator. This function should count how many elements in items are dominators, and return that count. For example, the dominators of the list [42, 7, 12, 9, 13, 5] would be the elements 42, 13 and 5. The last element of the list is trivially a dominator, regardless of its value, since nothing greater follows it.

I am working on this practice problem and have a code that works, however it's quite slow. How can I best optimize this to run faster?

def count_dominators(items):
    item_count = len(items)
    count = 0
    if item_count==0:
        return count;
        
    count+=1;
    
    if item_count==1:
        return count;
    
    for i in range(0,len(items)-1):
        flag=1
        for j in range(i+1,len(items)):
            
            if(items[j]>=items[i]):
                flag=0;
                break;
          
       
        if(flag==1):
            count+=1;
    return count

Why not iterate over the list in reverse order, keeping a variable for the maximum value reached so far? — Oli, Jul 28 '22 at 21:23
Questions related to improving the efficiency of working code are more suited to be asked in the [Code Review Forum](https://codereview.stackexchange.com/). Code Review is a question and answer site for peer programmer code reviews. Please read the relevant guidance related to how to properly ask questions on this site before posting your question. — itprorh66, Jul 28 '22 at 21:23
Every time you scan to check for a dominator, you keep the position of the max element found in his right side. If the number is a dominator then to find the next dominator you can start directly from such max. In the array example, when you scan right of 42 you find 13 as max value, when you confirm 42 as dominator you can start looking for the next one from 13. — Hardest, Jul 28 '22 at 21:28
A side-note: Your kinda ugly use of a `flag` variable is unnecessary on Python, [thanks to the `else:` clause for `for` and `while` loops, which exists precisely to enable this sort of thing](https://stackoverflow.com/q/9979970/364696). You could remove `flag=1`, `flag=0`, and replace `if(flag == 1):` with just `else:` and get the same effect. — ShadowRanger, Jul 28 '22 at 21:54

Colim · Answer 1 · 2022-07-28T23:24:54.050

3

import numpy as np

def count_dominators(items):
    dominators = np.unique(np.maximum.accumulate(seq[::-1]))
    return dominators.size, dominators

seq = [42, 7, 12, 9, 13, 5]
print(count_dominators(seq))
>>>(3, array([5, 13, 42]))

edited Jul 28 '22 at 23:24

answered Jul 28 '22 at 21:29

Colim

1,283
3
11

When using numpy I get an error saying no module found. I'm not very familiar with this method, what could I be doing wrong here? – Jul 29 '22 at 01:41
You have to install numpy https://numpy.org/install/ – Colim Jul 29 '22 at 01:42

score 2 · Answer 2 · answered Jul 28 '22 at 21:50

Iterating in reverse order will reduce this from a O(n²) to O(n) as you'll keep a running maximum as you go, without needing to rescan all trailing elements for each new element. So without relying on complex utilities you didn't write, you can just do:

def count_dominators(items):
    max_so_far = float('-inf')  # Smaller than any value to it's replaced immediately,
                                # identifying first element as dominator by definition

    count = 0
    for item in reversed(items):  # Iterate in reverse order so we only need to know max
                                  # so far to identify new dominators
        if item > max_so_far:
            max_so_far = item
            count += 1
    return count

If you want more magic to do less of the work yourself, Python's itertools module provides an accumulate function that can repeatedly perform an operation on elements as it goes, yielding the result so far. If you use max as the operation, then it generates a stream of numbers that changes only when you see a new dominator. Convert the result to a set or use itertools.groupby (which is roughly equivalent to the uniq command line utility when run without arguments, making it a cheap way to deduplicate adjacent duplicates) to dedupe, and the length of the set or the number of groups is the dominator count:

from itertools import accumulate, groupby

def count_dominators(items):
    return sum(1 for _, _ in groupby(accumulate(reversed(items), max)))

or at the cost of a technically unnecessary, growing for each new dominator, set (likely faster, as itertools.groupby has more overhead than you might expect, and the number of elements collected after deduplication is likely small enough to be irrelevant):

from itertools import accumulate

def count_dominators(items):
    return len(set(accumulate(reversed(items), max)))

You could replace the sum(1 for _, _ in (and the matching trailing parenthesis) with a more efficient recipe for counting elements in an iterator if you like (though it hardly matters unless the inputs are huge).

I'll note, in quick testing, all the `accumulate`-based solutions run meaningfully slower than the simple loop I gave initially when run against a nastier test case like `range(10**7, 0, -1)`. The simple loop took about 0.8 seconds, `len`+`set` 2.2 seconds, `groupby`+linked `ilen` 3.3 seconds, and `groupby`+`sum(1 for ...` 3.5 seconds. `groupby` was slower than `set` even though the `set` got huge, and all were slower than simple loop. Sometimes, clever doesn't save you anything (I strongly suspect inefficiency in the repeated calls to `max` function being part of the issue here). — ShadowRanger, Jul 28 '22 at 22:04
Thank you for all the detailed information, the accumulate and groupby helped alot. I ran into an issue around argument#8 though where given the list = [123, -492, -38, -28, 113, -15] , the expected response was 3, but it returned 2? — , Jul 29 '22 at 01:46
Which version did you use? I just tested all three of the versions I posted, and all of them returned `3` for that input. — ShadowRanger, Jul 29 '22 at 03:49
I've been using Python through replit, could that be a reason why? — , Jul 29 '22 at 16:34
@AlexWinter: Not unless it was hugely buggy. Seems more likely to be a copy'n'paste error when you put the code there. You can see that all three versions return `3` here: [Try it online!](https://tio.run/##jVDRSsQwEHzvV@zbJdiCaRQ84fwRkRLTRAtNUjaJeF9fs22Ps@qD@7CwO8PM7E7n9B68nGeLwcGQDKYQxgiDmwImUFpnl0eVTA1vGPL0eq6q3ljQvWCF7SJ/rKCUU59dDJ1VCCewY1CJHZrB2wNfYB2yTwW5XSYbkKyKnwc0Hwaj6XdqVINdOU/ftK/oL0/i7tDV8uYEYlmjSRn9ur2c0O5MN0bMjoklYldDRxG3w9n1GexH6pqycM4vwvIv4dF4Fk36lwzZ2@w12bPy65rSUpOb5oSDT4wo7Fm0sobm7lgIjXworS1NCFqK@xfO5/kL "Python 3 – Try It Online") — ShadowRanger, Jul 29 '22 at 18:03
@AlexWinter: My suspicion is you used a version from another answer that initialized the equivalent of `max_so_far` to `-1` (as in griko's answer) or `0`, rather than `float('-inf')` (guaranteed smallest number), which would make it fail to recognize `-15` as the initial dominator. — ShadowRanger, Jul 29 '22 at 18:05

score 1 · Answer 3 · answered Jul 28 '22 at 21:55

Assuming there are only positive numbers, you can run on a list from its end and mark the biggest number you've met. It will be O(n) complexity, instead of O(n^2) that you've suggested. Something like that shall work:

def count_dominators(items):
    count = 0
    last_biggest = -1
    for i in range(len(items)-1,-1,-1):
      if items[i] > last_biggest:
        last_biggest = items[i]
        count += 1

    return count

How can I optimize my Python code to run faster?

3 Answers3