I've found that a solution based on np.argpartition
is the fastest. It does require Numpy though, and it doesn't take into account the conversion of your list to a numpy array. But, in the end, if you wish to do some further number crunching down the line you might want to use numpy arrays since operations on those are typically much faster than operations on list. So I'm assuming this is the case for you.
First you should convert your list to an array:
import numpy as np
v = list(range(10000000))
var = np.array(v)
We can then use np.argpartition
%%time
ind = np.argpartition(var, -2)[-2]
CPU times: user 46.3 ms, sys: 25.9 ms, total: 72.2 ms
Wall time: 71.7 ms
ind = 9999998
, so the result seems correct.
Now if we compare with the other suggestions here:
%%time
v.pop(v.index(max(v)))
max(v)
CPU times: user 619 ms, sys: 1.96 ms, total: 621 ms
Wall time: 623 ms
About 10 times slower
%%time
heapq.nlargest(2,v)[1]
CPU times: user 2.66 s, sys: 1.84 ms, total: 2.66 s
Wall time: 2.66 s
Well... much slower
And the last one
largest, second_largest = v[0], v[0]
for x in v:
if x > largest:
largest, second_largest = x, largest
elif x > second_largest:
second_largest = x
print(largest, second_largest) # 999999 999998
CPU times: user 1.41 s, sys: 0 ns, total: 1.41 s
Wall time: 1.41 s
Again, many times slower.
The np.argpartition
solution seems the fastest if you plan to work with arrays somewhere down the line. If not, you will spend a lot of CPU time converting your list to an array that you won't be using again (see below). It still outperforms some other solutions though.
If we take into account the conversion to an array, we get this result:
%%time
var=np.array(v)
ind = np.argpartition(var, -2)[-2]
CPU times: user 867 ms, sys: 59 ms, total: 926 ms
Wall time: 924 ms
This question explains pretty well why this solution is faster. The reason is that you only perform a partial sort using np.argpartition
rather than fully sorting the list.