Confused about execution of two codes in python?

Question

Code 1:

%%timeit
students = [['Zack',38],['Harry', 37.21], ['Berry', 37.21], ['Tina', 37.2], ['Akriti', 41], ['Harsh', 39]]
second_highest = sorted(list(set([x[1] for x in students])))[1]
([a for a,b in sorted(students) if b == second_highest])

Code 2:

%%timeit
students = [['Zack',38],['Harry', 37.21], ['Berry', 37.21], ['Tina', 37.2], ['Akriti', 41], ['Harsh', 39]]
s = sorted(set([x[1] for x in students]))
for name in sorted(x[0] for x in students if x[1] == s[1]):
    name

Now I am confused about execution of two programs, how is code2 is faster than code code1, despite use of nested for loop in code2. Image below is from Jupyter notebook which show mean time taken by code from 100000 loops. Although the difference is very minute but I am confused because how can nested for-loop work faster than single for loop.

I was supposed to Print the output, so can put print before the last line of code

What's the last line in each loop supposed to do? It doesn't really do anything. The differnec is liekly from the extra call to the `list` function in the second line. — Pedro von Hertwig Batista, Dec 30 '17 at 03:19
I was supposed to Print the output, so can put print before the last line of code — Priyansh, Dec 30 '17 at 03:22

score 0 · Answer 1 · answered Dec 30 '17 at 03:20

0

It's not a nested loop.

It seems you are mistaken about this line

for name in sorted(x[0] for x in students if x[1] == s[1]):
    name

Here, you are creating a generator and passing it to the sorted function

sorted(x[0] for x in students if x[1] == s[1])

That computes a list first. Then, after evaluating that expression and obtaining the list, it is iterated by the for loop. So it's two loops, one after the other, not nested.

answered Dec 30 '17 at 03:20

Andy Carlson

3,633
24
43

Thanks, Andy but still with larger input size time taken by Code1 increase exponentially and on the same hand Code2 increases linearly. Can you give me a reason why Code2 is faster and better than Code1. – Priyansh Dec 30 '17 at 03:33

SigmaPiEpsilon · Answer 2 · 2017-12-30T04:45:36.110

This is due to the generator being passed to sorted() in the for loop for the second case. The sorted() function makes a copy of the sequence first and this copy will be slower for generator to list than for list to list because a list needs to be made from the generator first. See this question. So if you replace by the generator by a list it becomes much faster.

In [11]: %%timeit
students = [['Zack',38],['Harry', 37.21], ['Berry', 37.21], ['Tina', 37.2], ['Akriti', 41], ['Harsh', 39]]
s = sorted(set([x[1] for x in students]))
for name in sorted([x[0] for x in students if x[1] == s[1]]):
    name
   ....: 
The slowest run took 5.71 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.72 µs per loop

In [6]: %%timeit                         
students = [['Zack',38],['Harry', 37.21], ['Berry', 37.21], ['Tina', 37.2], ['Akriti', 41], ['Harsh', 39]]
second_highest = sorted(list(set([x[1] for x in students])))[1]
([a for a,b in sorted(students) if b == second_highest])
   ...: 
The slowest run took 11.50 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.31 µs per loop

Confused about execution of two codes in python?

2 Answers2