Why is Python's tuple factory function slower than the list factory function?

Question

Although their performance is almost the same, I'm curious, because I thought tuples were much more efficient than lists, according to are-tuples-more-efficient-than-lists-in-python. Does anyone know why?

>>> import timeit
>>> a = (i for i in range(100000))
>>> timeit.timeit('list(a)', 'from __main__ import a', number=1000)
0.011526490096002817
>>> a = (i for i in range(100000))
>>> timeit.timeit('tuple(a)', 'from __main__ import a', number=1000)
0.009374740999192



>>> a = [i for i in range(100000)]
>>> timeit.timeit('tuple(a)', 'from __main__ import a', number=1000)
0.35291082598268986
>>> timeit.timeit('list(a)', 'from __main__ import a', number=1000)
0.32638651994057


>>> a = {i for i in range(10000)}
>>> timeit.timeit('tuple(i for i in a)', 'from __main__ import a', number=1000)
0.4628257639706135
>>> timeit.timeit('[i for i in a]', 'from __main__ import a', number=1000)
0.20995741098886356
>>> timeit.timeit('list(map(lambda x: x, a))', 'from __main__ import a', number=1000)
0.9662498680409044

>>> timeit.timeit('x = (1,2,3,4,)', number=10000000)
0.13525238999864087
>>> timeit.timeit('x = [1,2,3,4,]', number=10000000)
0.5406758830067702

Update

>>> timeit.timeit('tuple([i for i in a])', 'from __main__ import a', number=10000)
27.79521625099005
>>> timeit.timeit('list([i for i in a])', 'from __main__ import a', number=10000)
27.748358012002427
>>> timeit.timeit('x = (1,2,3,4,)', number=10000000)
0.13525238999864087
>>> timeit.timeit('x = [1,2,3,4,]', number=10000000)
0.5406758830067702
>>> timeit.timeit('list([i for i in (1,2,3,4,5)])', 'from __main__ import a', number=1000000)
0.48201177397277206
>>> timeit.timeit('tuple([i for i in (1,2,3,4,5)])', 'from __main__ import a', number=1000000)
0.4545572029892355

My intermediate conclusion: assuming you're building a JSON API service, maybe you should:

use a list comprehension rather than tuple(<generator expression>), because the tuple version adds a function call
when converting a generator into a sequence, use list() rather than tuple()
when writing a fixed sequence literally, use (x, x, ) rather than [x, x, ]

Your test only measures a single iteration, because `a` is consumed on the first run, so it is inaccurate. — metatoaster, Nov 15 '18 at 03:40
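The point about the generator being consumed can be checked directly. The snippet below is a minimal sketch, not the asker's code: the names `gen`, `first`, `second`, and `fresh_time` are made up for illustration, and the sizes are arbitrary.

```python
import timeit

# A generator can be iterated only once.
gen = (i for i in range(5))
first = list(gen)    # consumes every item
second = list(gen)   # the generator is already exhausted here

print(first)
print(second)

# One way to give every timed run fresh input is to build the
# generator inside the statement being timed (hypothetical sizes).
fresh_time = timeit.timeit(lambda: list(i for i in range(1000)), number=100)
print(f"{fresh_time:.4f} s for 100 runs on fresh generators")
```

With `number=1000` and a shared generator, only the first of the 1000 runs does real work; the other 999 convert an empty iterator, which would explain the very small totals in the first pair of timings.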
@metatoaster I've updated my post; I'm still curious about the result... — 張泰瑋, Nov 15 '18 at 03:45
`1000` runs is too few for a simple cast operation, and there is actually huge variance between single runs due to CPU scheduling. Your latter two results are within an order of magnitude of each other; with a sample this small that difference is not meaningful, so they are basically identical. You should also try a smaller data structure: `timeit.timeit('x = [1,2,3,4,]', number=10000000)` vs `timeit.timeit('x = (1,2,3,4,)', number=10000000)`. — metatoaster, Nov 15 '18 at 03:50
For the earlier case, they are not identical because the first one involves a function call (`tuple` is called), while the second one involves no function call and uses the much simpler list-comprehension construct. Try again using `10000` and only compare between `tuple([i for i in a])` and `list([i for i in a])`. — metatoaster, Nov 15 '18 at 03:53
Also one final thing: using list comprehension and then turning that into a tuple is always going to involve constructing a list before turning that into a tuple, hence that's where your overhead comes from. You should have noted that a static construction of a fixed, known length of input (based on that smaller test) on a tuple vs. list is much quicker for the tuple as no lists are involved. — metatoaster, Nov 15 '18 at 03:58
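The comparisons suggested in these comments could be set up roughly as follows. This is a sketch under assumptions: `best_time` is a hypothetical helper, the input size and repeat counts are arbitrary, and taking the minimum of several `timeit.repeat` runs is just one common way to reduce the scheduling noise mentioned above. Each variant is wrapped in a lambda, which adds the same small call overhead to all of them.

```python
import timeit

data = list(range(1000))  # hypothetical input size


def best_time(func, repeat=5, number=200):
    """Return the fastest of several timing runs of a zero-argument callable."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


results = {
    "list comprehension": best_time(lambda: [i for i in data]),
    "list(list comprehension)": best_time(lambda: list([i for i in data])),
    "tuple(list comprehension)": best_time(lambda: tuple([i for i in data])),
    "tuple(generator expression)": best_time(lambda: tuple(i for i in data)),
}

# Print the variants from fastest to slowest on this machine.
for label, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{label:30s} {seconds:.4f} s")
```

The two variants that wrap a list comprehension both build the full list first and then copy it, so any gap between them reflects only the final copy into a `list` or a `tuple`.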
As far as your conclusion goes -- why not use whatever makes your code more *readable* and hence more maintainable? Your question seems like a textbook case of premature optimization. — John Coleman, Nov 15 '18 at 15:29
@metatoaster Sorry for the late response. You taught me so much. After trying a smaller data set, I found that tuple() is a little bit faster than list(), but tuple() is slower than list() with a big data set. Weird. Does it have something to do with how fast they access items? I know that a tuple would be much slower than a list when accessing its items. — 張泰瑋, Nov 15 '18 at 15:32
@JohnColeman Thanks for your recommendation, but it has nothing to do with my question; I just want to know more about Python. Still, thanks for your advice. — 張泰瑋, Nov 15 '18 at 15:37
If you just want to learn, why are you phrasing your conclusion about what one "should" do? — John Coleman, Nov 15 '18 at 15:38
If you are building a JSON API service, stick with `list` as JSON does not have the `tuple` concept. Also, `tuples` are for known, fixed-length data and any time a list-comprehension syntax is used, there is no reason to turn that variable length data into an immutable, fixed length data structure, especially when it will be serialized into some string - casting that into a tuple will involve copying the entire list into a separate data structure. Only use `tuple` when you know _exactly_ how long that data structure is and if it's being constructed statically in the code. — metatoaster, Nov 15 '18 at 23:15
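The remark that JSON has no tuple concept can be illustrated with the standard `json` module. This is a small sketch; the variable names are illustrative only.

```python
import json

point = (1, 2, 3)

# A tuple is serialized as a JSON array, exactly like a list.
encoded = json.dumps(point)
print(encoded)

# Decoding a JSON array always produces a list, so the tuple
# type does not survive a round trip.
decoded = json.loads(encoded)
print(type(decoded).__name__, decoded)
```

For data that is about to be serialized, converting a list to a tuple first therefore changes nothing in the output and only adds a copy.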
@metatoaster Thank you for giving me good advice. I understand now~ — 張泰瑋, Nov 18 '18 at 04:00
The base answer is the same as for the linked question: literal syntax is faster than calling a built-in name. Also, Python keeps a cache of small tuples for fast creation. Try timing `(..., )` tuples with more than 20 elements (IIRC) vs `[..., ]` lists. — Martijn Pieters, Dec 08 '18 at 05:54
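One way to see why literal syntax is cheaper than calling a built-in name is to look at the bytecode. The sketch below assumes CPython; opcode names differ between versions, so it only checks for broad families of instructions, and `opnames` is a hypothetical helper.

```python
import dis


def opnames(source):
    """Return the opcode names CPython generates for a source snippet."""
    code = compile(source, "<demo>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


tuple_ops = opnames("x = (1, 2, 3, 4,)")
list_ops = opnames("x = [1, 2, 3, 4,]")
call_ops = opnames("x = tuple([1, 2, 3, 4])")

# A tuple of constants is folded at compile time, so no tuple is
# built when the statement runs; it is loaded as a ready-made constant.
print("tuple literal:", tuple_ops)

# A list is mutable, so a new one has to be built on every execution.
print("list literal: ", list_ops)

# Calling tuple() adds a name lookup and a call on top of building the list.
print("tuple() call: ", call_ops)
```

This lines up with the small-literal timings in the question, where `(1,2,3,4,)` is several times faster than `[1,2,3,4,]`.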
