How to make a tuple instead of a list in a comprehension?

Question

I have a list comprehension and want to store its results in a set. However, a list is unhashable and therefore can't be stored in a set.

Is there some way to do a tuple comprehension instead?

You came to the answer at the same second you asked the question. You are really genius! — Prophet, Aug 29 '21 at 19:27
@Prophet Somehow you make it sound like that's a bad thing to do... — no comment, Aug 29 '21 at 19:35
It's weird how the question this is supposedly a duplicate of never showed up when I was looking for an answer. — Joshua Snider, Aug 29 '21 at 19:40
@don'ttalkjustcode I feel like this is an unfair way to gain points. But maybe I'm wrong. — Prophet, Aug 29 '21 at 19:43
@Prophet: I'm not trying to game the reputation system or anything. I just had the question written up when I found the answer, so figured I might as well post the question and answer instead of just abandoning the question. — Joshua Snider, Aug 29 '21 at 19:45
If so I'm sorry, but you posted the question together with the answer, rather than answering it yourself later. Anyway, sorry again — Prophet, Aug 29 '21 at 19:49
@Joshua: It didn't show up when you searched because humans are (currently still) better at identifying duplicates than computers. It also depends on your own search skills, of course. — martineau, Aug 29 '21 at 19:51
@Prophet If submitting an answer along with your question were in any way frowned upon, why do you think that that functionality even exists? [Check this out](https://stackoverflow.com/help/self-answer). — no comment, Aug 29 '21 at 20:12
Again, a self-answer is OK when it comes after several hours or something like that, but when it is posted instantly along with the question, it was not really a question at all from the beginning — Prophet, Aug 29 '21 at 20:32
@Prophet, I mean if someone else wanted to answer in their own way or improve my answer, I wouldn't mind. — Joshua Snider, Aug 29 '21 at 20:53
@Prophet that's *totally fine*. Creating a question just to answer it instantly is OK. — juanpa.arrivillaga, Aug 29 '21 at 20:57
@juanpa.arrivillaga Ok, thanks for approving this. I beg your pardon, again — Prophet, Aug 29 '21 at 20:58
@juanpa.arrivillaga: It's OK to answer your own question with working code, which is not the case here, and if nothing else, indicates the OP didn't bother to test it. — martineau, Aug 30 '21 at 00:12
@Joshua: Good, at least it's working code now…although it's still not what I'd call a good answer if you consider what's in the duplicate question coupled with the fact that Python *does* have explicit set comprehensions: i.e. `wave = {tuple((book, 0) for book in srces) for srces in itertools.combinations(games, self.size)}`. — martineau, Aug 30 '21 at 00:59
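
For reference, here is a self-contained sketch of the set-comprehension form mentioned in the last comment. The `games` list and the `size` value are made-up placeholders standing in for the asker's `games` and `self.size`, which are not shown in the post.

```python
import itertools

# Hypothetical stand-ins for the asker's data (not from the original post).
games = ["chess", "go", "shogi"]
size = 2

# Set comprehension: each element is a tuple built from a generator expression,
# so no intermediate list and no explicit loop are needed.
wave = {
    tuple((book, 0) for book in srces)
    for srces in itertools.combinations(games, size)
}

print(len(wave))  # 3, one entry per 2-item combination of the 3 games
```

Because tuples of hashable items are themselves hashable, each combination can be stored in the set directly.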

Joshua Snider · Answer 1 · 2021-08-30T00:32:25.333

1

I came up with the answer in the process of asking the question, so I figured I'd post the question along with my answer to help the next person searching for a solution. I couldn't find anything about a tuple comprehension, but you can just convert a list to a tuple and store that in the set, like so:

    import itertools

    wave = set()
    for srces in itertools.combinations(games, self.size):
        # A tuple is hashable, so unlike a list it can be added to the set.
        wave.add(tuple([(book, 0) for book in srces]))
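
To make the failure and the fix concrete, here is a minimal sketch with made-up data. The `games` list and `size` are placeholders for `games` and `self.size`, which the post does not show. It first tries to add a list to a set, which raises `TypeError`, and then adds tuples instead.

```python
import itertools

games = ["chess", "go", "shogi"]  # hypothetical data, not from the post
size = 2                          # stands in for self.size

wave = set()

# Adding a list fails because lists are mutable and therefore unhashable.
error = None
try:
    wave.add([(book, 0) for book in games])
except TypeError as exc:
    error = str(exc)
print("list rejected:", error)

# Converting to a tuple first works; tuple() accepts any iterable,
# including a generator expression.
for srces in itertools.combinations(games, size):
    wave.add(tuple((book, 0) for book in srces))

print(sorted(wave))
```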

edited Aug 30 '21 at 00:32

answered Aug 29 '21 at 19:25

Joshua Snider


There's no need to have the `[` `]`. Without them, you'll create a generator expression, which `tuple()` will convert to a tuple. – sj95126 Aug 29 '21 at 19:27
In this particular example, the *far* better solution is to annotate the books not in the combinations but already in the games. Much faster, much less memory, and less code. Btw you can't `append` to a set. – no comment Aug 29 '21 at 19:41
@sj95126 I'm curious about the efficiency. I suspect the generator way will be faster, but don't know about memory usage. Do you? – no comment Aug 29 '21 at 19:42
Code doesn't work: `AttributeError: 'set' object has no attribute 'append'`. – martineau Aug 29 '21 at 19:55
@don'ttalkjustcode list comprehension is probably faster, and will be until we start talking about pretty large lists – juanpa.arrivillaga Aug 29 '21 at 20:19
@juanpa.arrivillaga Hmm, why would large lists favor the generator expression way? And do you have an idea about how pretty large? – no comment Aug 29 '21 at 20:25
@don'ttalkjustcode Because a list comprehension has to create a list, *always*, and then that list is passed to the `tuple` constructor. With a generator, you are only doing "one pass" on the data. However, iterating over a generator is slower, and creating lists and tuples from lists is quite fast in Python – juanpa.arrivillaga Aug 29 '21 at 20:26
@juanpa.arrivillaga But how does size matter? Don't both ways take linear time with some little constant overhead? And I'm guessing that that little overhead doesn't matter except for rather small lists. – no comment Aug 29 '21 at 20:32
@don'ttalkjustcode yes it does take linear time, and no, the constant factors *do matter*. A lot. – juanpa.arrivillaga Aug 29 '21 at 20:33
@juanpa.arrivillaga Constant overhead refers to constant *summands*, not to constant *factors*. – no comment Aug 29 '21 at 20:33
@don'ttalkjustcode but it *is* a constant factor, because the "overhead" is *per iteration*. Just look at these [profiling results](https://gist.github.com/juanarrivillaga/d5a0c2e0d44ad27171bc861d789aaae7): up to an N of 500,000, the list comprehension was still beating the generator expression, even though it does two passes over the data. – juanpa.arrivillaga Aug 29 '21 at 20:55
@juanpa.arrivillaga You're talking about something different. I meant the constant overhead of *initializing* the list comp, generator iterator, etc. Overhead *over the linear part* of the overall time complexity. Let's say the list way takes time `an+b` and the generator way takes `cn+d`. I meant `b` and `d`. And I think they quickly become irrelevant. So we have `an` vs `cn`. Now if those constant factors `a` and `c` are indeed constant, then the way with the smaller factor is *always* faster. That's also what your profiling shows. You're confirming *my* point there. There is no switch. – no comment Aug 29 '21 at 21:08
@don'ttalkjustcode sorry, I'm not being clear. My point about size is that eventually you hit swap. If the generator expression version requires N memory, the list comprehension version requires 2*N memory. For *very large* tuples, there will be a point where the generator expression's memory efficiency makes it faster; that is the only point I was making about that. The small, constant overheads of generator expressions and list comprehensions are about the same anyway, so I wouldn't expect them to be very significant either way – juanpa.arrivillaga Aug 29 '21 at 21:10
@juanpa.arrivillaga Ok, that's finally an argument for a speed switch :-). It's also what I suspected, and I've been searching for the source code to see what `tuple` does with a generator (haven't found it yet, do you know where it is?). – no comment Aug 29 '21 at 21:19
@juanpa.arrivillaga Found the code I was looking for, [PySequence_Tuple in abstract.c](https://github.com/python/cpython/blob/ce5e1a6809b714eb0383219190a076d9f883e008/Objects/abstract.c#L2045). And did some tracemalloc experiments. Seems `tuple(listcomp)` takes up to 2.125 N memory (1.125 for listcomp, then 1 for tuple) and `tuple(generator)` takes up to 1.25 N memory. The extra 12.5% and 25% for overallocating. So neither list nor tuple appear to grow by allocating a new place and moving the data, I guess the OS happily extends the existing allocation... – no comment Aug 30 '21 at 16:56
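
The speed and memory comparison discussed in these comments can be reproduced with a rough sketch like the one below. It is not the code behind the linked profiling results or the tracemalloc experiments, just one way to take similar measurements with the standard library. The size `N`, the helper names, and the run count are arbitrary choices made for illustration, and the numbers will vary by Python version and machine.

```python
import timeit
import tracemalloc

N = 100_000              # arbitrary size chosen for illustration
data = list(range(N))    # items exist up front, so only the containers are new


def via_list():
    # Builds a temporary list first, then copies it into a tuple.
    return tuple([x for x in data])


def via_generator():
    # Feeds the items to tuple() one at a time, with no temporary list.
    return tuple(x for x in data)


def peak_bytes(func):
    """Return the peak memory traced by tracemalloc while func runs."""
    tracemalloc.start()
    result = func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del result
    return peak


for func in (via_list, via_generator):
    seconds = timeit.timeit(func, number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs, "
          f"peak {peak_bytes(func):,} bytes")
```

Dividing each peak by the size of the final tuple gives a rough multiple of N to compare against the figures quoted above.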
