That's actually pretty simple once you really understand what zip() does.
The zip function takes several arguments (all of them iterables) and pairs items from these iterables according to their respective positions.
For example, say we pass two arguments, ranked_athletes and rewards, to zip. The function call zip(ranked_athletes, rewards) will:
- pair the athlete who ranked first (position i=0) with the first/best reward (position i=0)
- move to the next element, i=1
- pair the 2nd athlete with their reward, the 2nd item from rewards
- ...
This is repeated until there is either no athlete or no reward left. For example, if we take the 100m at the 2016 Olympics and zip the athletes with the rewards we have:
ranked_athletes = ["Usain Bolt", "Justin Gatlin", "Andre De Grasse", "Yohan Blake"]
rewards = ["Gold medal", "Silver medal", "Bronze medal"]
zip(ranked_athletes, rewards)
This returns an iterator over the following tuples (pairs):
('Usain Bolt', 'Gold medal')
('Justin Gatlin', 'Silver medal')
('Andre De Grasse', 'Bronze medal')
Notice how Yohan Blake has no reward (because there are no more rewards left in the rewards list).
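To try this out, here is a small runnable sketch of the same example. The zip_longest part is an addition that is not needed for the rest of the explanation: it is a standard library function (itertools.zip_longest) that keeps pairing until the longest iterable is exhausted, and the "No medal" fill value is just an illustrative choice.

```python
from itertools import zip_longest

ranked_athletes = ["Usain Bolt", "Justin Gatlin", "Andre De Grasse", "Yohan Blake"]
rewards = ["Gold medal", "Silver medal", "Bronze medal"]

# zip() is lazy: wrap it in list() to see all the pairs at once.
podium = list(zip(ranked_athletes, rewards))
for athlete, reward in podium:
    print(f"{athlete}: {reward}")

# zip_longest() does not stop at the shortest iterable; missing items
# are replaced with fillvalue ("No medal" is an illustrative choice).
everyone = list(zip_longest(ranked_athletes, rewards, fillvalue="No medal"))
print(everyone[-1])
```

With plain zip the 4th athlete is silently dropped, so reach for zip_longest only when you really want to keep the unpaired items.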
The * operator lets you unpack a list in a function call: for example, the list [1, 2] unpacks to 1, 2. It basically transforms one object into many (as many as there are items in the list), each passed as a separate argument. You can read more about the unpacking operators here.
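As a quick illustration of unpacking on its own, here is a minimal sketch. The describe function is a hypothetical helper written only for this example:

```python
def describe(first, second):
    # Hypothetical helper: it just reports what it received.
    return f"first={first}, second={second}"

values = [1, 2]

# The * spreads the list into separate positional arguments,
# so these two calls are equivalent.
unpacked = describe(*values)
explicit = describe(1, 2)

print(unpacked)
print(unpacked == explicit)
```

Calling describe(values) without the * would fail, because the whole list would be passed as the single argument first and second would be missing.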
So if we combine these two, zip(*x) actually means: take this list of objects, unpack it into many objects, and pair items from all these objects according to their indexes. It only makes sense if the objects are iterable (like lists, for example); otherwise the notion of index doesn't really make sense.
Here is what it looks like if you do it step by step:
>>> print(x) # x is a list of lists
[[1, 2, 3], ['a', 'b', 'c', 'd']]
>>> print(*x) # unpack x
[1, 2, 3] ['a', 'b', 'c', 'd']
>>> print(list(zip(*x))) # And pair items from the resulting lists
[(1, 'a'), (2, 'b'), (3, 'c')]
Note that in this case, if we call print(list(zip(x))) instead, we just pair the items from x (which are 2 lists) with nothing, as there is no other iterable to pair them with:
[ ([1, 2, 3], ), (['a', 'b', 'c', 'd'], )]
             ^                         ^
[1, 2, 3] is paired with nothing       |
                                       |
              same for the 2nd item from x: ['a', 'b', 'c', 'd']
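A common practical use of zip(*x) is to undo a previous zip, i.e. to split a list of pairs back into separate sequences. The sketch below uses made-up variable names; note that the results come back as tuples, not lists:

```python
numbers = [1, 2, 3]
letters = ["a", "b", "c"]

# Pair the items up...
pairs = list(zip(numbers, letters))
print(pairs)

# ...then unpack the pairs and zip them again to get the columns back.
# zip(*pairs) is the same as zip((1, 'a'), (2, 'b'), (3, 'c')).
unzipped_numbers, unzipped_letters = zip(*pairs)
print(unzipped_numbers)
print(unzipped_letters)
```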
Another good way to understand how zip works is to implement your own version. Here is something that does more or less the same job as zip, but limited to the case of two lists (instead of many iterables):
def zip_two_lists(A, B):
    shortest_list_size = min(len(A), len(B))
    # We create empty pairs
    pairs = [tuple() for _ in range(shortest_list_size)]
    # And fill them with items from each list
    # according to the items' index:
    for index in range(shortest_list_size):
        pairs[index] = (A[index], B[index])
    return pairs
print(zip_two_lists(*x))
# Outputs: [(1, 'a'), (2, 'b'), (3, 'c')]
Notice how I didn't call print(list(zip_two_lists(*x))). That's because this function, unlike the real zip, doesn't return a lazy iterator that produces the pairs one at a time; it builds the whole list in memory. That makes it a weaker stand-in, and you can find a better approximation of the real zip in Python's documentation. It's often a good idea to read the code equivalents given throughout that documentation: they are a good way to understand what a function does without any ambiguity.
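For comparison, here is a minimal sketch of a lazy variant written as a generator. It is my own illustration (the name zip_two_iterables is made up), not the code from the documentation, and it is still limited to two arguments:

```python
from itertools import count


def zip_two_iterables(a, b):
    # Works with any iterable, not just lists: we only need iterators.
    iterator_a = iter(a)
    iterator_b = iter(b)
    while True:
        try:
            item_a = next(iterator_a)
            item_b = next(iterator_b)
        except StopIteration:
            # One of the iterables ran out: stop producing pairs.
            return
        yield (item_a, item_b)


x = [[1, 2, 3], ["a", "b", "c", "d"]]
print(list(zip_two_iterables(*x)))

# Because nothing is built up front, even an endless iterable is fine.
print(list(zip_two_iterables(count(), "ab")))
```

Since pairs are produced on demand, this version needs list() to display its results, just like the real zip.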