itertools.chain() method and the asterisk operator

Question

I'm trying to better understand a few small variations on itertools.chain(). Let's say we have two lists:

l1 = [1,2,3]

l2 = [4,5,6,7,8,'a','b','c']

By simply doing

list(itertools.chain(l1,l2))
Out[464]: [1, 2, 3, 4, 5, 6, 7, 8, 'a', 'b', 'c']

it merges the two iterables into a single one. I've also often seen people do

list(itertools.chain(*(l1,l2)))
Out[465]: [1, 2, 3, 4, 5, 6, 7, 8, 'a', 'b', 'c']

which returns the same result. What exactly is the role of the asterisk * here?

Lastly, if I run the same code without the asterisk (but keeping the double parentheses), I get

list(itertools.chain((l1,l2)))
Out[466]: [[1, 2, 3], [4, 5, 6, 7, 8, 'a', 'b', 'c']]

So the two original lists are kept separate and placed inside an outer list. What are the differences between these options?

EDIT:

I'm adding detail to my question by including the specific usage I was referring to (cross-validation):

# k = 0, 1, 2
for k in range(0, n_folds):

    print('---------------------- Fold [{}/{}] ----------------------'.format(k + 1, n_folds))

    # load training dataset
    dataset_train = list(itertools.chain(*(k_folds[:k] + k_folds[k + 1:])))

k_folds here is just a list containing three separate folds of data.

What's the point of using the asterisk here? In light of @wjandrea's comment, I don't see any.

Comparing the two forms gives this:

itertools.chain(k_folds + k_folds[1]) == itertools.chain(*(k_folds + k_folds[1]))
Out[646]: False

So they are clearly not the same in this context, but I don't know why.

Is this an actual usage you've seen? Cause if you have two separate named objects, I don't see any reason to use an asterisk. But if you only have one named object containing two iterables, that's when you'd normally use the asterisk. For example, `t = (range(5), range(6)); chain(*t)` — wjandrea, Apr 21 '22 at 17:46
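To make the comment above concrete, here is a minimal sketch using the same `t` from that comment. It contrasts passing the tuple unpacked with passing it as a single argument; the variable names `unpacked` and `single` are only illustrative.

```python
import itertools

t = (range(5), range(6))

# The asterisk unpacks t, so this is the same call as chain(range(5), range(6)).
unpacked = list(itertools.chain(*t))

# Without the asterisk, chain receives ONE iterable (the tuple itself)
# and simply yields its two elements, the range objects, unchanged.
single = list(itertools.chain(t))

print(unpacked)  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5]
print(single)    # [range(0, 5), range(0, 6)]
```

The same reasoning applies to `(l1, l2)` in the question: `chain(*(l1, l2))` is just `chain(l1, l2)`, while `chain((l1, l2))` iterates over the tuple and yields the two lists whole.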
Note, every time you do `itertools.chain(*whatever)` you probably just want to do `itertools.chain.from_iterable(whatever)` — juanpa.arrivillaga, Apr 21 '22 at 19:58
@juanpa.arrivillaga: And in those cases, you still wouldn't want to do `itertools.chain.from_iterable((l1, l2))`, because that's just a verbose way to get the same effect as `itertools.chain(l1, l2)` (`itertools.chain(l1, l2)` is internally receiving a `tuple` of the arguments anyway, and converting it to an iterator that the `chain` object consumes, in `itertools.chain.from_iterable((l1, l2))`, you manually make the `tuple`, then `chain` still converts it to an iterator and you end up with the same result, slightly slower, due to more work being done at Python level, but equivalent). — ShadowRanger, Apr 21 '22 at 20:03
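A small sketch of the point made in the two comments above, reusing `l1` and `l2` from the question. The name `nested` is hypothetical and stands in for any single object that already holds the iterables.

```python
import itertools

l1 = [1, 2, 3]
l2 = [4, 5, 6, 7, 8, 'a', 'b', 'c']
nested = [l1, l2]  # hypothetical container of iterables

# When the iterables already live in one container, these two are equivalent.
via_star = list(itertools.chain(*nested))
via_from_iterable = list(itertools.chain.from_iterable(nested))

# When they are separate names, plain positional arguments are the simplest form.
direct = list(itertools.chain(l1, l2))

print(via_star == via_from_iterable == direct)  # True
```

One practical difference: `chain(*nested)` has to unpack the whole container up front to build the argument list, whereas `chain.from_iterable(nested)` walks it lazily, so the latter also works when `nested` is a generator.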
Regarding the edit, that's a different usage. Those parentheses don't create a tuple, they're just for grouping. First it evaluates the `+` operation then the `*` unpacks the resulting list. It's probably being used because until Python 3.5, you could only have one iterable unpacking per expression, so `chain(*k_folds[:k], *k_folds[k + 1:])` would be invalid. — wjandrea, Apr 22 '22 at 17:49
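The following sketch illustrates the comment above with made-up data. The contents of `k_folds` and the value of `k` are assumptions chosen only to keep the output short; the real folds in the question are not shown.

```python
import itertools

k_folds = [[1, 2], [3, 4], [5, 6]]  # hypothetical folds
k = 1                               # fold held out for validation

# The parentheses only group the + expression; the result is a list of folds.
rest = k_folds[:k] + k_folds[k + 1:]

# Unpacking that list passes each remaining fold as its own argument.
old_style = list(itertools.chain(*(k_folds[:k] + k_folds[k + 1:])))

# Since Python 3.5, several unpackings are allowed in a single call.
new_style = list(itertools.chain(*k_folds[:k], *k_folds[k + 1:]))

# Without the asterisk, chain just iterates over the list of folds.
not_unpacked = list(itertools.chain(rest))

print(rest)          # [[1, 2], [5, 6]]
print(old_style)     # [1, 2, 5, 6]
print(new_style)     # [1, 2, 5, 6]
print(not_unpacked)  # [[1, 2], [5, 6]]
```

So in the cross-validation loop the asterisk does matter: it is what flattens the remaining folds into one training list instead of leaving a list of folds.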
Also, `chain(x) == chain(x)` will never be true because they're compared by identity, not value, since they're iterators so they don't have a value per se. You'd want to do `list(chain(x)) == list(chain(x))`. But again, the parentheses don't create a tuple there; you're actually comparing `x` with `*x`, which are obviously different. — wjandrea, Apr 22 '22 at 17:53
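A short sketch of the comparison issue described above. The list `x` is a hypothetical stand-in for the expression used in the question.

```python
import itertools

x = [[1, 2], [3, 4]]  # hypothetical nested list

# Two chain objects are distinct iterators, so == falls back to identity.
same_objects = itertools.chain(x) == itertools.chain(x)

# Materializing both sides compares the values they produce.
same_values = list(itertools.chain(x)) == list(itertools.chain(x))

# With and without the asterisk, the produced values differ.
star_vs_no_star = list(itertools.chain(x)) == list(itertools.chain(*x))

print(same_objects)     # False
print(same_values)      # True
print(star_vs_no_star)  # False
```

This is why the `Out[646]: False` in the question says nothing about the contents: it would be `False` even if both sides were built from the same expression.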
