This code:

    a = [1, 2, 3]
    print(*a, a.pop(0))

Python 3.8 prints `2 3 1` (the pop happens before the unpacking).
Python 3.9 prints `1 2 3 1` (the pop happens after the unpacking).

What caused the change? I didn't find it in the changelog.

Edit: This happens not just in function calls but also, for example, in a list display:

    a = [1, 2, 3]
    b = [*a, a.pop(0)]
    print(b)

Python 3.8 prints `[2, 3, 1]`, while Python 3.9 prints `[1, 2, 3, 1]`. The "Expression lists" section of the documentation says "The expressions are evaluated from left to right" (quoted from the Python 3.8 documentation), so I'd expect the unpacking to happen first.
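The timing can also be made visible without mutating a list. The following is a minimal sketch; the names `items`, `last`, and `events` are made up for illustration. A generator body only runs when it is consumed, so the log records whether the unpacking or the second call happened first.

```python
events = []  # records the order in which things actually run


def items():
    # The body of a generator runs only when it is iterated (unpacked),
    # not when items() is called.
    events.append("unpacking started")
    yield 1
    yield 2


def last():
    events.append("last() called")
    return 3


result = [*items(), last()]
print(result)  # [1, 2, 3] either way
print(events)  # the order depends on the Python version
```

Going by the behavior described above, Python 3.8 should log `last() called` first and Python 3.9 should log `unpacking started` first, while `result` is the same in both.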

Kelly Bundy
  • Was this ever defined or guaranteed behavior in the first place…? – deceze Dec 18 '21 at 15:15
  • @deceze A language should give consistent results if there are no changes made to it. – Elder Yeager Dec 18 '21 at 15:19
  • @ElderYeager: not necessarily. if this was undefined, you should have never used it in the first place – blue_note Dec 18 '21 at 15:20
  • @deceze I'd say stuff like that is usually clearly defined in Python, and I just added something that I think does define it. – Kelly Bundy Dec 18 '21 at 15:28
  • I strongly suspect that this change was unintended. – user4815162342 Dec 18 '21 at 15:28
  • Somewhat related [Is Python's order of evaluation of function arguments and operands deterministic (+ where is it documented)?](https://stackoverflow.com/q/46288616/4046632) - Note the second comment about bugs by @wim – buran Dec 18 '21 at 15:29
  • The thing with "The expressions are evaluated from left to right" is that `*a` isn't an expression. `a` is one of the expressions, but the `*` is part of the list display syntax, just like the commas and brackets. – user2357112 Dec 18 '21 at 15:33
  • @user2357112supportsMonica Hmm, I guess so. Although `*a` as a whole is a `starred_item`, and a `starred_item` can be a `starred_expression`. Would have to think more about how to judge that, but I'd say in any case, I would've expected the usual left-to-right evaluation there as well. – Kelly Bundy Dec 18 '21 at 15:41
  • Does Python have a notion of "undefined behavior"? I was under the impression that Python did not have this, and that, in fact, many Python programmers claimed this as one of its "advantages" over languages like C. – Cody Gray - on strike Dec 19 '21 at 11:25
  • @CodyGray I'd say it depends on what one means with "undefined behavior", but maybe stuff like the order of `set` elements, much of the stuff that the documentation calls "CPython implementation detail", or stuff like what [searching "undefined"](https://docs.python.org/3/search.html?q=undefined&check_keywords=yes&area=default) finds. – Kelly Bundy Dec 19 '21 at 11:36
  • Either way is consistent with left-to-right evaluation; the issue is lazy vs. eager evaluation. The 3.8 behaviour evaluates `a` as a reference to a list and then evaluates `a.pop()`, while the 3.9 behaviour evaluates `a` as a reference to a list, *unpacks the list*, and then evaluates `a.pop()`. Either way the expressions are evaluated left to right, because the expression `a` is evaluated before the expression `a.pop()`. The unpacking is evaluated in a different order, but unpacking (i.e. `*a`) isn't an expression. That said, I agree that this *should* have been defined in the language spec. – kaya3 Dec 19 '21 at 11:46
  • @CodyGray There's nothing special about Python that precludes undefined behavior. The difference is that CPython is the reference implementation, and in many cases there is no definition for how something should work other than "what CPython does". – chepner Dec 19 '21 at 18:09

1 Answer

I suspect this may have been an accident, though I prefer the new behavior.

The new behavior is a consequence of a change to how the bytecode for * arguments works. The change is in the changelog under Python 3.9.0 alpha 3:

bpo-39320: Replace four complex bytecodes for building sequences with three simpler ones.

The following four bytecodes have been removed:

  • BUILD_LIST_UNPACK
  • BUILD_TUPLE_UNPACK
  • BUILD_SET_UNPACK
  • BUILD_TUPLE_UNPACK_WITH_CALL

The following three bytecodes have been added:

  • LIST_TO_TUPLE
  • LIST_EXTEND
  • SET_UPDATE

On Python 3.8, the bytecode for `f(*a, a.pop())` looks like this:

  1           0 LOAD_NAME                0 (f)
              2 LOAD_NAME                1 (a)
              4 LOAD_NAME                1 (a)
              6 LOAD_METHOD              2 (pop)
              8 CALL_METHOD              0
             10 BUILD_TUPLE              1
             12 BUILD_TUPLE_UNPACK_WITH_CALL     2
             14 CALL_FUNCTION_EX         0
             16 RETURN_VALUE

while on 3.9, it looks like this:

  1           0 LOAD_NAME                0 (f)
              2 BUILD_LIST               0
              4 LOAD_NAME                1 (a)
              6 LIST_EXTEND              1
              8 LOAD_NAME                1 (a)
             10 LOAD_METHOD              2 (pop)
             12 CALL_METHOD              0
             14 LIST_APPEND              1
             16 LIST_TO_TUPLE
             18 CALL_FUNCTION_EX         0
             20 RETURN_VALUE

In the old bytecode, the code pushes `a` and `(a.pop(),)` onto the stack, then unpacks those two iterables into a single tuple, so the pop has already run by the time `a` is unpacked. In the new bytecode, the code pushes an empty list `l` onto the stack, then does `l.extend(a)` and `l.append(a.pop())`, then calls `tuple(l)`.
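The two strategies can be imitated in plain Python. This is only a rough sketch of the bytecode sequences above, not how the interpreter is implemented; the function names `build_args_old` and `build_args_new` are made up for illustration, and `pop(0)` is used to match the example in the question.

```python
def build_args_old(a):
    # 3.8 style: evaluate every operand first, unpack at the very end.
    first = a                 # LOAD_NAME a (only a reference, nothing copied)
    second = (a.pop(0),)      # a.pop(0) runs now, then BUILD_TUPLE 1
    return (*first, *second)  # BUILD_TUPLE_UNPACK_WITH_CALL 2


def build_args_new(a):
    # 3.9 style: grow a list step by step, in source order.
    l = []                    # BUILD_LIST 0
    l.extend(a)               # LIST_EXTEND: a is unpacked here, before the pop
    l.append(a.pop(0))        # LIST_APPEND
    return tuple(l)           # LIST_TO_TUPLE


print(build_args_old([1, 2, 3]))  # (2, 3, 1)
print(build_args_new([1, 2, 3]))  # (1, 2, 3, 1)
```

Both functions give the same results on any recent Python version, because by the time `build_args_old` unpacks, both operands are already plain local values.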

This change has the effect of shifting the unpacking of a to before the pop call, but this doesn't seem to have been deliberate. Looking at bpo-39320, the intent was to simplify the bytecode instructions, not to change the behavior, and the bpo thread has no discussion of behavior changes.
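To see which instruction sequence a given interpreter produces, the expression can be compiled and disassembled with the standard `dis` module. A small sketch: `f` and `a` are placeholder names, and since the code is only compiled and never executed they do not need to exist.

```python
import dis

# Compile the call expression without running it.
code = compile("f(*a, a.pop())", "<demo>", "eval")

# Collect the opcode names in order; dis.dis(code) prints the full listing.
opnames = [instr.opname for instr in dis.get_instructions(code)]
print(opnames)
```

On 3.8 the list should contain `BUILD_TUPLE_UNPACK_WITH_CALL`, and on 3.9 it should contain `LIST_EXTEND` and `LIST_TO_TUPLE`, matching the listings above. Other versions may use different opcode names, so the exact output is not shown here.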

wjandrea
user2357112
  • The [answers to the linked Q](https://stackoverflow.com/a/46288639/6372809) state the order is left to right, and give also the example `expr1(expr2, expr3, *expr4, **expr5)`. So that doesn't seem to be correct for versions up to Python 3.8, right? What happens in 3.9 if the two are in opposite order, `print(a.pop(0), *a)`? – ilkkachu Dec 19 '21 at 00:08
  • @ilkkachu: The *expressions* are evaluated left to right, but `*expr4` is not an expression. `expr4` is. The `*` is part of the function call syntax, and the evaluation order documentation doesn't make any promises about when the unpacking happens. – user2357112 Dec 19 '21 at 00:41
  • As for `print(a.pop(0), *a)`, the `pop` happens before the unpacking, regardless of version. – user2357112 Dec 19 '21 at 00:42
  • @user2357112supportsMonica _"`*a` isn't an expression."_ - I'm not sure that argument holds water. If I enter a bare `*a` in Python 3.9, I get `SyntaxError: can't use starred expression here`, so clearly Python thinks it's some sort of expression. The fact it's not tolerated in every context doesn't change that. – marcelm Dec 19 '21 at 10:31
  • @marcelm The term "starred expression" could just mean an expression which is starred; it doesn't imply that the result of starring an expression is another expression. Compare e.g. the term "truncated word", which doesn't necessarily imply that something like `flowe` (a truncation of the word `flower`) is itself a word. – kaya3 Dec 19 '21 at 11:49
  • @marcelm, well, yeah, it actually makes sense that it isn't (a normal) expression. It's special in that it produces multiple values but isn't a list or a tuple or whatever in itself. Though I also think it makes sense that the left-to-right order should also include unpacking, or whatever else is required to produce the actual arguments that get passed. – ilkkachu Dec 19 '21 at 12:10
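The reversed order raised in the comments is easy to check. A short sketch using a list display instead of `print`, so the result can be inspected:

```python
a = [1, 2, 3]
b = [a.pop(0), *a]  # the pop runs first, then the shortened list is unpacked
print(b)  # [1, 2, 3]
```

Here the pop comes first in source order, so the old and new strategies agree: `a` is already `[2, 3]` by the time it is unpacked.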