29

When I try to use print without parentheses on a simple name in Python 3.4 I get:

>>> print max
Traceback (most recent call last):
  ...
  File "<interactive input>", line 1
    print max
            ^
SyntaxError: Missing parentheses in call to 'print'

Ok, now I get it, I just forgot to port my Python 2 code.

But now when I try to print the result of a function:

>>> print max([1,2])
Traceback (most recent call last):
    ...
    print max([1,2])
            ^
SyntaxError: invalid syntax

Or:

print max.__call__(23)
        ^
SyntaxError: invalid syntax

(Note that the cursor is pointing to the character before the first dot in that case.)

The message is different (and slightly misleading, since the marker is below the max function).

Why isn't Python able to detect the problem earlier?

Note: This question was inspired by the confusion around this question: Pandas read.csv syntax error, where a few Python experts missed the real issue because of the misleading error message.

Peter Mortensen
Jean-François Fabre
  • I noticed something similar before with string formatting e.g. `print '{}'.format('hi')` – Chris_Rands Jan 10 '18 at 09:52
  • 1
    Interesting that the "missing parentheses" message is special-cased in a not-quite-special-enough way. – kindall Jan 10 '18 at 09:56
  • I wonder if you could collect some data on how varied it is each execution... I have a feeling it's python's way of reading in class identifiers. A very interesting question though. – Jack Nicholson Jan 10 '18 at 09:58
  • 3
    @kindall I suspect that you're right: the "missing parentheses" is a kludge to make sure users understand in most basic cases, but it's unable to work on every case because of the generic python 3 parsing (which is also the reason why `print` was converted to a function) – Jean-François Fabre Jan 10 '18 at 10:02
  • @usr2564301 yes, edited & fixed – Jean-François Fabre Jan 10 '18 at 10:10
  • 4
    I think this error message is produced only in some simple cases, as a help for the user. For example, `def x(): print max` (on a single line) does not produce the missing-parentheses hint, and `lambda x: print x` seems to hit a bug and prints `Did you mean print(x: print x)`, which isn't even valid Python. – mata Jan 10 '18 at 10:19
  • `exceptions.c` has a comment, "The error message can be a bit odd in cases where the "arguments" are completely illegal syntactically, but that isn't worth the hassle of fixing.", for the function `_set_legacy_print_statement_msg`, which produces the only occurrence of that particular error. – Jongware Jan 10 '18 at 10:26
  • (Discussed in https://bugs.python.org/issue21669. Guido van Rossum: "Don't change the grammar, just hack the heck out of the error message.") – Jongware Jan 10 '18 at 10:29
  • 1
    Related: https://stackoverflow.com/a/25445440/2564301 – Jongware Jan 10 '18 at 10:32

4 Answers

28

Looking at the source code for exceptions.c, right above _set_legacy_print_statement_msg there's this nice block comment:

/* To help with migration from Python 2, SyntaxError.__init__ applies some
 * heuristics to try to report a more meaningful exception when print and
 * exec are used like statements.
 *
 * The heuristics are currently expected to detect the following cases:
 *   - top level statement
 *   - statement in a nested suite
 *   - trailing section of a one line complex statement
 *
 * They're currently known not to trigger:
 *   - after a semi-colon
 *
 * The error message can be a bit odd in cases where the "arguments" are
 * completely illegal syntactically, but that isn't worth the hassle of
 * fixing.
 *
 * We also can't do anything about cases that are legal Python 3 syntax
 * but mean something entirely different from what they did in Python 2
 * (omitting the arguments entirely, printing items preceded by a unary plus
 * or minus, using the stream redirection syntax).
 */

So there's some interesting info. In addition, in the SyntaxError_init method in the same file, we can see

    /*
     * Issue #21669: Custom error for 'print' & 'exec' as statements
     *
     * Only applies to SyntaxError instances, not to subclasses such
     * as TabError or IndentationError (see issue #31161)
     */
    if ((PyObject*)Py_TYPE(self) == PyExc_SyntaxError &&
            self->text && PyUnicode_Check(self->text) &&
            _report_missing_parentheses(self) < 0) {
        return -1;
    }

Note also that the comment above references issue #21669 on the Python bug tracker, which has some discussion between the author and Guido about how to go about this. So we follow the rabbit (that is, _report_missing_parentheses), which is at the very bottom of the file, and see...

legacy_check_result = _check_for_legacy_statements(self, 0);

However, there are some cases where this is bypassed and the normal SyntaxError message is printed; see MSeifert's answer for more about that. If we go one function up to _check_for_legacy_statements, we finally see the actual check for legacy print statements.

/* Check for legacy print statements */
if (print_prefix == NULL) {
    print_prefix = PyUnicode_InternFromString("print ");
    if (print_prefix == NULL) {
        return -1;
    }
}
if (PyUnicode_Tailmatch(self->text, print_prefix,
                        start, text_len, -1)) {

    return _set_legacy_print_statement_msg(self, start);
}

So, to answer the question "Why isn't Python able to detect the problem earlier?": the missing parentheses are not what triggers the error. The line is a syntax error from the start. The text is only inspected for a legacy print statement afterwards, while the SyntaxError is being constructed, in order to attach a more helpful hint.
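One way to see that the error belongs to the compilation step, before anything runs, is to compile the snippets yourself and catch the exception. This is only a minimal sketch: `syntax_error_for` is a hypothetical helper name, and since the message text and caret offset differ between Python versions, no expected output is shown.

```python
def syntax_error_for(source):
    """Return the SyntaxError raised while compiling source, or None if it compiles."""
    try:
        # compile() only parses and generates bytecode; nothing is executed.
        compile(source, "<demo>", "exec")
    except SyntaxError as exc:
        return exc
    return None


for snippet in ("print max", "print max([1,2])", "print(max([1,2]))"):
    err = syntax_error_for(snippet)
    if err is None:
        print(repr(snippet), "-> compiles")
    else:
        # msg and offset are standard attributes of SyntaxError.
        print(repr(snippet), "->", err.msg, "| offset:", err.offset)
```

Both legacy forms fail in the same place (compilation); only the wording of the message attached to the exception can differ.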

MSeifert
alkasm
  • 2
    But it doesn't even go in the `_check_for_legacy_statements` in the mentioned case because an even earlier check in `_report_missing_parentheses` checks for any opening parens and if it finds any it already returns from that function (See also my answer). – MSeifert Jan 10 '18 at 10:46
17

The special exception message for print used as a statement instead of as a function is actually implemented as a special case.

Roughly speaking, when a SyntaxError is created, it calls a special function that checks for a print statement based on the line the exception refers to.

However, the first test in this function (the one responsible for the "Missing parentheses" error message) checks whether there is any opening parenthesis in the line. I copied the source code for that function (CPython 3.6.4) and marked the relevant lines with "arrows":

static int
_report_missing_parentheses(PySyntaxErrorObject *self)
{
    Py_UCS4 left_paren = 40;
    Py_ssize_t left_paren_index;
    Py_ssize_t text_len = PyUnicode_GET_LENGTH(self->text);
    int legacy_check_result = 0;

    /* Skip entirely if there is an opening parenthesis <---------------------------- */
    left_paren_index = PyUnicode_FindChar(self->text, left_paren,
                                          0, text_len, 1);
    if (left_paren_index < -1) {
        return -1;
    }
    if (left_paren_index != -1) {
        /* Use default error message for any line with an opening parenthesis <------------ */
        return 0;
    }
    /* Handle the simple statement case */
    legacy_check_result = _check_for_legacy_statements(self, 0);
    if (legacy_check_result < 0) {
        return -1;
    }
    if (legacy_check_result == 0) {
        /* Handle the one-line complex statement case */
        Py_UCS4 colon = 58;
        Py_ssize_t colon_index;
        colon_index = PyUnicode_FindChar(self->text, colon,
                                         0, text_len, 1);
        if (colon_index < -1) {
            return -1;
        }
        if (colon_index >= 0 && colon_index < text_len) {
            /* Check again, starting from just after the colon */
            if (_check_for_legacy_statements(self, colon_index+1) < 0) {
                return -1;
            }
        }
    }
    return 0;
}

That means it won't trigger the "Missing parentheses" message if there is an opening parenthesis anywhere in the line. The result is the general SyntaxError message even if the opening parenthesis is in a comment:

print 10  # what(
    print 10  # what(
           ^
SyntaxError: invalid syntax
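The control flow of the C function above can be mirrored in a few lines of Python. This is a rough sketch of the excerpt only, not CPython's code: `legacy_print_hint` and `looks_like_print_statement` are hypothetical names, it ignores the exec case, and the whitespace skipping before the prefix test is an assumption (that detail is not in the code shown here).

```python
HINT = "Missing parentheses in call to 'print'"


def legacy_print_hint(text):
    """Return the special message if the heuristic would fire for this line, else None."""
    # Any opening parenthesis anywhere in the line (even inside a comment)
    # disables the heuristic entirely.
    if "(" in text:
        return None

    def looks_like_print_statement(start):
        # Assumption: leading whitespace is skipped before the prefix test.
        return text[start:].lstrip().startswith("print ")

    # Simple statement case: the line itself starts with "print ".
    if looks_like_print_statement(0):
        return HINT

    # One-line complex statement case: check again just after the first colon.
    colon_index = text.find(":")
    if colon_index >= 0 and looks_like_print_statement(colon_index + 1):
        return HINT
    return None


for line in ("print max", "print max([1,2])", "print 10  # what(", "if x: print y"):
    print(repr(line), "->", legacy_print_hint(line))
```

The point of the sketch is that the decision is made on the raw text of the line, so a parenthesis in a comment is enough to suppress the hint.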

Note that the cursor position for two names or literals separated by whitespace is always the end of the second one:

>>> 10 100
    10 100
         ^
SyntaxError: invalid syntax

>>> name1 name2
    name1 name2
              ^
SyntaxError: invalid syntax

>>> name1 name2([1, 2])
    name1 name2([1, 2])
              ^
SyntaxError: invalid syntax

So it is no wonder the cursor points to the x of max, because it's the last character of the second name. Everything that follows the second name (like ., (, [, ...) is ignored: Python has already found a SyntaxError and doesn't need to go further, because nothing could make it valid syntax.
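To see where the second name ends, the line can be split into tokens with the standard `tokenize` module. Note that this is the pure-Python tokenizer from the standard library, not the parser that raises the error, so treat it as an illustration only; `token_summary` is a hypothetical helper name.

```python
import io
import tokenize


def token_summary(source):
    """Return (type name, text, end column) for each token, skipping line-end markers."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string, tok.end[1])
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in (tokenize.NEWLINE, tokenize.ENDMARKER)
    ]


for entry in token_summary("print max([1, 2])\n"):
    print(entry)
```

The first two entries are both NAME tokens. The end column is exclusive, so `max` ending at column 9 means its last character, the x, sits at index 8, which is where the caret appears in the traceback from the question.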

Peter Mortensen
MSeifert
5

Maybe I'm not understanding something, but I don't see why Python should point out the error earlier. print is a regular function, that is, a name referring to a function object, so these are all valid statements:

print(10)
print, max, 2
str(print)
print.__doc__
[print] + ['a', 'b']
{print: 2}

As I understand it, the parser needs to read the next full token after print (max in this case) in order to determine whether there is a syntax error. It cannot just say "fail if there is no open parenthesis", because there are a number of different tokens that may go after print depending on the current context.

I don't think there is a case where print may be directly followed by another identifier or a literal, so you could argue that as soon as there is one letter, a number or quotes you should stop, but that would be mixing the parser's and the lexer's job.
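The claim that all of the statements listed above are syntactically valid can be checked with `ast.parse`, which parses without running anything. A small sketch, where `parses` and `VALID_USES` are names made up for this example:

```python
import ast

VALID_USES = [
    "print(10)",
    "print, max, 2",
    "str(print)",
    "print.__doc__",
    "[print] + ['a', 'b']",
    "{print: 2}",
]


def parses(source):
    """Return True if source is syntactically valid for the running Python 3."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


for source in VALID_USES + ["print max"]:
    print(source, "->", "valid" if parses(source) else "SyntaxError")
```

Every entry in the list parses, so seeing the name print alone tells the parser nothing; only the token that follows it decides whether the line is an error.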

Peter Mortensen
jdehesa
5

In addition to those excellent answers: even without looking at the source code, we could have guessed that the special print error message was a kludge.

So:

print dfjdkf
           ^
SyntaxError: Missing parentheses in call to 'print'

But:

>>> a = print
>>> a dsds
Traceback (most recent call last):
  File "<interactive input>", line 1
    a dsds
         ^
SyntaxError: invalid syntax

Here a == print, but at that stage nothing has been evaluated yet, so you get the generic "invalid syntax" message instead of the hacked print message. That shows the kludge applies only when the first token is literally print.

Another proof, if needed:

>>> print = None
>>> print a
Traceback (most recent call last):
  File "C:\Python34\lib\code.py", line 63, in runsource
    print a
          ^
SyntaxError: Missing parentheses in call to 'print'

In that case print is None, but the specific message still appears.
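The underlying reason is that names are resolved only when the code runs, while the syntax check happens earlier, at compile time. A minimal sketch of that separation (`compiles` is a hypothetical helper, and `some_undefined_name` is deliberately unbound):

```python
def compiles(source):
    """Return True if source compiles; no code is run and no name is looked up."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True


# An unbound name and a rebinding of print are both fine for the compiler;
# two adjacent names are rejected whatever they would be bound to.
for source in ("some_undefined_name", "print = None", "a dsds", "print a"):
    print(repr(source), "->", "compiles" if compiles(source) else "SyntaxError")

# The unbound name only becomes a problem once the compiled code is executed.
try:
    exec(compile("some_undefined_name", "<demo>", "exec"), {})
except NameError as exc:
    print("fails only when run:", exc)
```

Since the compiler never knows what print or a refer to, any print-specific message has to come from looking at the source text itself.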

Jean-François Fabre