37

How large can the input I supply to the input() function be?

Unfortunately, there was no easy way to test it. After a lot of copy-pasting, I couldn't get input() to fail on any input I supplied, and I eventually gave up.

The documentation for the input function doesn't mention anything regarding this:

If the prompt argument is present, it is written to standard output without a trailing newline. The function then reads a line from input, converts it to a string (stripping a trailing newline), and returns that. When EOF is read, EOFError is raised.

So, I'm guessing there is no limit? Does anyone know whether there is one and, if so, how large it is?

Dimitris Fasarakis Hilliard
user6774416

2 Answers

34

Of course there is; it can't be limitless*. The key sentence from the documentation that I believe needs highlighting is:

[...] The function then reads a line from input, converts it to a string (stripping a trailing newline) [...]

(emphasis mine)

Since it converts the input you supply into a Python str object it essentially translates to: "Its size has to be less than or equal to the largest string Python can create".

The reason no explicit size is given is probably that this is an implementation detail. Enforcing one maximum size on every implementation of Python wouldn't make much sense.

*In CPython, at least, the largest size of a string is bounded by how big its index is allowed to be (see PEP 353). That is, how big the number in the brackets [] is allowed to be when you try and index it:

>>> s = ''
>>> s[2 ** 63]

IndexError                                Traceback (most recent call last)
<ipython-input-10-75e9ac36da20> in <module>()
----> 1 s[2 ** 63]

IndexError: cannot fit 'int' into an index-sized integer

(Try the previous example with 2 ** 63 - 1, the largest acceptable positive index: you get an ordinary "string index out of range" error instead. -2 ** 63 is the negative limit.)

For indices, it isn't Python numbers that are internally used; instead, it is a Py_ssize_t which is a signed 32/64 bit int on 32/64 bit machines respectively. So, that's the hard limit from what it seems.

(As the error message states, an int and an index-sized integer are two different things.)
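The same limit can be inspected from Python itself. This is a small sketch that assumes CPython, where sys.maxsize reports the largest value a Py_ssize_t can hold; the helper name index_fits is made up for illustration, and it relies on the wording of CPython's error message, which other implementations may not share.

```python
import struct
import sys

# Width of a C pointer on this build: 32 or 64 bits.
pointer_bits = struct.calcsize("P") * 8

# Largest value a Py_ssize_t can hold on this build (CPython).
max_index = sys.maxsize


def index_fits(n):
    """Hypothetical helper: True if n can be converted to an index-sized integer."""
    try:
        ""[n]
    except IndexError as exc:
        # "string index out of range" means the conversion itself succeeded;
        # "cannot fit 'int' into an index-sized integer" means it overflowed.
        return "cannot fit" not in str(exc)
    return True


if __name__ == "__main__":
    print(pointer_bits, max_index)
    print(index_fits(sys.maxsize), index_fits(sys.maxsize + 1))
```

On a 64-bit build, max_index should equal 2 ** 63 - 1, matching the limit described above.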

It also seems like input() explicitly checks if the input supplied is larger than PY_SSIZE_T_MAX (the maximum size of Py_ssize_t) before converting:

if (len > PY_SSIZE_T_MAX) {
    PyErr_SetString(PyExc_OverflowError,
                    "input: input too long");
    result = NULL;
}

Then it converts the input to a Python str with PyUnicode_Decode.
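The order of operations described above (read a line, check its length, decode, strip the trailing newline) can be sketched in Python. This is only an illustration, not CPython's actual implementation: the function name read_line_like_input is hypothetical, the UTF-8 default is an assumption, and the length check can never trigger in pure Python because a bytes object is itself bounded by sys.maxsize.

```python
import io
import sys


def read_line_like_input(stream, encoding="utf-8"):
    """Hypothetical helper: read one raw line, check its length, then decode it."""
    raw = stream.readline()
    if not raw:
        # Nothing left to read: input() signals this with EOFError.
        raise EOFError("EOF when reading a line")
    if len(raw) > sys.maxsize:
        # Mirrors the C-level check; unreachable in pure Python.
        raise OverflowError("input: input too long")
    text = raw.decode(encoding)
    # Strip a single trailing newline, as the documentation describes.
    return text[:-1] if text.endswith("\n") else text


if __name__ == "__main__":
    print(read_line_like_input(io.BytesIO(b"hello\nworld\n")))
```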


To put that in perspective: if the average book is 500,000 characters long and the total number of books is estimated at around 130 million, you could theoretically input around:

>>> ((2 ** 63) - 1) // (500000 * 130000000)
141898

times those characters; it would probably take you some time, though :-) (and you'd be limited by available memory first!)
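To get a feel for the memory side of this, the sketch below compares the size of two equally long strings. It assumes CPython 3.3 or later, where the flexible string representation (PEP 393) stores each character in 1, 2, or 4 bytes depending on the widest character in the string; the variable names are illustrative only.

```python
import sys

LENGTH = 100_000

# ASCII-only text: CPython stores one byte per character.
ascii_line = "a" * LENGTH

# Text with a character outside the Basic Multilingual Plane:
# CPython stores four bytes per character.
wide_line = "\U0001F600" * LENGTH

if __name__ == "__main__":
    print(len(ascii_line), sys.getsizeof(ascii_line))
    print(len(wide_line), sys.getsizeof(wide_line))
```

Both strings have the same length, but the second needs roughly four times the memory, so the practical ceiling on what input() can return depends on the content as well as the character count.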

Dimitris Fasarakis Hilliard
  • 5
    [How long is a \[piece of\] string?](https://en.wiktionary.org/wiki/how_long_is_a_piece_of_string) – wim Nov 14 '16 at 22:09
  • I tried this in an interactive session (linux xterm) and got very odd reactions (after pasting some 30k characters using the middle mouse button), looks like a very slow print of the string (1 line per second). Probably not Python's problem here but slowdown in readline/xterm or do you have another idea what's causing this? – mkiever Nov 14 '16 at 22:35
  • I agree with you, it doesn't make sense for it to be Python's fault. I really doubt interactive sessions were built to smoothly handle a 30k character dump, but then again I haven't looked into those @mkiever – Dimitris Fasarakis Hilliard Nov 14 '16 at 22:48
  • 2
    I guess this actually breaks some part of the interactive session. It's still printing and can't be stopped with Ctrl-c, had to be killed. I'll need to check this in more detail one of these days. – mkiever Nov 14 '16 at 22:57
  • 1
    That number is still too large to make any sense to me. If every character of the book would contain the entire google index, then you'd have about 2-4 books. – phihag Nov 15 '16 at 05:58
  • Checking with one big book like the Bible: a quick Google search tells me that it has 3,116,480 letters in 783,137 words. For simplicity, let's say it has as many spaces as words and as many other punctuation marks as words too, so it has about 4,682,754 characters. Then you can input ((2**63)-1)//4682754 = 1,969,646,929,318 Bibles... – Copperfield Nov 17 '16 at 00:05
16

We can find the answer experimentally quite easily. Make two files:

make_lines.py:

num_lines = 34

if __name__ == '__main__':
    for i in range(num_lines):
        print('a' * (2 ** i))

read_input.py:

from make_lines import num_lines

for i in range(num_lines):
    print(len(input()))

Then run this command in Linux or OSX (I don't know the Windows equivalent):

python3 make_lines.py | python3 read_input.py

On my computer it manages to finish but struggles by the end, slowing down other processes significantly. The last thing it prints is 8589934592, i.e. a single line of 2 ** 33 characters (8 GiB at one byte per character). You can find the limit for yourself according to what you consider acceptable in terms of time and memory.
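If a pipe isn't convenient, a similar experiment can be run in a single process by temporarily replacing sys.stdin with an in-memory stream. This is a sketch under a few assumptions: the function name input_lengths is made up, it exercises only the non-interactive code path of input() (not a terminal), and it holds all the test data in memory at once, so keep the exponent small.

```python
import io
import sys


def input_lengths(max_exponent):
    """Feed lines of length 2**0 .. 2**(max_exponent - 1) to input(); return the lengths read."""
    lines = "".join("a" * (2 ** i) + "\n" for i in range(max_exponent))
    original_stdin = sys.stdin
    sys.stdin = io.StringIO(lines)  # input() falls back to reading from sys.stdin
    try:
        return [len(input()) for _ in range(max_exponent)]
    finally:
        sys.stdin = original_stdin  # always restore the real stdin


if __name__ == "__main__":
    # Longest line here is 2 ** 20 characters (1 MiB of ASCII text).
    print(input_lengths(21)[-1])
```

Raising max_exponent doubles the longest line each time, so memory use grows quickly; the two-file pipe above scales further because the producer only ever holds one line.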

Alex Hall