
I am slightly confused by how the common Python function raw_input operates.

There appears to be no restriction on how many characters I can enter. The function's help text also does not take a maximum number of characters as an argument, as shown below (it only accepts an optional prompt message).

raw_input(...)
    raw_input([prompt]) -> string

    Read a string from standard input.  The trailing newline is stripped.
    If the user hits EOF (Unix: Ctl-D, Windows: Ctl-Z+Return), raise EOFError.
    On Unix, GNU readline is used if enabled.  The prompt string, if given,
    is printed without a trailing newline before reading.

How does Python prevent a buffer overflow attack, or any attempt to consume excessive memory, when data is read from the user as a string (which is basically an array of chars), as shown below?

>>> r = raw_input("enter something:")
enter something: dfjdfldfkdflkjdflkdjflkjfdlfdjklfdkjfdlkjfdlkfjdlkdfjlfdj.....
>>> print r
dfjdfldfkdflkjdflkdjflkjfdlfdjklfdkjfdlkjfdlkfjdlkdfjlfdj.....

Thanks and Kind Regards,

John

  • Please upgrade. By the way, the equivalent in contemporary Python is called `input()`. Don't bother learning an obsolete language. – Ulrich Eckhardt Oct 04 '15 at 12:34
  • @UlrichEckhardt yes, forget Python 2.7, because just like C it's an obsolete language and there's no use in learning it, let alone using it; and just as for C, all these crazy predictions came true... C# for everyone!!! – user1514631 Oct 04 '15 at 12:37
  • The OP's question applies to Python 3 `input()`; let's not get sidelined into a Python 2 vs Python 3 argument. – PM 2Ring Oct 04 '15 at 12:40
  • C#, C++ and D are derived languages, not a new major version that obsoletes the one before. Considering that Python 3 is stable, widely available and has an active support base, there's really no reason to learn Python 2 any more. Even more so if you're not doing it to maintain legacy systems built on top of Python 2 but are trying to learn programming. – Ulrich Eckhardt Oct 04 '15 at 12:49
  • @UlrichEckhardt, I agree, but the question itself is generic enough to apply to many Python versions and even to different languages altogether. Recommending an upgrade outright and dismissing the problem is not constructive. Furthermore, as you mentioned, there can be valid reasons to stick with Python 2, and the OP didn't explicitly mention why he's asking the question. – user1514631 Oct 04 '15 at 12:54

2 Answers


Buffer overflow attacks are a different topic and don't apply here as long as the implementation of raw_input is correct (meaning it never writes beyond the buffer it has allocated for storing the input). Let's assume the implementation of raw_input is safe.

Like many structures in Python, raw_input stores its input in a dynamically allocated, dynamically growing buffer. The initial buffer is normally small (perhaps a few dozen elements), and as it fills up it is extended (reallocated with a larger size to accommodate more elements).
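
The growth strategy described above can be sketched roughly as follows. This is an illustration in Python 3 only, not CPython's actual implementation; the helper name `read_line_growing`, the initial capacity of 16 and the doubling factor are all assumptions chosen for the example.

```python
import io


def read_line_growing(stream, initial_capacity=16):
    """Read one line into a buffer that doubles when full (illustrative only)."""
    capacity = initial_capacity
    buf = bytearray(capacity)      # small buffer allocated up front
    length = 0                     # number of bytes actually stored
    reallocations = 0
    while True:
        byte = stream.read(1)
        if not byte or byte == b"\n":   # EOF or end of line
            break
        if length == capacity:
            # Buffer is full: double the capacity and pad with zero bytes.
            capacity *= 2
            buf.extend(bytes(capacity - len(buf)))
            reallocations += 1
        buf[length] = byte[0]
        length += 1
    return bytes(buf[:length]), capacity, reallocations


stream = io.BytesIO(b"a" * 100 + b"\n")
data, capacity, reallocations = read_line_growing(stream)
print(len(data), capacity, reallocations)  # 100 128 3
```

Because the buffer only grows as far as the input requires, there is no fixed-size array to overrun; the cost is that nothing in this loop stops the input from growing until memory runs out.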

There is certainly a hard limit imposed by the OS, the hardware and the implementation itself. On a 32-bit platform running a 32-bit Python, the limit is at most 2**32-1 bytes (just under 4 GiB) of address space, and likely closer to 2 GiB in practice.

In the worst case, Python could exhaust system memory if the OS enforces no per-process limits. Even then, on Linux for example, the OOM killer will typically kill the process with the highest memory use, which could well be the misbehaving Python process (but could also be another, legitimate process).
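
If unbounded input is a concern, a script can cap the length itself instead of relying on the OS. Below is a minimal sketch in Python 3, assuming a hypothetical helper `read_line_limited` and an arbitrary default of 1024 characters. It relies on the `size` argument of a file object's `readline()`, which returns at most that many characters.

```python
import io
import sys


def read_line_limited(stream=sys.stdin, max_chars=1024):
    """Read one line, rejecting input longer than max_chars (hypothetical helper)."""
    # Ask for one character more than allowed: if we get it and it is not
    # the newline, the line is too long and we stop without reading the rest.
    line = stream.readline(max_chars + 1)
    if not line:
        raise EOFError("no input")
    if len(line) > max_chars and not line.endswith("\n"):
        raise ValueError("input longer than %d characters" % max_chars)
    return line.rstrip("\n")


print(read_line_limited(io.StringIO("hello\n"), max_chars=10))  # hello
```

The remainder of an over-long line is left unread in the stream, so a real program would have to decide whether to discard it or abort.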

user1514631

The hard limit of Python string length can be found in sys.maxsize:

The largest positive integer supported by the platform’s Py_ssize_t type, and thus the maximum size lists, strings, dicts, and many other containers can have.

On a 32-bit system, sys.maxsize is 2147483647, i.e., 2³¹-1. Of course, memory limitations may apply before you reach that size.

If Python cannot create an object due to insufficient memory, a MemoryError exception is raised. If you have sufficient memory but attempt to exceed sys.maxsize, an OverflowError is raised.
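
As a quick check of the limit on your own interpreter, here is a small Python 3 sketch. The helper name `try_oversized_string` is made up for this example, and which of the two exceptions you get may depend on the implementation and version, so the code accepts either.

```python
import sys

# 2**31 - 1 on a 32-bit build, 2**63 - 1 on a typical 64-bit build.
print(sys.maxsize)


def try_oversized_string():
    """Return the name of the exception raised for a too-long string."""
    try:
        # One element more than any container is allowed to hold.
        "x" * (sys.maxsize + 1)
    except (OverflowError, MemoryError) as exc:
        return type(exc).__name__
    return None


print(try_oversized_string())
```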

If you can read C, you may be interested in the source code for raw_input and/or Python 3's input, both of which are linked in "Where is raw_input implemented in the cpython source code?". However, raw_input (and Python 3's input) calls the readline() function from the GNU Readline library when it is available, so you'd need to delve into that to fully answer your question.

FWIW, if your Python script takes console input and runs on a Unix-like system, it's a nice idea to import readline, which makes Readline's editing facilities available when entering data at the raw_input / input prompt.
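
A minimal sketch of that idea in Python 3 (use `raw_input` in place of `input` on Python 2). The `HAVE_READLINE` flag and the `ask` helper are names invented for this example; the guarded import is there because the module may be missing on some platforms, such as Windows.

```python
try:
    import readline  # imported only for its side effect on input()
    HAVE_READLINE = True
except ImportError:
    # The module is not available everywhere, e.g. on Windows.
    HAVE_READLINE = False


def ask(prompt):
    """Prompt for a line; line editing works if readline was imported."""
    return input(prompt)
```

Calling `ask("enter something: ")` then behaves like a plain `input()` call, with arrow-key editing and history when Readline is active.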


I suppose I ought to mention that Python 2 also supplies a function named input(), which is essentially eval(raw_input()). This function is potentially dangerous, and should generally be avoided.
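
If the goal is to accept Python literals (numbers, lists, dicts and so on) from the user without evaluating arbitrary expressions, one commonly used alternative is `ast.literal_eval`. The sketch below is Python 3, and `read_literal` is a hypothetical wrapper name.

```python
import ast


def read_literal(text):
    """Parse text as a Python literal without executing any code."""
    return ast.literal_eval(text)


print(read_literal("[1, 2, 3]"))  # [1, 2, 3]

try:
    # A function call is not a literal, so it is rejected, not executed.
    read_literal("__import__('os').getcwd()")
except ValueError:
    print("rejected: not a plain literal")
```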

PM 2Ring