
I am trying to generate a PDF on Windows using an HTML-to-PDF web service in Python 2.x. The question "Python 2.x - Write binary output to stdout?" says that I need to put stdout into binary mode if I am writing binary data to it.

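import sys

def generate_pdf():
    # callservice() and html stand for the HTML-to-PDF web service call (not shown here)
    pdf = callservice(html)
    if pdf is not None and sys.platform == "win32":
        import os, msvcrt
        # Put stdout (file descriptor 1) into binary mode
        msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    return pdf

def process():
    pdf = generate_pdf()
    # This raises IOError: [Errno 12] Not enough space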

The E:\ drive, where this program runs, has 10 GB available, and the C:\ drive also has 10 GB available. Does anyone know what could be happening? Should we look into the source code of msvcrt to see what is going on? I am trying to check that.

Nishant
  • Are you writing the PDF to stdout, which has been redirected to a file on either drive `E:` or drive `C:`? – Eryk Sun Nov 16 '16 at 16:05
  • sys.stdout.fileno() is just 1, I run this script from a Shell as python file.py, does that answer your question? – Nishant Nov 16 '16 at 16:08
  • If you're not redirecting stdout to a disk file, I see no reason to change it to binary mode. – Eryk Sun Nov 16 '16 at 16:16
  • Where is your script trying to create the file? In the working directory? The working directory depends on how python.exe is started. It has nothing to do with the location of "file.py". – Eryk Sun Nov 16 '16 at 16:17
  • At the moment I am able to reproduce this in a PDB when it is invoked as python file.py . So I don't see any files involved unless 1 itself is a file. Btw I will check about the point you mentioned as to why we need to do this if we are not writing to disk (we ideally write to DB). Probably there is a scenario where we write to Temporary File though later but does it matter here I wonder. – Nishant Nov 16 '16 at 16:21
  • @eryksun Do we use this if we write to stdout or a Temporary file? – Nishant Nov 16 '16 at 16:41
  • The Windows C runtime implements both low (POSIX) and standard I/O. Python 2.x `file` objects wrap C standard I/O `FILE` streams, which wrap POSIX integer file descriptors (which are in turn mapped to the actual Windows `File` handles). The `fileno` method returns the file descriptor. Initially the C runtime's 3 standard `FILE` streams -- `stdin`, `stdout`, and `stderr` -- are opened as file descriptors 0, 1, and 2, and initially Python's `sys.stdin`, `sys.stdout`, and `sys.stderr` wrap the corresponding C `FILE` streams. – Eryk Sun Nov 16 '16 at 16:42
  • Indeed @eryksun I am getting IOError Errno 12 not enough space, it was my mistake to assume it is something else. – Nishant Nov 17 '16 at 10:54
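
The comments above suggest that switching stdout to binary mode only makes sense when it has been redirected to a disk file. A guard along the following lines might capture that advice. This is a minimal sketch, not the asker's code: the helper name `set_binary_if_redirected` is hypothetical, and it assumes that `os.isatty()` is a good enough test for "this descriptor is a console".

```python
import os
import sys


def set_binary_if_redirected(stream):
    """Switch the stream's file descriptor to binary mode on Windows,
    but only when it does not look like a console.

    Returns True if the mode was changed, False otherwise.
    """
    if sys.platform != "win32":
        # Only the Windows C runtime distinguishes text and binary mode.
        return False
    try:
        fd = stream.fileno()
    except (AttributeError, ValueError, IOError, OSError):
        # In-memory or replaced streams have no usable descriptor.
        return False
    if os.isatty(fd):
        # Assumption: a tty-like descriptor is a console; leave it alone.
        return False
    import msvcrt  # Windows-only module, imported lazily
    msvcrt.setmode(fd, os.O_BINARY)
    return True
```

A script would call `set_binary_if_redirected(sys.stdout)` once before writing the PDF bytes, so that running it interactively leaves the console in its default mode.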

1 Answer


This answer explains what is going on in principle; the traceback would reveal the exact call that failed.

In particular, calling sys.stdin.read() for a block of data larger than 32767 bytes raises IOError "[Errno 12] Not enough space" when there is not enough data to read. Consider running the following example on Windows 7:

python -c "import sys; data = sys.stdin.read(32768)"
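
Because the message text is easy to misremember, it may help to inspect the numeric `errno` on the exception instead. The sketch below is illustrative only: `classify_io_error` is a hypothetical helper, and it distinguishes just two codes, `ENOMEM` (12) and `ENOSPC` (28).

```python
import errno


def classify_io_error(exc):
    """Return a short label for the errno carried by an I/O exception."""
    code = getattr(exc, "errno", None)
    if code == errno.ENOMEM:
        # 12: a memory/buffer allocation problem, not a full disk
        return "ENOMEM"
    if code == errno.ENOSPC:
        # 28: the device really has no space left
        return "ENOSPC"
    return "other"
```

Wrapping the failing call in `try: ... except IOError as exc:` and logging `classify_io_error(exc)` alongside the traceback would show which of the two conditions actually occurred.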
void
  • This is the problem when people paraphrase error messages. The OP says it's an `IOError` "No Disk Space". But the Windows CRT doesn't have such a message. There's `ENOMEM` (12, "Not enough space") and `ENOSPC` (28, "No space left on device"). – Eryk Sun Nov 16 '16 at 17:20
  • You're giving an example of `ENOMEM`. Apparently you're using Windows 7 or earlier, which uses the old ALPC console subsystem. This sets up a 64K shared heap between conhost.exe and the client process, python.exe. You're getting a memory related error from trying to allocate a read buffer that's bigger than the largest available block in the shared heap. Windows 8+ has a totally different console subsystem that uses a driver instead of ALPC and doesn't have this problem. – Eryk Sun Nov 16 '16 at 17:22
  • Thank you for the correction, you are right in all details. My example is not very relevant to the question. – void Nov 16 '16 at 17:36
  • You were too hasty to assume the example isn't relevant. Nishant's code is getting an `ENOMEM` error -- probably from writing to `stdout` in binary mode when it's a console. Prior to Windows 8, this is limited to the largest free block in the 64K heap that's shared with conhost.exe. Thus far in the comments I haven't seen a reason to set `stdout` to binary mode -- which would be fine if `stdout` were redirected to a disk file or pipe, but not a console. – Eryk Sun Nov 17 '16 at 10:40
  • @void I think there is a connection with your example. eryksun helped me correct my traceback, which was indeed "No Space Error". I will try to understand this better. The PDF is bigger than 64k, around 100k in my case, and I am using stdout, not stdin. – Nishant Nov 17 '16 at 10:56
  • @Nishant, are you actually writing the pdf to `stdout`? If so, redirect to a file, e.g. `python script.py > output.pdf`. The shell will open `output.pdf` and pass the handle to python.exe as its `StandardOutput` handle. At startup the C runtime in Python will map this to low I/O file descriptor 1 in text mode, which is wrapped by the C `stdout` stream, which is wrapped by Python `sys.stdout`. Then your script has to change file descriptor 1 (i.e. `sys.stdout.fileno()`) to binary mode, as you're already doing, to write to "output.pdf" without the C runtime converting `"\n"` to `"\r\n"`. – Eryk Sun Nov 17 '16 at 11:09
  • Ok @eryksun, this particular issue happens for PDF's of `>64k` size in `Windows 7` with the only workaround being writing to a file like `output.pdf`? Since we can't do `output.pdf` as it is a Web Application Server that actually runs and not a script like this probably we would avoid this. I don't think this is needed when writing to Temporary Files just in case we are doing that as well. – Nishant Nov 17 '16 at 11:42
  • @Nishant, but the error seems to be from writing to the console, and I can't see how writing out the contents of a PDF to the console would ever be useful. If it's going to a database it should be rendered in memory or a temporary file. – Eryk Sun Nov 17 '16 at 11:52
  • @void you can perhaps remove the strike through and I will accept the answer. – Nishant Nov 17 '16 at 12:00
  • @eryksun: what is more interesting, if there is enough data to read, setting a large buffer size does not cause ENOMEM; the following works fine: """python -c "print 'A' * 32768" | python -c 'import sys; data= sys.stdin.read(32768)'""" – void Nov 17 '16 at 15:07
  • In that case stdin is a pipe not a console. An I/O operation on a kernel `File` object that's opened for a real device (e.g. `\Device\NamedPipe` for a pipe) may copy the user-mode buffer to one in kernel address space or it might simply operate in the context of the calling process to read the user-mode buffer directly. The only limit is overall system resources, unlike with the tiny shared heap between the client and conhost.exe for Windows 7 console files. – Eryk Sun Nov 17 '16 at 16:17