
I'm trying to reason about the CPython source and got curious about the built-in open() function.

This method is defined in _pyio.py and returns a FileIO object, so I dug through source and found that (on Windows) there is a call to _wopen (source line). Interestingly enough, I stumbled into fileutils.c where _Py_open is defined and subsequently _Py_open_impl. The latter makes a call to open (source line) which has a different signature than _wopen which I presume is referencing _wfopen; however, below that there are _Py_wfopen, _Py_fopen and _Py_fopen_obj. Their comment lines seem to indicate that they are wrappers around the C functions provided from #include's, so I know they're calling the originals and extending their functionality.

I'm not a C person by any means, mostly I can dig around code for debugging. This, however, has me lost. How are all these methods tied together (on Windows)? So far I have:

open() -> io.py -> _pyio.py (_io) -> _iomodule.c -> ?

I'm not seeing where _Py_fopen or _Py_wfopen are explicitly called (or used to wrap library functions) other than in main.c for startup file operations.

  • You're looking at the wrong implementation. The implementation in `_pyio.py` isn't actually used for anything but tests. – user2357112 Sep 17 '20 at 16:03
  • The `fileutils.c` stuff isn't relevant either. That's used by other stuff, not by the Python-level `open` function. – user2357112 Sep 17 '20 at 16:08
  • You can get some information by running a native code debugger and breaking at the underlying system call. On Linux/x86_64, the system call is called `open` and the file name is [passed in the register `$rdi`](https://stackoverflow.com/questions/26892091/gdb-conditional-break-on-function-parameter/26892534#26892534) so: `gdb -ex "break open if ((char*)\$rdi)[0]=='w'" -ex run -ex backtrace --args /usr/bin/python3 -c 'open("wibble")'` gives me a native code backtrace for `open("wibble")`. How informative this is depends on how the Python interpreter was compiled. – Gilles 'SO- stop being evil' Sep 17 '20 at 16:22
  • @user2357112supportsMonica What do you mean it's only used for testing? When I use `inspect.getmodule(open)` it returns *io.py*. So where is the implementation of the built-in `open` if not there? –  Sep 19 '20 at 00:06
  • @pasta_sauce: It returns `io.py`. *Not* `_pyio.py`. `_pyio.py` is only used for testing. – user2357112 Sep 19 '20 at 00:19
  • @user2357112supportsMonica Inside `io.py`, the `open` method is imported from `_io`, which is where? –  Sep 21 '20 at 12:02
  • That's `_iomodule.c`. – user2357112 Sep 21 '20 at 12:04
  • @user2357112supportsMonica And as I dive further into the internals of Python I find out more; that leads me to ask: where is `_iomodule.c`? I've downloaded a fresh Python executable and installed it. I can see `io.py`, but I don't see `_iomodule.c`. Presumably, it's either compiled inside the interpreter or in `python3.dll` or `python36.dll`. If so, how does the process work to expose the builtin `open` from wherever that C file is? –  Sep 21 '20 at 12:07
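
The relationship discussed in these comments can be checked from Python itself. The following is a minimal sketch, assuming a standard CPython 3 build; `_io` and `_pyio` are private, implementation-specific modules, and the variable names are only illustrative. `sys.builtin_module_names` lists the modules compiled into the interpreter, which is one way to see whether `_io` ships as compiled code rather than as a `.py` file.

```python
import builtins
import io
import sys

import _io    # C implementation behind io (CPython-specific, private)
import _pyio  # pure-Python implementation kept in the standard library

# The built-in open, io.open and _io.open are expected to be one object.
same_as_io = builtins.open is io.open
same_as_c = builtins.open is _io.open

# _pyio.open is a separate, pure-Python function.
same_as_pyio = builtins.open is _pyio.open

# True when _io is compiled into the interpreter itself.
compiled_in = "_io" in sys.builtin_module_names

print(same_as_io, same_as_c, same_as_pyio, compiled_in)
```

If the first two values are `True` and the third is `False`, the `open` you normally call is the C one, and `_pyio.py` is not on that path.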

1 Answer


The latter makes a call to open (source line) which has a different signature than _wopen which I presume is referencing _wfopen

What you mean is not clear but the call to open is referencing the unix open(2) syscall, nothing to do with Windows.

open() -> io.py -> _pyio.py (_io) -> _iomodule.c -> ?

_iomodule.c defines _io_open_impl which instantiates PyFileIO_Type from fileio.c.

That actually opens a file in _io_FileIO___init___impl which, after some faffing around, simply calls _wopen on Windows and open(2) elsewhere: https://github.com/python/cpython/blob/master/Modules/_io/fileio.c#L383
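
To see the FileIO object at the bottom of this chain from the Python side, here is a small sketch (assuming CPython 3; the temporary file and the variable names are only for illustration). It opens a file in text mode and walks down the wrapper layers, then opens it unbuffered so that `open` hands back the raw object directly.

```python
import io
import os
import tempfile

# Create an empty temporary file to open.
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    # Text mode: a text wrapper around a buffer around the raw file object.
    with open(path, "r") as text_file:
        layers = [
            type(text_file).__name__,
            type(text_file.buffer).__name__,
            type(text_file.buffer.raw).__name__,
        ]
        raw_is_fileio = isinstance(text_file.buffer.raw, io.FileIO)

    # Binary mode with buffering disabled: open() returns the raw object.
    with open(path, "rb", buffering=0) as raw_file:
        unbuffered_type = type(raw_file).__name__
finally:
    os.remove(path)

print(layers, raw_is_fileio, unbuffered_type)
```

The raw layer is the object whose initializer ends up making the `_wopen` or open(2) call described above; the buffering and text layers sit on top of it.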

Masklinn
  • So is the built-in `open` defined in `_iomodule.c`? Its sequence matches what's in `_pyio.py`. –  Sep 19 '20 at 00:07
  • Yes. The io module was originally implemented in pure Python. `_pyio` is that, and was largely deprecated when the module was reimplemented in C for performance, memory, … `_pyio` remains as a convenient way to test features and as a copy for alternate implementations (e.g. a third-party implementation can just grab `_pyio` and get a working io module). The `open` builtin you normally get is the one which lives in `_iomodule.c`. – Masklinn Sep 19 '20 at 09:27