Using function interposition for open() with Python doesn't seem to work after the first few calls. I suspect Python is doing some kind of initialization, or something is temporarily bypassing my function.

Here the open call is clearly hooked:

$ cat a
hi
$ LD_PRELOAD=./libinterpose_python.so cat a
sandbox_init()
open()
hi

Here it happens once during Python initialization:

$ LD_PRELOAD=./libinterpose_python.so python
sandbox_init()
Python 2.7.2 (default, Jun 12 2011, 20:20:34) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
open()
>>> 
sandbox_fini()

Here it doesn't happen at all, and there's no error to indicate the file handle had write privileges removed:

$ LD_PRELOAD=./libinterpose_python.so python3 -c 'b = open("a", "w"); b.write("hi\n"); b.flush()'
sandbox_init()
sandbox_fini()

The code is here. Build with make -f Makefile.interpose_python.

A full solution is given here.

Matt Joiner
  • One question, though this gets you no closer to solving your problem... Why don't you set up `next_open` in `sandbox_init`? – Omnifarious Jun 21 '11 at 07:51
  • Is it possible that Python is statically compiled? – X-Istence Jun 21 '11 at 08:41
  • @Omnifarious: I was so paranoid that I was doing something wrong that I basically copied an example from the net verbatim. I definitely intended to do it that way, however. – Matt Joiner Jun 21 '11 at 09:36
  • @X-Istence: If that's the default mode for Python, then you've nailed it, but it seems unlikely. Not much statically compiles libc, but I'll check. – Matt Joiner Jun 21 '11 at 09:37
  • @Matt Joiner: I was just thinking out loud =). Seems @zvrba has figured it out =) – X-Istence Jun 21 '11 at 16:13
  • Linux really needs a way to interpose your own system call handling layer when launching a process. I've come to the conclusion that the system call API is a singleton with all the attendant headaches and security risks. – Omnifarious Jun 21 '11 at 18:33
  • @Omnifarious: What do you mean? You want to interpose without using LD_PRELOAD? – Matt Joiner Jun 22 '11 at 00:38
  • @Matt Joiner: Yes. LD_PRELOAD is an unreliable way to interpose. Someone could just invoke the system call directly using the appropriate assembly instructions. I want OSes to be more capability based. A program won't have access to anything it wasn't given by its runtime environment, and that's enforced at the OS level. – Omnifarious Jun 22 '11 at 05:35
  • @Omnifarious: Ptrace... Also I've seen sandboxing that hooks the system call interface somehow. – Matt Joiner Jun 22 '11 at 10:49
  • @Omnifarious: And here is a project that uses it, bathe in teh glory: http://fakeroot-ng.lingnu.com/index.php/PTRACE_LD_PRELOAD_comparison – Matt Joiner Jun 24 '11 at 00:28
  • @MattJoiner: Your Solution section should go in an Answer. – bukzor Dec 18 '11 at 04:00
  • @bukzor: Done, thanks. http://stackoverflow.com/a/8549881/149482 – Matt Joiner Dec 18 '11 at 04:49

3 Answers

There are open() and open64() functions, you might need to redefine both.
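
As a rough illustration of redefining both functions, here is a minimal sketch of an interposer. It is not the asker's actual code: the `readonly_flags` policy, the `open_fn` typedef and the lazy `resolve_next` helper are assumptions made for this example, while `sandbox_init`, `next_open` and the `open()`/`open64()` trace lines follow the names seen in the question and its comments.

```c
#define _GNU_SOURCE             /* needed for RTLD_NEXT and open64() */
#include <assert.h>
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>

typedef int (*open_fn)(const char *, int, ...);

static open_fn next_open;       /* the real open() from the next object */
static open_fn next_open64;     /* the real open64() from the next object */

/* Assumed policy for this sketch: force read-only access and leave all
 * other flags untouched. */
static int readonly_flags(int flags) {
    return (flags & ~O_ACCMODE) | O_RDONLY;
}

/* Look up the real functions once. */
static void resolve_next(void) {
    next_open = (open_fn)dlsym(RTLD_NEXT, "open");
    next_open64 = (open_fn)dlsym(RTLD_NEXT, "open64");
    assert(next_open != NULL && next_open64 != NULL);
}

__attribute__((constructor))
static void sandbox_init(void) {
    fprintf(stderr, "sandbox_init()\n");
    resolve_next();
}

/* The mode argument is only present when O_CREAT is set
 * (O_TMPFILE is ignored here to keep the sketch short). */
int open(const char *path, int flags, ...) {
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    if (next_open == NULL)      /* called before our constructor ran */
        resolve_next();
    fprintf(stderr, "open()\n");
    return next_open(path, readonly_flags(flags), mode);
}

int open64(const char *path, int flags, ...) {
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    if (next_open64 == NULL)    /* called before our constructor ran */
        resolve_next();
    fprintf(stderr, "open64()\n");
    return next_open64(path, readonly_flags(flags), mode);
}
```

Built as a shared object (for example with `gcc -shared -fPIC interpose.c -o libinterpose_python.so -ldl`, where `interpose.c` is a hypothetical file name), this would be loaded through LD_PRELOAD as in the question. Note that `openat()` and `openat64()` would need the same treatment to catch every path into the kernel.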

zvrba

You should be able to find out what your python process is actually doing by running it under strace (probably without your pre-load).

My python3.1 (on AMD64) does appear to use open:

axa@ares:~$ strace python3.1 -c 'open("a","r+")'
...
open("a", O_RDWR)                       = -1 ENOENT (No such file or directory)
Andrew Aylett
  • Curiously, it also attempts to open a file called ``, and attempts to use it as a TTY if it exists... – Andrew Aylett Jun 21 '11 at 10:46
  • I've actually tried this already, it still uses open as expected, none of my changes appear to be made either. They are made however on some other programs (and amusing crashes abound). – Matt Joiner Jun 21 '11 at 11:09
  • That sounds extremely strange; mine doesn't. You should have that looked at. – Teddy Jun 21 '11 at 11:10
  • Strace shows system calls, whereas LD_PRELOAD affects library functions. They are often closely related (the `open()` libc function makes the `open()` syscall), but nothing prevents the `open()` syscall from being made by any other library function. E.g. `opendir()` uses the `open()` syscall too. – Jacek Konieczny Jun 21 '11 at 12:08
  • @Jacek Konieczny: That explains a lot. The open64 library call is turned into the open system call as well, which is why strace only ever shows open. – Matt Joiner Jun 22 '11 at 00:39
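
To make the distinction in these comments concrete, here is a small sketch of code that reaches the kernel without ever calling the `open()` library function, so an interposed `open()` or `open64()` would not see it. The function name `raw_open` is made up for this example, and using `SYS_openat` rather than another syscall number is an assumption about the target platform.

```c
#define _GNU_SOURCE             /* for syscall() */
#include <assert.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open a file by issuing the openat system call directly.
 * strace would show this call, but a hook on the libc open() or
 * open64() symbols would not be invoked. */
int raw_open(const char *path, int flags) {
    assert(path != NULL);
    return (int)syscall(SYS_openat, AT_FDCWD, path, flags);
}
```

`syscall()` is itself a libc function and could be interposed too, but code that uses inline assembly, or a statically linked binary, would bypass LD_PRELOAD entirely.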

It turns out there is an open64() function:

$ objdump -T /lib32/libc.so.6  | grep '\bopen'
00064f10 g    DF .text  000000fc  GLIBC_2.4   open_wmemstream
000cc010 g    DF .text  0000007b  GLIBC_2.0   openlog
000bf6d0  w   DF .text  000000b6  GLIBC_2.1   open64
00094460  w   DF .text  00000055  GLIBC_2.0   opendir
0005f9b0 g    DF .text  000000d9  GLIBC_2.0   open_memstream
000bf650  w   DF .text  0000007a  GLIBC_2.0   open
000bf980  w   DF .text  00000081  GLIBC_2.4   openat
000bfb90  w   DF .text  00000081  GLIBC_2.4   openat64

The open64() function is a part of the large file extensions, and is equivalent to calling open() with the O_LARGEFILE flag.
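
A quick way to check that equivalence is to open the same file both ways and compare the resulting file status flags. The sketch below assumes Linux with glibc; the helper name `same_status_flags` is invented for this example.

```c
#define _GNU_SOURCE             /* for open64() and O_LARGEFILE */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if open64(path) and open(path, ... | O_LARGEFILE) yield
 * descriptors with identical status flags, 0 if they differ,
 * and -1 if either open fails. */
int same_status_flags(const char *path) {
    assert(path != NULL);
    int a = open64(path, O_RDONLY);
    int b = open(path, O_RDONLY | O_LARGEFILE);
    int result = -1;
    if (a >= 0 && b >= 0)
        result = (fcntl(a, F_GETFL) == fcntl(b, F_GETFL));
    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    return result;
}
```

On 64-bit Linux the distinction mostly disappears at the system call level, but both symbols are still exported by libc, so an interposer has to cover each of them.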

Running the example code with the open64 section uncommented gives:

$ LD_PRELOAD=./libinterpose_python.so python3 -c 'b = open("a", "w"); b.write("hi\n"); b.flush()'
sandbox_init()
open64()
open64()
open64()
Traceback (most recent call last):
  File "<string>", line 1, in <module>
open64()
open64()
open64()
open64()
open64()
open64()
open64()
IOError: [Errno 9] Bad file descriptor
sandbox_fini()

This clearly shows all of Python's open calls, along with the errors that propagate because the write flag is stripped from those calls.

Matt Joiner