I want to open a shared object as a data file and perform a verification check on it. The verification is a signature check, and I sign the shared object. If the verification is successful, I would like to load the currently opened shared object as a proper shared object.

First question: is it possible to call dlopen and load the shared object as a data file during the signature check so that code is not executed? According to the man pages, I don't believe so since I don't see a flag similar to RTLD_DATA.

Since I have the shared object open as a data file, I have the descriptor available. Upon successful verification, I would like to pass the descriptor to dlopen so the dynamic loader loads the shared object properly. I don't want to close the file then re-open it via dlopen because it could introduce a race condition (where the file verified is not the same file opened and executed).

Second question: how does one pass an open file to dlopen using a file descriptor so that dlopen performs customary initialization of the shared object?

jww
  • I am not sure I understand your question, even though I replied to some points. You really should explain your motivations in much more detail. Why do you want to do all this? What kind of software are you thinking of? Show some use-case scenarios. – Basile Starynkevitch Apr 24 '13 at 05:44
  • You did **not** define your notion of "approved" or "authentic". You could have some independent way of storing the hash of "good" plugins somewhere else (e.g. a database, a configuration file), and check the hash code of the plugin before `dlopen`. Still there is probably no way to be fail-proof. Otherwise you could also generate yourself the C code of the plugin, do whatever checks you want during this C generation, then compile it and `dlopen` it. FWIW [GCC MELT](http://starynkevitch.net/Basile/gcc-melt/) is doing this. – Basile Starynkevitch Apr 24 '13 at 05:57
  • Please edit your question to improve it. – Basile Starynkevitch Apr 24 '13 at 17:03
  • Explain all the things in the comments, in a more structured way, within your edited question. Define in your question what "authentic" or "approved" plugins mean. – Basile Starynkevitch Apr 24 '13 at 17:32

1 Answer

On Linux, you could probably dlopen the path /proc/self/fd/15 (for file descriptor 15).
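
As a minimal sketch of that idea (Linux-specific, not a tested solution): the file is opened once, checked through the descriptor, and then handed to dlopen via its /proc entry. The names `verify_signature`, `fd_to_proc_path` and `dlopen_verified` are hypothetical, and `verify_signature` is only a placeholder for whatever signature check is actually used.

```c
#include <dlfcn.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical placeholder: a real check would read the file's bytes
   through fd and verify a signature over them. Here it accepts everything. */
static int verify_signature(int fd) { (void)fd; return 1; }

/* Format the /proc path that names an open descriptor of this process.
   Returns 0 on success, -1 if the buffer is too small. */
int fd_to_proc_path(int fd, char *buf, size_t len) {
    int n = snprintf(buf, len, "/proc/self/fd/%d", fd);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Open the file once, verify it through the descriptor, then ask dlopen
   to load that same descriptor by its /proc name. Returns NULL on failure. */
void *dlopen_verified(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    void *handle = NULL;
    char proc_path[64];
    if (verify_signature(fd) &&
        fd_to_proc_path(fd, proc_path, sizeof proc_path) == 0)
        handle = dlopen(proc_path, RTLD_NOW | RTLD_LOCAL);

    close(fd); /* the loader's mappings stay valid after the descriptor is closed */
    return handle;
}
```

This only addresses the case where the path is swapped for a different file between verification and loading. It does not help if another process can modify the file's contents in place. On older glibc versions the program may need to be linked with -ldl.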

RTLD_DATA does not seem to exist. So if you want it, you have to patch your own dynamic loader. Perhaps doing that within musl libc would be less hard. I still don't understand why you need it.

You have to trust the dlopen-ed plugin somehow (and it will run its constructor functions at dlopen time).

You could analyze the shared object plugin before dlopen-ing it using an ELF parsing library, perhaps libelf or libbfd (from binutils). But I still don't understand what kind of analysis you want to do, and you really should explain that; in particular, what happens if the plugin is indirectly linked to some badly behaved software? In other words, you should explain more about your verification step. Note that a shared object could overwrite itself.

Alternatively, don't use dlopen and just mmap your file (you'll need to parse some ELF and process relocations; see elf(5) and Levine's Linkers and Loaders for details, and look into the source code of your ld.so, e.g. in GNU glibc).
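
To illustrate only the very first step of that route, here is a rough sketch that maps the file read-only and looks at the ELF header described in elf(5). It does not load anything or process relocations. The function names are hypothetical, and it assumes the file's byte order matches the host's.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Returns 1 if the buffer starts with the ELF magic and its e_type is ET_DYN
   (shared object; position-independent executables use ET_DYN as well),
   otherwise 0. Assumes the file has the host's byte order. */
int looks_like_shared_object(const void *image, size_t len) {
    if (len < EI_NIDENT + sizeof(uint16_t))
        return 0;
    const unsigned char *ident = image;
    if (memcmp(ident, ELFMAG, SELFMAG) != 0)
        return 0;
    /* e_type follows e_ident directly in both Elf32_Ehdr and Elf64_Ehdr */
    uint16_t e_type;
    memcpy(&e_type, ident + EI_NIDENT, sizeof e_type);
    return e_type == ET_DYN;
}

/* Map the whole file read-only, so no code from it runs, and inspect the header. */
int fd_looks_like_shared_object(int fd) {
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0)
        return 0;
    size_t len = (size_t)st.st_size;
    void *image = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (image == MAP_FAILED)
        return 0;
    int ok = looks_like_shared_object(image, len);
    munmap(image, len);
    return ok;
}
```

A real loader would go on to walk the program headers, map each PT_LOAD segment with the right protections, and apply the dynamic relocations, which is the hard part.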

Perhaps some JIT code generation technique might be useful (you would JIT-generate code from some validated data), e.g. with libgccjit, LLVM, libjit, or asmjit (or even LuaJIT or SBCL).

And if you have two file descriptors to the same shared object you probably won't have any race conditions.

An option is to build your ad-hoc static C or C++ source code analyzer (perhaps using some GCC plugin provided by you). That plugin might (with months, or perhaps years, of development efforts) check some property of the user C++ code. Beware of Rice's theorem (limiting the properties of every static source code analyzer). Then your program might (like my manydl.c does, or like RefPerSys will soon do, in mid 2020, or like the obsolete GCC MELT did a few years ago) take as input the user C++ code, run some static analysis on that C++ code (e.g. using your GCC plugin), compile that C++ code into a temporary shared object, and dlopen that shared object. So read Drepper's paper How to Write Shared Libraries.

Basile Starynkevitch
  • Hi Basile. "You have to trust the dlopen-ed plugin somehow." Actually we don't :) Trust is what we use when we don't have effective security controls in place. In this case, I have at least two controls: (1) a signature check; or (2) a self-authenticating URL. – jww Apr 24 '13 at 07:43
  • "RTLD_DATA does not seems to exist." Yes, correct (it's the one I would like to use for simplicity if it existed). – jww Apr 24 '13 at 07:46
  • "... you probably could dlopen some /proc/self/fd/15 file." That's clever. I'll try it when I get a chance. – jww Apr 24 '13 at 07:47
  • @BasileStarynkevitch: `Alternatively, don't use dlopen and just mmap your file` This would require parsing the ᴇʟꜰ header, wouldn't it? *(whereas `dlopen()` does that all by itself)* – user2284570 Aug 05 '16 at 22:36
  • "This would require parsing ᴇʟꜰ header" - you'll still need to `dlopen` it afterwards, after verification (trust me, you don't want to handle runtime relocation processing yourself) which introduces a race condition. – yugr Jan 08 '20 at 12:09
  • No, you can call `mmap` directly, exactly like `dlopen` does; of course you will need to handle ELF relocation. But you could find existing open source code doing that. – Basile Starynkevitch Jan 08 '20 at 12:12
  • Has anyone tested the "/proc/self/fd/15" solution? Does it really work? I'm looking for the Linux version of https://stackoverflow.com/questions/5053664/dlopen-from-memory/42196828#42196828 – Dmitry Sychov Feb 01 '20 at 19:11
  • @DmitrySychov: Please explain (perhaps in a new question) why you need such a solution and what is the actual use case. – Basile Starynkevitch Feb 01 '20 at 20:07
  • @BasileStarynkevitch The link (https://stackoverflow.com/questions/5053664/dlopen-from-memory/42196828#42196828) basically explains it all. It's very useful when someone does not want to leave traces of the .so on disk (for security reasons) but wants to load and initialize the .so image purely from memory instead. Please see the link for the FreeBSD solution. – Dmitry Sychov Feb 02 '20 at 20:00
  • But then, why use a `.so` file? The correct approach is to generate code in memory. Look into [asmjit](https://asmjit.com/) or similar libraries. **Don't use files** – Basile Starynkevitch Feb 02 '20 at 20:02
  • It's just complex. The FreeBSD solution is slick and simple. Why should I generate code in memory when I can just map its compiled binary representation into a shared block and load it like a file? – Dmitry Sychov Feb 02 '20 at 20:18
  • Generating code is hard, whatever way you do it. Even generating temporary C files and compiling them (I love doing that) is not trivial – Basile Starynkevitch Feb 02 '20 at 20:21