14

Code compiled into an object file needs to be position-independent if the object file is intended to be linked into a shared library (.so), because the base virtual address at which the shared object is loaded may differ from process to process.

Now, I didn't encounter errors when I loaded an .so file that was compiled and linked without GCC's -fpic option on 32-bit x86 machines, while the same thing fails on 64-bit x86 machines.

Random websites I found say that I don't need -fpic on 32-bit x86 because, under that ABI, code compiled without -fpic coincidentally also works when used in a position-independent manner. Yet I still find software that ships separate versions of its libraries in the 32-bit packages: one for PIC and one for non-PIC. For example, the Intel compiler ships libirc.a and libirc_pic.a, the latter compiled for position-independent mode (for when one wants to link that .a file into an .so file).

I wonder what the precise difference is between using -fpic and not using it for 32-bit code, and why some packages, such as the Intel compiler, still ship separate versions of their libraries.
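
To make the experiment concrete, here is roughly the kind of thing I built (a minimal sketch; the file, symbol, and library names such as mylib.c, bump, and libbump_*.so are made up, and the commands assume GCC and binutils on Linux):

```c
/* mylib.c -- a hypothetical one-file "library" that reproduces the behaviour.
 * The global variable forces the generated 32-bit code to embed an absolute
 * address when compiled without -fpic. */
int counter = 0;

int bump(void)
{
    return ++counter;
}

/* Build both variants and inspect them (on distributions where GCC defaults
 * to PIE, add -fno-pic -fno-pie to see the non-PIC behaviour; -m32 needs the
 * 32-bit multilib installed):
 *
 *   gcc -m32 -shared       -o libbump_nopic.so mylib.c   # loads on 32-bit, but with text relocations
 *   gcc -m32 -shared -fpic -o libbump_pic.so   mylib.c   # proper PIC
 *   gcc      -shared       -o libbump64.so     mylib.c   # on x86_64 this typically fails to link:
 *                                                        # "relocation ... can not be used when making
 *                                                        #  a shared object; recompile with -fPIC"
 *
 *   readelf -d libbump_nopic.so | grep TEXTREL   # present: the loader must patch .text
 *   readelf -d libbump_pic.so   | grep TEXTREL   # absent
 */
```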

Johannes Schaub - litb
  • Did you try compiling code that uses TLS without -fpic? Or loading several non-PIC libraries with overlapping memory ranges? The separate static libs exist so that one can be linked into programs statically (libirc.a; non-PIC is a bit faster) and the other into .so libraries (the _pic.a version). – osgx Aug 05 '11 at 19:55
  • See http://stackoverflow.com/questions/3146744/difference-in-position-independent-code-x86-vs-x86-64 – AProgrammer Aug 05 '11 at 21:07

2 Answers

10

It's not that non-PIC code works "by coincidence" on 32-bit x86. It's that the dynamic linker for x86 supports the "text relocations" (textrels) needed to make it work. This comes at a very high cost in memory consumption and startup time, since essentially the entire code segment must be patched at load time (and thus becomes non-shareable memory).

The dynamic linker maintainers claim that non-PIC shared libraries can't be supported on x86_64 because of fundamental issues in the architecture (immediate address displacements can't be larger than 32 bits), but this issue could easily be solved by always loading libraries in the first 4 GB of virtual address space. Of course, PIC code is very inexpensive on x86_64 (PIC isn't the performance-killer there that it is on 32-bit x86), so they're probably right to keep it unsupported and prevent fools from making non-PIC libraries...
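
If you want to watch the loader behaviour described above, a few lines of C are enough (a sketch; it reuses the hypothetical libbump_nopic.so from the question's example and assumes a glibc-style dlopen):

```c
/* trytextrel.c -- ask the dynamic linker to map a given .so and report the result.
 * On 32-bit x86, a non-PIC library loads anyway (the loader patches the text
 * segment); on x86_64 you normally never get this far, because linking the
 * non-PIC objects into a .so already fails.
 * Build: gcc -o trytextrel trytextrel.c -ldl */
#include <dlfcn.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "./libbump_nopic.so"; /* hypothetical name */
    void *handle = dlopen(path, RTLD_NOW);

    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    puts("loaded fine (any text relocations were patched at load time)");
    dlclose(handle);
    return 0;
}
```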

R.. GitHub STOP HELPING ICE
  • Thanks, that makes sense. If I compile a PIC lib on x86, it does not use textrels anymore, I suspect? Why is it still a performance-killer then? – Johannes Schaub - litb Aug 05 '11 at 20:12
  • 6
    PIC code requires an extra register to hold the base address. On x86 you already have too few registers, so reserving one more makes the compiler spill registers to memory even more often. – Employed Russian Aug 05 '11 at 20:17
  • 2
    PIC is a performance-killer on x86 because loading the GOT register is expensive. It requires a function call and reading/saving the return address off the stack. x86_64 has `rip`-relative addressing, so no GOT register is needed (see the sketch after this comment thread). – R.. GitHub STOP HELPING ICE Aug 05 '11 at 20:18
  • Also what Employed Russian said. – R.. GitHub STOP HELPING ICE Aug 05 '11 at 20:19
  • Ah I see now. I think for large libraries that don't need to run particularly fast, I better take the `_pic.a` version (because that won't require relocating `.text`). For small libs that need to run fast I will take the non-pic version and let it do the textrel. – Johannes Schaub - litb Aug 05 '11 at 20:42
  • You can have a shared non pic library for x86_64 with -mcmodel=large (and a recent enough gcc) – AProgrammer Aug 05 '11 at 21:08
  • @R.. Interesting explanation. From my understanding, non-PIC libraries consume memory and processing time, but only during startup. Afterwards, they run faster (because one machine register is freed). Therefore, if startup overhead is not an issue, is it better to use non-PIC libraries on x86 (32-bit) machines? – alecov Aug 05 '11 at 21:19
  • 1
    @Alek: Still non-PIC libraries will consume a lot more memory, and they'll be faster than PIC but not as fast as static linking because function calls will go through the PLT. For libraries where performance matters that much (like `libavcodec` or `libx264`) I would just avoid generating `.so` libraries so the library always ends up static-linked into the main program binary. – R.. GitHub STOP HELPING ICE Aug 05 '11 at 23:07
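
A sketch of the per-access cost discussed in the comments above (got_cost.c, shared_global, and read_it are made-up names; the assembly in the comments is what current GCC typically emits, and details vary with compiler version and options):

```c
/* got_cost.c -- shows the per-access overhead of 32-bit PIC described in the
 * comments. Compare the generated assembly (add -fno-pie on PIE-default toolchains):
 *
 *   gcc -m32 -O2 -S        got_cost.c -o nopic.s
 *   gcc -m32 -O2 -S -fpic  got_cost.c -o pic.s
 */
extern int shared_global;   /* some exported variable defined elsewhere in the library */

int read_it(void)
{
    return shared_global;
    /* nopic.s (typical):  movl  shared_global, %eax      ; absolute address -> textrel in a .so
     *
     * pic.s (typical):    call  __x86.get_pc_thunk.ax    ; fetch EIP into a register
     *                     addl  $_GLOBAL_OFFSET_TABLE_, %eax
     *                     movl  shared_global@GOT(%eax), %eax
     *                     movl  (%eax), %eax
     *
     * The extra thunk call plus the register tied up as the GOT pointer is the
     * 32-bit cost; on x86_64 the same PIC access is a single RIP-relative load
     * through the GOT, so no register has to be reserved. */
}
```
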
0

> the base virtual address at which the shared object is loaded may differ from process to process

Because shared objects usually load at their preferred address, they may appear to work correctly even without -fPIC. But -fPIC is a good idea for all shared code.

I believe the reason there often aren't two versions of the library is that many distributions use -fPIC as the default for all code.
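
You can check where a given library actually ends up at run time and compare it with its link-time base (a sketch assuming glibc, whose dlinfo exposes the link map; whereloaded.c and the libm.so.6 default are just examples):

```c
/* whereloaded.c -- print the run-time base address the dynamic linker chose
 * for a shared object, to compare with its link-time base
 * (readelf -l <lib> | grep LOAD; for most ELF .so files the first vaddr is 0).
 * Build: gcc -o whereloaded whereloaded.c -ldl */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "libm.so.6";   /* example library */
    void *handle = dlopen(path, RTLD_NOW);
    struct link_map *lm;

    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    if (dlinfo(handle, RTLD_DI_LINKMAP, &lm) == 0)
        printf("%s relocated to base 0x%lx\n", lm->l_name, (unsigned long)lm->l_addr);

    dlclose(handle);
    return 0;
}
```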

Ben Voigt
  • 3
    Shared objects are usually linked to load at address 0, and they most definitely do not load there. – Employed Russian Aug 05 '11 at 20:17
  • What @Employed Russian said. Also, you may `prelink` libraries to make them have a preferred base address, but it is a separate (and optional) step from normal linking. – ninjalj Aug 05 '11 at 20:49
  • 1
    @Employed Russian: Is that a linux-only thing? Other OSes aren't so stupid as to default to 100% base address conflicts. – Ben Voigt Aug 06 '11 at 01:44
  • @Ben how is this an OS thing at all? It seems impossible to coordinate all the authors of the libraries a random program uses so that they pick non-overlapping address ranges. How do the authors know about each other's libraries? – Johannes Schaub - litb Aug 06 '11 at 09:30
  • @Johannes: Keyword "usually". The address space is quite large compared to the code segment sizes, so conflicts are rare. It's basically a birthday problem, but when you discount JIT compilation (which can dynamically pick an unused address) and libraries provided by the OS vendor (which can be carefully packed not to conflict), few programs load more than half-a-dozen third-party libraries. – Ben Voigt Aug 06 '11 at 15:17
  • Every UNIX linker I know of (except the HP-UX/ia64 one) links shared libraries to load at 0 by default, and very few application programmers ever change the default. The Microsoft linker defaults to 0x10000000. You can modify this with a post-link step (prelink on Linux, REBASE on Win32). REBASE is used much more commonly than prelink, because UNIX loaders do function resolution lazily, and few UNIX apps use lots of shared libraries. – Employed Russian Aug 07 '11 at 00:28
  • @Employed: And right you are about 0x10000000. Don't know why I thought that Visual C++ chose a random address and passed it to the linker. Maybe some earlier version did that, but no longer. Still, the majority of DLLs have non-default base addresses (run Dependency Walker on any application on your system to see). – Ben Voigt Aug 07 '11 at 01:23
  • Anyway, with Vista ASLR, rebasing only needs to be done once each time the computer is booted (and a random address is used, which reduces collisions). – Ben Voigt Aug 07 '11 at 01:36