The usual way to generate position-independent code (PIC) on x86 is to obtain a pointer to the GOT and then add the offset of the desired entry. For global non-static variables, accessing the data generally requires an extra indirection, because the GOT stores a pointer to the data rather than the data itself.
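To make the scheme concrete, here is a minimal sketch in C. The file name and the variable and function names are made up for illustration, and the assembly in the comments is only what I would typically expect from an unoptimized `gcc -m32 -fPIC` build; the exact registers, thunk name, and relocation types can vary with the compiler and binutils version.

```c
/* got_access.c: hypothetical file name, used only for this illustration. */

int shared_counter = 42;         /* global, non-static: reached through a GOT slot */
static int private_counter = 7;  /* static: reached as an offset from the GOT base */

/*
 * Typical i386 PIC sequence (an assumption, not verified for every toolchain):
 *
 *   call  __x86.get_pc_thunk.ax            # eax = address of the next instruction
 *   addl  $_GLOBAL_OFFSET_TABLE_, %eax     # R_386_GOTPC: eax = GOT base
 *   movl  shared_counter@GOT(%eax), %eax   # R_386_GOT32: load the pointer from the GOT
 *   movl  (%eax), %eax                     # extra indirection: load the value itself
 */
int read_shared(void)
{
    return shared_counter;
}

/*
 * For the static variable the GOT base is still computed, but no slot is read:
 *
 *   call  __x86.get_pc_thunk.ax
 *   addl  $_GLOBAL_OFFSET_TABLE_, %eax         # R_386_GOTPC
 *   movl  private_counter@GOTOFF(%eax), %eax   # R_386_GOTOFF: direct load
 */
int read_private(void)
{
    return private_counter;
}
```

Compiling this with `gcc -m32 -fPIC -O0 -S got_access.c` and reading the generated `.s` file should show whether your toolchain emits these sequences.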
My question is: what is the real reason for using this scheme? It requires two relocation entries for each data access (assuming the GOT pointer is not yet available): one to get the GOT pointer relative to the current instruction position, and another to get the offset of the desired symbol within the GOT. In ELF binaries, the data segment comes immediately after the text segment when the OS loads the binary into the process address space, so I wonder why the compiler doesn't emit a single relocation entry that refers directly to the data and let the linker resolve it.
For example, the compiler would still be responsible for acquiring the PC value of the current instruction (via the well-known x86 call/pop trick), but instead of asking for a GOT pointer relocation, it would ask for the data directly. The linker should be able to satisfy that, since the data always comes after the code, so the relative address of the data is already known at link time. What am I missing, and why is the GOT necessary in this scenario?
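The premise behind this idea can be sketched in C: within one module, the distance between a piece of code and a static variable should not depend on where the module is loaded. All names here are hypothetical, and converting a function pointer to `uintptr_t` is implementation-defined, so treat this as an illustration that assumes GCC or Clang on a typical ELF platform. The assembly in the comment is only my guess at what the proposed single-relocation access might look like.

```c
#include <stdint.h>

static int private_value = 123;  /* lives in the data segment of this module */

/*
 * Returns the signed distance, in bytes, from this function's code address
 * to private_value. The absolute addresses may change from run to run when
 * the module is loaded at a different base, but the difference is expected
 * to stay the same, because text and data are mapped at a fixed distance
 * from each other.
 *
 * Hypothetical access under the proposed scheme (not actual compiler output):
 *
 *   call  1f
 * 1: popl  %eax                            # eax = address of label 1
 *   movl  private_value-1b(%eax), %eax    # one PC-relative relocation
 */
intptr_t code_to_data_distance(void)
{
    uintptr_t code_addr = (uintptr_t)&code_to_data_distance;
    uintptr_t data_addr = (uintptr_t)&private_value;
    return (intptr_t)(data_addr - code_addr);
}

/* Reads the variable back, so the example has an observable result. */
int read_private_value(void)
{
    return private_value;
}
```

Printing `code_to_data_distance()` from a position-independent executable on several runs should give the same number each time, even if the addresses themselves move.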