void foo(){
    ...
}

Compiling this to assembly, it seems that GCC on Linux creates the label foo as the entry point, but the label _foo on OS X.

We can, of course, do an OS-specific selection whenever we need a label, but this is cumbersome.

Is there any way to suppress this so that the labels on both systems are the same (preferably one that is also Windows-compatible)?

Keith Thompson
User1291
  • `-f(no-)leading-underscore` but it doesn't work on all targets and produces code that does not conform to the platform ABI. – Marc Glisse May 20 '16 at 08:30
  • **Why do you ask?**; for portable C or C++ code, it should not matter at all. The assembler name is an implementation detail. – Basile Starynkevitch May 20 '16 at 08:31
  • Instead of OS-specific labels, you could have an OS-specific preprocessor that modifies labels to the type you need on each OS? – Tommylee2k May 20 '16 at 10:28

2 Answers


No. It's part of the name mangling specifications of the platform.

You can't change that. You're still writing assembly. Don't expect it to be portable in any way; that's what C was invented for.

Leandros
  • Perhaps [ABI](https://en.wikipedia.org/wiki/Application_binary_interface) is more correct than [name mangling](https://en.wikipedia.org/wiki/Name_mangling) which for GCC is relevant for C++ mostly (not for C code). – Basile Starynkevitch May 20 '16 at 08:30
  • @BasileStarynkevitch It's part of the ABI, that's correct. It doesn't change the fact that it is name mangling. – Leandros May 20 '16 at 08:33
  • No really, [name mangling](http://en.wikipedia.org/wiki/Name_mangling) is defined as *encoding additional information* in the name. Here nothing additional is encoded. All C identifiers have an underscore prepended at link-time. But I am nitpicking, and I did upvote your answer. – Basile Starynkevitch May 20 '16 at 08:39
  • Fun fact: here the problem is *actually* C (or better C compilers) rather than assembly, despite your blaming on the latter. – Margaret Bloom May 20 '16 at 19:37

The early C compilers prefixed function names with an underscore (_) to avoid name clashes when linking against the large assembly libraries that already existed at the time.
Credit for this information goes to this excellent old answer.

Today this is no longer needed, but the convention persists, mostly for backward compatibility, even though some systems are getting rid of it.

This is not an OS issue. OSes are completely orthogonal to programming languages, and name decoration is not something defined by the OS ABI; it is a decision of the compiler/linker designers, though standards have been created to reduce the incompatibilities and an ABI may suggest their use.

To fully understand how you can mitigate your problem, it is worth noting that while the OS APIs are language agnostic, a C program rarely invokes them directly; more likely it uses the C run-time.
The C run-time is usually statically linked and it expects names to be decorated according to the scheme of the compiler used to create it.
So if you need to use the C run-time you have to stick with the same name decoration as your system components are using.


This last point rules out the -fno-leading-underscore option as it will generate a linker error on the relevant platforms.

It is better to work on the assembly files, since there you have the freedom to define and import names exactly as typed. Furthermore, the amount of assembly code is usually limited.

If you are using NASM¹ there is a nice trick you can use: it's called macro indirection, and it allows you to prepend a symbol, defined on the command line, to a name. Consider:

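BITS 32
; %[p] expands the single-line macro p before it is joined to "data"
mov eax, %[p]data

; both symbols are defined, so either expansion assembles
_data db 0
data db 0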

If you assemble this file twice, the first time as nasm -Dp=_ ... and the second as nasm -Dp= ..., you can inspect the immediate value in the opcode generated for mov eax, %[p]data. In the first case the instruction has been translated as mov eax, _data, and in the second as mov eax, data.

Assuming you access external symbols by declaring them as EXTERN symn (the precise syntax is irrelevant here), you can define a macro PEXTERN that works like the EXTERN directive but imports the symbol with or without a leading underscore, based on the value of the macro p (you can change this name). It also defines an alias for the symbol, so that the name you use in the source is the same in both cases.

BITS 32

%macro PEXTERN 1
 EXTERN %[p]%1
 %ifnidn %1, %[p]%1
    %define %1 %[p]%1
 %endif
%endmacro


PEXTERN foo
PEXTERN bar

mov eax, foo
call bar

Running nasm -Dp= -e ... and nasm -Dp=_ -e ... produces the listings

extern foo        extern _foo
extern bar        extern _bar

mov eax, foo      mov eax, _foo
call bar          call _bar

You'll need to update the build scripts/Makefiles. Off the top of my head, you can use two methods:

  1. Detect the OS type and properly define the symbol p.
    With Makefiles this may be easier.

  2. Try compiling a test program.
    Write a minimal C program that imports/exports a function and a minimal assembly file that exports/imports that function.
    Define the symbol as _ and try to assemble + compile (redirecting everything into /dev/null).
    If it fails, redefine the symbol as empty.
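
A related way to detect the prefix, not part of the original answer, is to ask the C compiler itself. The sketch below assumes GCC or Clang, which predefine the macro __USER_LABEL_PREFIX__ as the prefix applied to C symbols on the target (empty on Linux, _ on OS X). The helper names label_prefix and nasm_prefix_define are hypothetical, and the fallback for other compilers is only a guess.

```c
/* Sketch: report the symbol prefix the C compiler uses on this target.
 * Assumption: GCC or Clang, which predefine __USER_LABEL_PREFIX__ as a
 * bare token (not a string), e.g. nothing on Linux or _ on OS X. */
#ifndef __USER_LABEL_PREFIX__
#define __USER_LABEL_PREFIX__ /* assumption: unknown compiler, treat as no prefix */
#endif

/* Two-level stringification so the macro is expanded before being quoted. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Hypothetical helper: the prefix as a string, "" or "_". */
const char *label_prefix(void)
{
    return STR(__USER_LABEL_PREFIX__);
}

/* Hypothetical helper: the matching NASM option for the macro p
 * used by PEXTERN, i.e. "-Dp=" or "-Dp=_". */
const char *nasm_prefix_define(void)
{
    return "-Dp=" STR(__USER_LABEL_PREFIX__);
}
```

A build script could compile a tiny program that prints the result of nasm_prefix_define() with the same compiler used for the C sources, then pass that output to NASM. This only reflects the compiler's default, so it still needs the C run-time caveat above to hold.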

Note that besides names, individual OSes may need specific assembler flags, so a universal build script may be more involved, but not necessarily unmanageable. You'll end up needing something like Cygwin for Windows.


¹ If not, check whether you can port the idea to your assembler.

Community
Margaret Bloom