3

I'm trying to figure out how to produce equivalent assembly code from a C source file.

I've been using the C language for several years, but have little experience with assembly language.

I was able to output the assembly code using the -S option in gcc. However, the resulting assembly code contained call instructions that jump to other functions such as _exp. This is not what I wanted: I need fully functional assembly code in a single file, with no dependency on other code.

Is it possible to achieve what I'm looking for?

To better describe the problem, I'm showing you my code here:

#include <math.h>
float sigmoid(float i){
    return 1/(1+exp(-i));
}

The platform I am working on is Windows 10 64-bit; the compiler I'm using is cl.exe from MSBuild.

My initial objective was to see, at the lowest level possible, how computers calculate mathematical functions. The level at which I decided to observe the calculation is assembly code, and the mathematical function I chose is the sigmoid defined above.

mkrieger1
  • 2
    Possibly objdump? – David Wohlferd Jun 02 '18 at 22:27
  • 4
    In general, this is not possible if you call external functions. The code of these functions is not known to the C compiler and it cannot generate assembly for them. What are you trying to achieve? – fuz Jun 02 '18 at 22:33
  • Most likely, the C code contained function calls that are reflected in the assembler. Without seeing the C code, it’s hard to be sure, of course. – Jonathan Leffler Jun 02 '18 at 22:34
  • If you want external functions, you must either undergo linking normally (and thus be limited to disassembly, which retains less information), or use whole-program optimization ... and `libc` implementations are generally not WPOable. – o11c Jun 02 '18 at 22:56
  • Ideally, assembly language has a one-to-one relationship with machine code (for instructions, that is; a fair amount of assembly, especially compiler-generated code, is not instructions but directives, labels, pseudo-instructions, etc.). But as you may have seen or will see, that is not always the case. Assembly language is defined by the assembler (the program) and is not expected to be universal for the target, so different assemblers have different syntax. Then there is the instruction set itself: overloaded mnemonics like mov in x86 map to many different possible machine-code instructions. – old_timer Jun 03 '18 at 03:33
  • And if this is a linked program, there are dozens or hundreds of different files that were assembled and then linked to make your program, depending on how many library calls you make, if any. So, as David mentioned above, you can disassemble. Assuming the disassembly is good (for variable-length instruction sets like x86 it is not expected to be perfect), you get the real machine code, the actual instructions that were chosen. Apart from the risk of a failed or misleading disassembly, the disassembly is the best way there is to see what is going on. – old_timer Jun 03 '18 at 03:35
  • If your program uses a shared library, as a program compiled with C library calls on an operating system typically does unless you specify otherwise, the disassembly will not show you the library, only the way the library is connected to the binary. So you won't get to see the library implementation in machine code/disassembly, just your program. – old_timer Jun 03 '18 at 03:37

2 Answers

1

_exp is the standard math library function double exp(double); apparently you're on a platform that prepends a leading underscore to C symbol names.

Given a .s that calls some library functions, build it the same way you would a .c file that calls library functions:

gcc foo.s -o foo -lm

You'll get a dynamic executable by default.


But if you really want all the code in one file with no external dependencies, you can link your .c into a static executable and disassemble that.

gcc -O3 -march=native foo.c -o foo -static -lm
objdump -drwC -Mintel foo > foo.s

There's no guarantee that the _exp implementation in libm.a (static library) is identical to the one you'd get in libm.so or libm.dll or whatever, because it's a different file. This is especially true for a function like memcpy where dynamic-linker tricks are often used to select an optimal version (for your CPU) at run-time.

Peter Cordes
  • I tried your method; however, the assembly code still contains a call instruction. The platform I'm working on is Cygwin, x86_64. I should have provided the C code I'm using. I'll update my question as soon as I get access to my PC. – Kinnefix Kim Jun 03 '18 at 23:57
  • 1
    Of course it contains `call` instructions between functions. But I guess you mean to an external DLL? You can't make static Windows executables, because there is no stable system-call ABI. The only stable way to use the Win32 / NT APIs is via DLL function calls. – Peter Cordes Jun 04 '18 at 00:03
  • I think I didn't clarify what I'm trying to achieve. I apologize :(. I will add some further explanation of my problem. – Kinnefix Kim Jun 07 '18 at 06:23
  • @KinnefixKim: If you statically linked `libm`, this answer would work. Otherwise you can disassemble the `_exp` implementation in your math library, or single-step into it while running your code inside a debugger. See also [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116). – Peter Cordes Jun 07 '18 at 07:21
  • @KinnefixKim: Or for the exponential function specifically, see [Fastest Implementation of Exponential Function Using SSE](https://stackoverflow.com/a/47025627) for discussion of efficient x86 implementations. Q&As on SO about low-level implementations of `exp` will mostly be about SIMD, because normal math libraries already handle scalar. You can compile C with intrinsics into asm, but each of those `_mm_` functions is an intrinsic for one machine instruction. The real trick for `exp` is to use the exponent/mantissa format of IEEE floats and a polynomial approximation. – Peter Cordes Jun 07 '18 at 07:25
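
As a rough sketch of the trick described in the comment above (the IEEE exponent field plus a polynomial approximation), the following C computes the sigmoid without calling into the math library. The name `exp_approx`, the clamping thresholds, and the degree-6 Taylor polynomial are assumptions made for illustration, not what any real libm does; production implementations use carefully fitted coefficients and handle NaN, infinities, and subnormals, none of which this sketch attempts.

```c
#include <stdint.h>

/* Hypothetical helper (not from any library): approximates e^x using
 * e^x = 2^k * e^r, with k = round(x / ln 2) and r = x - k * ln 2.
 * Assumes IEEE 754 single-precision float; NaN input is not handled. */
float exp_approx(float x)
{
    const float log2e = 1.44269504f;   /* 1 / ln 2 */
    const float ln2   = 0.69314718f;

    /* Illustrative clamps so that 2^k stays a normal float. */
    if (x > 88.0f)
        x = 88.0f;
    if (x < -87.0f)
        return 0.0f;

    /* Round x / ln 2 to the nearest integer without calling libm. */
    int k = (int)(x * log2e + (x >= 0.0f ? 0.5f : -0.5f));
    float r = x - (float)k * ln2;      /* |r| is at most about 0.35 */

    /* Degree-6 Taylor polynomial for e^r, evaluated with Horner's rule. */
    float p = 1.0f / 720.0f;
    p = p * r + 1.0f / 120.0f;
    p = p * r + 1.0f / 24.0f;
    p = p * r + 1.0f / 6.0f;
    p = p * r + 0.5f;
    p = p * r + 1.0f;
    p = p * r + 1.0f;

    /* Build 2^k directly by writing k + 127 into the exponent field. */
    union { uint32_t u; float f; } scale;
    scale.u = (uint32_t)(k + 127) << 23;

    return p * scale.f;
}

/* Same formula as the question's sigmoid, minus the call to exp(). */
float sigmoid(float i)
{
    return 1.0f / (1.0f + exp_approx(-i));
}
```

Because nothing here refers to an external symbol, the compiler's assembly output for this file should contain the whole computation, with at most a call from `sigmoid` to `exp_approx` if the latter is not inlined. The accuracy of this sketch has not been characterised and will be worse than a real math library's.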
0

It is not possible in general. There are exceptions, sure: I could craft one, which means other folks can too, but it wouldn't be an interesting program.

Normally your C program, starting from the main() entry point, is only a fraction of the code. There is a bootstrap that contains the actual entry point the operating system uses to launch your program; it does some things that prepare your virtual memory space so that your program can run, such as zeroing .bss. That bootstrap is often written in assembly language, and arguably should be (otherwise you get a chicken-and-egg problem), but it is not an assembly language file you will see unless you go find the sources for the C library. You will usually get it as an object file that ships with the toolchain, along with other compiler libraries, etc.

Then if you make any C library calls, or write code that results in a compiler library call (performing a divide on a platform that doesn't support divide, doing floating point on a platform that doesn't have floating-point hardware, etc.), that is another object that came from some other C or assembly source that is part of the library or compiler sources. It is not something you will see during the compile/assemble/link process (the chain in toolchain).
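
To make the "compiler library call" case concrete, here is a minimal sketch of the kind of work a runtime helper such as libgcc's `__udivsi3` has to do on a target with no divide instruction. The function name `soft_udiv32` and the plain shift-and-subtract algorithm are assumptions for illustration; this is not a copy of any compiler library's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a compiler runtime division helper.
 * Computes n / d for unsigned 32-bit values using only shifts,
 * comparisons, and subtraction (restoring long division). */
uint32_t soft_udiv32(uint32_t n, uint32_t d)
{
    assert(d != 0);                 /* division by zero is not handled */

    uint32_t quotient = 0;
    uint32_t remainder = 0;

    /* Bring down one bit of the dividend at a time, highest bit first. */
    for (int bit = 31; bit >= 0; --bit) {
        remainder = (remainder << 1) | ((n >> bit) & 1u);
        if (remainder >= d) {
            remainder -= d;
            quotient |= (uint32_t)1 << bit;
        }
    }
    return quotient;
}
```

On such a target, a plain `a / b` in your C source turns into a call to a routine along these lines, and that routine lives in a prebuilt library object, which is why it never shows up in your own -S output.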

So except for specifically crafted trivial programs, or tools specifically crafted for this purpose (likely for specific bare-metal platforms), you will not see your whole program turn into one big assembly source file before it gets assembled and then linked.

If you are not on bare metal, there is of course the operating system layer, which you certainly would not get to see as part of your source code. Ultimately, the C library calls that need the system have a place where they call into it, all compiled to object/library form before you use them, and the assembly sources for the operating system side are part of some other source tree and build process somewhere else.

old_timer
  • I now understand that unless I'm working on a bare-metal platform, it is generally not possible to get complete assembly code that doesn't call another function. Thank you for your answer. – Kinnefix Kim Jun 07 '18 at 06:42