I have an assembly function that is defined in a .asm file. I am trying to call it as an inline function, but when I disassemble the program, every use of it is still a function call. I have done some research and read that for a function to be inlined, its definition has to be present in every translation unit that uses it, but I don't know how to make the compiler/linker do that, or whether it is even possible. If there is a way, how can it be done? I am using some features that inline assembly does not provide, so that is not an option. I am using Visual Studio 2017.

As requested by Jesper Juhl, here is a short section of the file "Random.asm" that I am working with.

    .code
    _Random proc
        rdseed eax
        jnc PseudoRandom        ; In case of failure, go to a pseudorandom number generator
        ret
    _Random endp

  • If it is external, you can’t inline it. If you want to effectively paste it into your program, you can use #include. – Scott Hunter Jun 11 '18 at 19:29
  • What's wrong with a function call? –  Jun 11 '18 at 19:29
  • @NeilButterworth, nothing really. I just like to keep my code cleaner and not have unnecessary call instructions. – FeeeshMeister Jun 11 '18 at 19:31
  • @ScottHunter, so how would I do that? – FeeeshMeister Jun 11 '18 at 19:32
  • Calling a function is *definitely* "cleaner" than any other solution to inline the function into your code. And vastly more readable, understandable, and maintainable (which in 99.9999% of all cases is really more important). And if the call is "unnecessary" you don't really know until you benchmark it. – Some programmer dude Jun 11 '18 at 19:34
  • Why use inline asm in the first place? I mean, *sure* there are situations where that's your only option, but those are few and far between. Can't you just write your code as normal C++ and let the compiler optimize/inline it as it sees fit? – Jesper Juhl Jun 11 '18 at 19:36
  • @JesperJuhl, I'm experimenting with RDRAND and RDSEED, and my compiler does not support those intrinsics. – FeeeshMeister Jun 11 '18 at 19:40
  • @FeeeshMeister OK. That's a fair reason. Maybe you should consider adding a [mcve] to your question :) – Jesper Juhl Jun 11 '18 at 19:43
  • @FeeeshMeister try including `#include ` for `_rdrand64_step` and `_rdseed64_step` that should definitely be supported by VS2017. In the future I'd suggest checking the [intrinsics guide](https://software.intel.com/sites/landingpage/IntrinsicsGuide/) – Mgetz Jun 11 '18 at 19:43
  • @JesperJuhl, okay. I added it. – FeeeshMeister Jun 11 '18 at 19:49
  • @Mgetz, I know I don't _need_ it, but there are certain circumstances in which I prefer assembly over C++. – FeeeshMeister Jun 11 '18 at 19:55
  • @FeeeshMeister FWIW the reason MSVC doesn't support inline ASM in 64-bit is that it's impossible to do proper optimization on it because it can do all sorts of things that break runtime assumptions. In general intrinsics will outperform ASM because the compiler can better optimize and schedule instructions. – Mgetz Jun 11 '18 at 19:59
  • @Mgetz, I'm actually using x86 assembly, but the reason I am using external assembly is that I need control over segments, which inline assembly does not provide. – FeeeshMeister Jun 11 '18 at 20:01
  • @FeeeshMeister I think you'll find [It does](https://learn.microsoft.com/en-us/cpp/assembler/masm/dot-data) – Mgetz Jun 11 '18 at 20:02
  • @Mgetz, maybe I'm misunderstanding what a segment actually is, because I can't find any documentation that shows anything remotely similar to what I am trying to do. – FeeeshMeister Jun 11 '18 at 20:05
  • Have you tried writing your PRNG in C++ and comparing the performance with your hand-made asm version? I wouldn't be surprised to see the C++ version being faster, even in MSVC. It takes *a lot* of skill to schedule x64 instructions by hand. Oh, and you most certainly don't need to control any segments ;) – rustyx Jun 11 '18 at 20:43
  • @rustyx: *scheduling* instructions is not very important on modern x86 CPUs; aggressive out-of-order execution can find whatever instruction-level parallelism exists, unlike an in-order ARM or a P5-pentium. Intel since Sandybridge has a uop cache, so decode bottlenecks from instruction ordering are often not an issue. The trick is choosing the *right* instructions, and getting the job done with fewer total uops and keeping any loop-carried dependency chains as short as possible. And making effective use of SSE2 (or AVX2 or whatever if available)... That stuff is still hard :) – Peter Cordes Jun 11 '18 at 21:01
  • @Mgetz: MSVC dropped inline asm because [*their* implementation sucked](https://stackoverflow.com/questions/3323445/what-is-the-difference-between-asm-asm-and-asm#comment59576185_35959859) (e.g. apparently wasn't safe in functions with register args, for no real reason), and the syntax doesn't let you describe the inputs / outputs / clobbers to the compiler. GNU C inline asm works fine on all architectures, and can be safely optimized (if you get the constraints exactly correct), but it still defeats some optimizations like constant propagation. https://gcc.gnu.org/wiki/DontUseInlineAsm – Peter Cordes Jun 11 '18 at 21:05
  • @FeeeshMeister What are you trying to do with segments? Note that in 64 bit mode, segmentation is mostly unavailable. And even the thought of using nontrivial segmentation in 32 bit mode seems weird to me. – fuz Jun 12 '18 at 00:06
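
The intrinsic route suggested in the comments can be sketched roughly as below. This is a minimal sketch under several assumptions, not the asker's code: the header is taken to be `<immintrin.h>`, the 32-bit variant `_rdseed32_step` is used because the asker targets x86, and `pseudo_random`, `random_u32` and `HAVE_RDSEED_INTRINSIC` are hypothetical names introduced here (the first stands in for the `PseudoRandom` routine, which is not shown in the question). Because both functions are defined `inline` in header-style code, their definitions are visible in every translation unit that includes them, which is the precondition for inlining mentioned in the question.

```cpp
// random_inline.h (hypothetical header name)
#include <cstdint>
#include <random>

// Use the intrinsic only where the compiler is expected to expose it:
// MSVC targeting x86/x64 (assumed to declare it in <immintrin.h>), or
// GCC/Clang when built with -mrdseed (which defines __RDSEED__).
#if (defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))) || defined(__RDSEED__)
#include <immintrin.h>
#define HAVE_RDSEED_INTRINSIC 1
#else
#define HAVE_RDSEED_INTRINSIC 0
#endif

// Hypothetical stand-in for the PseudoRandom fallback in Random.asm.
inline std::uint32_t pseudo_random() {
    static std::mt19937 engine{std::random_device{}()};
    return static_cast<std::uint32_t>(engine());
}

// Mirrors the shape of _Random: try RDSEED once, fall back on failure.
// Like the original snippet, this does not check CPUID for RDSEED support.
inline std::uint32_t random_u32() {
#if HAVE_RDSEED_INTRINSIC
    unsigned int value = 0;
    if (_rdseed32_step(&value)) {   // returns 1 when a value was delivered
        return value;
    }
#endif
    return pseudo_random();
}
```

A caller would include the header and write `std::uint32_t r = random_u32();`. Whether the call is actually expanded in place remains the optimizer's decision, so checking the disassembly of an optimized build is still the way to confirm it.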
