I'm programming in a "high level language" (Nim), and I have to "go down to C" for performance reasons. I would like to do something like this:

/* Pseudocode */
#include <VersionHelpers.h>
/* ...*/
if (IsWindows8OrGreater()) {
    /** use InterlockedIncrementNoFence64() */
} else {
    /** use InterlockedIncrement64() ;( */
}

But I was told it would crash on Windows 7, because if I reference InterlockedIncrementNoFence64(), it must be available, even if I don't try to call it.

I'm writing a multi-threaded app, and communication is implemented with "messages" (even within the same thread). This is going to be called for every message, so a memory fence would have a significant performance impact.

I'm used to Java, where things like that work, and I don't know how this is done in C.

Sebastien Diot
  • Do the relevant check(s) at compile time using preprocessor macros, not at run time using functions. – alk Dec 28 '17 at 17:10
  • So, you mean the solution is to *build different versions* based on the target OS? If that is the right way to do this, why not post it as answer? – Sebastien Diot Dec 28 '17 at 17:19
  • Use the [Linker Support for Delay-Loaded DLLs](https://msdn.microsoft.com/en-us/library/151kt790.aspx). No need to build different versions, and failure to resolve an import at link-time is reported for you to handle. Always be sceptical, when a Linux developer suggests solutions about linking in Windows. They are generally wrong (as in this case). – IInspectable Dec 28 '17 at 17:27
  • @IInspectable: Although you propose a more general approach which generally is a good thing, I do not like the word "*generally*" in your comment, as it's always a matter of the use case, which as per the question is not quite clear. – alk Dec 28 '17 at 18:39
  • @IInspectable While delay-load is useful in general, it doesn't work in this case because you cannot delay-load kernel32. But it's probably also irrelevant in this case because the implementation is provided by [a compiler intrinsic](https://msdn.microsoft.com/en-us/library/windows/desktop/2ddez55b(v=vs.85).aspx). There is no operating system function to import. – Raymond Chen Dec 28 '17 at 18:51
  • macOS has “weak linking”, in which a symbol will be set to NULL if it is not defined when the program is linked. Then the code can use `if (FunctionFoo != NULL) FunctionFoo(parameters…) else AlternativeCode;`. [This answer](https://stackoverflow.com/a/11529277/298225) shows an undocumented way to achieve a similar result in Visual Studio. Can a Windows programmer tell us if there is a supported way? Or perhaps the [accepted answer](https://stackoverflow.com/a/2290838/298225) suffices? – Eric Postpischil Dec 28 '17 at 19:29
  • In general you can use a variable that holds a pointer to the function, and resolve this pointer yourself at run time. `alternatename` or `__declspec(selectany)` will not work here, because the symbol is resolved at link time, not at run time, which is what you need. However, `InterlockedIncrementNoFence64` is platform dependent: for x86/x64 it is simply a macro, `#define InterlockedIncrementNoFence64 InterlockedIncrement64` – RbMm Dec 28 '17 at 20:04
  • So on x86/x64 this does not depend on the Windows version at all; there are no imported calls. – RbMm Dec 28 '17 at 20:05
  • @RaymondChen - *you cannot delay-load kernel32* only because *link.exe* compares the delay-load DLL name against some hard-coded, well-known names (in the function `bool FInvalidDelayLoadDll(const IMAGE *, PCWSTR);`). If we patch *link.exe*, we can delay-load any DLL, including kernel32, ntdll, kernelbase, etc. In some cases this can be useful, but of course not in this specific case, where only 1-2 functions need to be resolved at run time, especially if they are not really imported functions at all but macros that resolve to a compiler intrinsic, as you note. – RbMm Dec 28 '17 at 20:37
  • @EricPostpischil: The supported way to accomplish run-time dynamic linking is to use [run-time dynamic linking](https://msdn.microsoft.com/en-us/library/windows/desktop/ms685090.aspx). Although it doesn't matter in this case, as there are no imports. The call is implemented as a compiler intrinsic, as pointed out by [Raymond Chen](https://stackoverflow.com/questions/48011474/how-do-i-call-a-c-function-which-is-only-available-in-specifc-os-versions?noredirect=1#comment82994181_48011474). – IInspectable Dec 28 '17 at 20:58
  • Making unsupported patches to your toolchain is not generally considered a good engineering practice. It doesn't scale, it isn't portable, and someday you will lose a hard drive or reinstall/upgrade Visual Studio (or port to a new architecture) and lose your patch. The linker is trying to save you from yourself: `GetProcAddress` is in kernel32. So how could you delay-load `GetProcAddress`? – Raymond Chen Dec 28 '17 at 21:55
  • @RaymondChen - there is also `LdrGetProcedureAddress` in `ntdll.dll`. *Unsupported* - so what? Of course, when we download a new *link.exe*, we need to patch the new version again. However, in some extreme cases delay-loading kernel32 can be useful (say, a bootstrap DLL that is injected into a process from a driver at a very early stage (on the kernel32 load notification) and can statically import only from ntdll.dll). But all this is general talk. In this specific case we really do have a compiler intrinsic, and even if there were 1-2 imported APIs, explicitly defining function pointers would be much more effective here. – RbMm Dec 28 '17 at 22:19
  • I think we should avoid recommending unsupported techniques. (Patching each version of link.exe is not sustainable. Or do you have a patch for the arm64 linker too?) – Raymond Chen Dec 28 '17 at 22:36
  • Is patching the linker even legal? – IInspectable Dec 28 '17 at 22:45
  • @RaymondChen - of course I patch the amd64 linker too (I never build for an arm64 target). And this is not a recommendation but a general note: it is possible to implement delay load for any DLL. For this specific question, `InterlockedIncrementNoFence64` is not an imported function. And when we have only several imported functions, it is easier to declare them as function pointers or as explicit `__imp_` symbols in assembler. – RbMm Dec 28 '17 at 22:50
  • However, to build x64 code MSVS calls not the x64 version of link.exe but the x86 link.exe from the `x86_amd64` subfolder. To build for an arm target it likewise calls an x86 link.exe from the `x86_arm` subfolder, or maybe from `amd64_arm`. Either way, what needs patching here is not the target PE but the linker itself, which is an x86 or amd64 binary (on desktop). – RbMm Dec 28 '17 at 23:04
  • @IInspectable - maybe debugging and researching it are also *not legal*? – RbMm Dec 28 '17 at 23:05
  • @RbMm It seems that your answer is really an answer that works just for you and not for others, since it's tightly coupled to your expertise (patching the linker), your development environment (you don't build for anything other than x86 and x64), and your customer base (who doesn't care that the products they are using rely on unsupported techniques). The answer is not useful in general, because most people do not have that same combination of skills, targets, and customers. – Raymond Chen Dec 28 '17 at 23:59
  • @RaymondChen - but what I said about the delay-load trick is not an answer here at all. It is only a general note, off topic. – RbMm Dec 29 '17 at 00:04
  • @RbMm You can very much "delay-load" (intentionally in quotes, it's been loaded anyway already) `kernel32.dll` and those others that are conventionally forbidden. Simply create an import lib for a fake library name, say `kernel32.dld`, then override `__pfnDliNotifyHook2` to look for that name during `dliNotePreLoadLibrary` and map your fake names to real names somehow ... return the correct `HMODULE`, watch out for failures when doing function lookups on downlevel OS versions and you're golden. No voodoo, no changes to the toolchain, just the documented methods used in a creative fashion – 0xC0000022L Jan 10 '23 at 23:29
  • @0xC0000022L possible. Or it is possible to use *windowsapp.lib* (*mmos*, *mincore*). But delay load is not designed to solve the problem of an unavailable API; it is an optimization in the first place. – RbMm Jan 11 '23 at 06:59
  • @RbMm it's not designed for it, agreed. However, I think it's much more elegant and feels more natural to the developer than using an approach such as `LoadLibrary` followed by `GetProcAddress` sprinkled throughout your code. To the developer delay-loading (outside of customizing the helper callbacks) _feels_ as if the functions had been statically imported. – 0xC0000022L Jan 11 '23 at 10:49
  • @0xC0000022L but how do we handle the case when the API is not available, given that we have already called it? At a minimum, we need to implement a stub for every API that returns an error or provides an alternative implementation. For this, it is best to use `__imp_func` variables: the API call then looks as usual, but we can choose whether to call the API or not. In any case, different solutions are possible, depending on the specific situation. – RbMm Jan 11 '23 at 13:42
  • @RbMm But that's pretty much what delay-loading does through (shall we call it) linker magic. The case how you handle if an API is not available is completely left to you. But when looking up a function through the helper callbacks nothing prevents you from offering a fallback with the exact same function signature. Not sure where you see the problem there. But agreed, different problems call for different solutions. My comment was mostly regarding Raymond's debate with you and trying to point out that there are ways to achieve this _within_ the boundaries of what we are handed. – 0xC0000022L Jan 11 '23 at 13:47
  • @0xC0000022L - there are different ways. Delay-load, in my opinion, is only an optimization, designed to avoid loading every potentially used DLL at startup: a DLL may never be used, or only be used some time later, so there is no need to load it at the beginning. And although it is possible to use it for this case too, in my opinion it is not the best way: delay-load is per DLL, not per API, and you always need some alternative implementation for the API, or must handle an exception. – RbMm Jan 11 '23 at 13:57

1 Answer

In this specific case, InterlockedIncrementNoFence64 is not imported from any DLL; it is implemented with a compiler intrinsic on most platforms (x86: via _InterlockedCompareExchange64; amd64: _InterlockedIncrement64; arm/arm64: _InterlockedIncrement64_nf). So code that uses this call will work on any Windows version, Windows 7 included.
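As a minimal sketch of what that means for the per-message counter in the question, the call can be used directly, with no version check. The helper name `bump_message_count` and the `msg_counter_t` typedef are hypothetical, and the non-Windows branch (a GCC/Clang relaxed atomic builtin) is only an assumed stand-in so the sketch compiles on other systems.

```c
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
typedef LONG64 msg_counter_t;
#else
typedef int64_t msg_counter_t;   /* stand-in type for non-Windows builds */
#endif

/* Hypothetical helper: atomically increments the counter without a full
   memory fence and returns the new value. */
static int64_t bump_message_count(volatile msg_counter_t *counter)
{
#ifdef _WIN32
    /* Resolved by the compiler/SDK headers; nothing is imported from a DLL,
       so no OS version check is needed. */
    return InterlockedIncrementNoFence64(counter);
#else
    /* Assumption: GCC/Clang builtin used as a rough equivalent elsewhere. */
    return __atomic_add_fetch(counter, 1, __ATOMIC_RELAXED);
#endif
}
```

Because the choice is made entirely at compile time, the same binary should behave identically on every Windows version it can otherwise run on.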

In the more general case, if we need several functions that are not exported on every OS version, we can declare each one as a function pointer and resolve it at run time. For example, take the LoadPackagedLibrary API.

We can declare a function pointer:

HMODULE (WINAPI * LoadPackagedLibraryAddr)(
                                   _In_       LPCWSTR lpwLibFileName,
                                   _Reserved_ DWORD   Reserved
                                   );

resolve it at run time:

*(void**)&LoadPackagedLibraryAddr = GetProcAddress(
    GetModuleHandleW(L"kernel32"), "LoadPackagedLibrary");

and use it:

if (LoadPackagedLibraryAddr)
{
    LoadPackagedLibraryAddr(L"***",0);
}
else {...}
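The three steps above can be put together in one small sketch. It uses `GetTickCount64` (absent before Windows Vista) with `GetTickCount` as the fallback; that pair is an illustrative choice and does not come from the answer. The names `resolve_kernel32`, `generic_fn`, `tick64_fn` and `tick_ms` are hypothetical. On non-Windows builds the resolver is a stub that always reports "not available", an assumption made only so the fallback path can be compiled and exercised elsewhere.

```c
#include <stddef.h>
#include <time.h>

typedef void (*generic_fn)(void);   /* hypothetical generic function-pointer type */

#ifdef _WIN32
#include <windows.h>
typedef ULONGLONG (WINAPI *tick64_fn)(void);

/* Look up an export of the already-loaded kernel32.dll; NULL if it is absent. */
static generic_fn resolve_kernel32(const char *name)
{
    HMODULE k32 = GetModuleHandleW(L"kernel32");
    return k32 ? (generic_fn)GetProcAddress(k32, name) : NULL;
}
#else
typedef unsigned long long (*tick64_fn)(void);

/* Stub for non-Windows builds: behaves as if the API were never available. */
static generic_fn resolve_kernel32(const char *name)
{
    (void)name;
    return NULL;
}
#endif

/* Millisecond tick count: uses the newer API when present, else a fallback.
   The lazy lookup is not thread-safe; a multi-threaded program would
   resolve the pointer once at startup instead. */
static unsigned long long tick_ms(void)
{
    static tick64_fn tick64;
    static int resolved;

    if (!resolved) {
        tick64 = (tick64_fn)resolve_kernel32("GetTickCount64");
        resolved = 1;
    }
    if (tick64)
        return tick64();            /* API exists on this OS version */
#ifdef _WIN32
    return GetTickCount();          /* older API, wraps after about 49.7 days */
#else
    return (unsigned long long)clock() * 1000ULL / CLOCKS_PER_SEC;
#endif
}
```

No OS version query appears anywhere: the presence or absence of the export decides which path runs.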

Another possible way, for functions declared with __declspec(dllimport) (most Windows APIs are declared with this prefix), is to use the following syntax:

extern "C" {
    PVOID __imp_LoadPackagedLibrary;
}

#ifdef _M_IX86 
__pragma(comment(linker, "/alternatename:__imp__LoadPackagedLibrary@8=___imp_LoadPackagedLibrary"))
#endif

__imp_LoadPackagedLibrary = GetProcAddress(
    GetModuleHandleW(L"kernel32"), "LoadPackagedLibrary");

#pragma warning(disable : 4551)
if (LoadPackagedLibrary)//if (__imp_LoadPackagedLibrary)
{
    LoadPackagedLibrary(L"***",0);
}

However, both approaches generate exactly the same binary code; only the syntax differs.

Note that there is no need to query the Windows version at all; simply try to get a pointer to the API. Either we get it and can use it, or we don't.

The approach with __declspec(selectany) or #pragma comment(linker, "/alternatename:_pWeakValue=_pDefaultWeakValue")

will not work here, because it resolves symbols at link time (so the result is the same for every Windows version), whereas we actually need to resolve the symbol at run time.

RbMm