
I have a library of C functions that I optimized internally using SIMD intrinsics. These functions all look something like this:

void add_array(...) {
#if defined(USE_SIMD)
  // SIMD code here ...
#else 
  // Scalar code here ...
#endif
}

Each function lives in its own file, for example an add.c file here.

Now, I would like to ensure that both variants of each function are functionally equivalent. I found that simply generating random (but valid) input for both variants and comparing the results suffices for my application. I think this is called Monkey Testing. The scalar code (or rather its output values) acts as a golden reference. (Formal verification is not an option because of the SIMD intrinsics.)

However, I have not found a scalable and sustainable way to run these tests from a C testing framework. So far, I have manually copied the vector code into an extra function add_array_vector() and then run both one after another from the same main function (the test harness), comparing the "golden reference output" from add_array() with the values from the add_array_vector() variant. But this approach does not scale, since I have more than 100 of these functions that all use the #if / #else approach internally. Since I have to run all code in a simulator (or on a bare-metal embedded device), I also can't interact with a file system. I need a single test binary that contains all tests and test data. It has to report its results via a printf (UART) call.

What I see as my only option is to compile the functions twice: once without USE_SIMD defined and once with it defined. Then I would need to link these two function variants into the same main (my test harness). However, how do I ensure that both variants have different function names? Is there a way I can "name mangle" the USE_SIMD define into the function name? And how would I link them?

Maybe I am completely on the wrong track here and there is a far simpler way to solve this. I surely can't be the first person who came across this core issue: ensuring that two variants of the same C (or C++) function are functionally equivalent.

Any help is greatly appreciated. Thanks

EDIT: I can't afford to print the numeric results via printf (or UART), as this is a serious bottleneck in the randomized brute-force approach. It reduces the number of iterations / tests I can run per second by multiple orders of magnitude. Printing the final outcome or an error, if one occurs, is fine. Printing every numerical test result value for "external validation" is not sustainable.
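The constraint above (compare in memory, print only a failure or the final outcome) can be sketched as follows. This is only an illustration under stated assumptions: `add_array_ref` and `add_array_opt` are hypothetical stand-ins for the scalar and SIMD variants, the `uint32_t` signature is invented because the real parameter list is not shown, and the xorshift32 generator is just one possible choice of a deterministic, seedable source of inputs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_LEN 64u

/* Hypothetical stand-in for the scalar golden reference. */
static void add_array_ref(const uint32_t *a, const uint32_t *b,
                          uint32_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Hypothetical stand-in for the optimized variant. It is merely unrolled
 * four times here; the real one would use SIMD intrinsics. */
static void add_array_opt(const uint32_t *a, const uint32_t *b,
                          uint32_t *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        out[i]     = a[i]     + b[i];
        out[i + 1] = a[i + 1] + b[i + 1];
        out[i + 2] = a[i + 2] + b[i + 2];
        out[i + 3] = a[i + 3] + b[i + 3];
    }
    for (; i < n; ++i)          /* remaining tail elements */
        out[i] = a[i] + b[i];
}

/* Small deterministic PRNG, so a failing run can be reproduced from the seed. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Runs both variants on the same random inputs and compares the outputs in
 * memory. Prints only the first mismatch and a one-line summary.
 * Returns the number of mismatching iterations. */
unsigned long run_equivalence_tests(uint32_t seed, unsigned long iterations)
{
    uint32_t a[MAX_LEN], b[MAX_LEN], ref[MAX_LEN], opt[MAX_LEN];
    uint32_t state = seed;
    unsigned long failures = 0;

    assert(seed != 0);  /* xorshift32 never leaves the all-zero state */

    for (unsigned long it = 0; it < iterations; ++it) {
        size_t n = 1 + xorshift32(&state) % MAX_LEN;   /* length in 1..MAX_LEN */
        for (size_t i = 0; i < n; ++i) {
            a[i] = xorshift32(&state);
            b[i] = xorshift32(&state);
        }
        add_array_ref(a, b, ref, n);
        add_array_opt(a, b, opt, n);
        if (memcmp(ref, opt, n * sizeof ref[0]) != 0) {
            if (failures == 0)
                printf("MISMATCH at iteration %lu (n=%lu)\n",
                       it, (unsigned long)n);
            ++failures;
        }
    }
    printf("%lu iterations, %lu mismatches\n", iterations, failures);
    return failures;
}
```

A harness `main` would call `run_equivalence_tests` once per function under test, so the UART sees one summary line per function rather than every value. For floating-point data a bitwise `memcmp` may be too strict, in which case a tolerance-based comparison would probably be needed instead.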

Fabian
  • Shared or static library? – Mad Physicist Jul 02 '22 at 23:03
  • Static. I am using CMake `add_library(${CMAKE_PROJECT_NAME} STATIC)` – Fabian Jul 02 '22 at 23:05
  • Write one set of tests with a deterministically seeded rng. Compile once with the define and once without. Run both programs. Run cmp on the output – Mad Physicist Jul 02 '22 at 23:05
  • Exactly, this is the approach I would have taken if I were running my code natively. However, I have one additional constraint: I am running the code inside a simulator (or on an embedded device). Thus, I can only run a single program that needs to handle everything internally. I would need to transfer all test results via printfs from the simulator (or via UART from the embedded device). – Fabian Jul 02 '22 at 23:08
  • Then run the simulator twice and pipe the output to UART, i2c, spi, or whatever you're using – Mad Physicist Jul 02 '22 at 23:09
  • I would need to transfer all test results via printfs from the simulator (or via UART from the embedded device). This would require parsing the simulator output with an additional script, and it would drastically limit the number of tests I can run per second. The printfs (UART) are a serious bottleneck for this randomized "bruteforce" approach. – Fabian Jul 02 '22 at 23:11
  • Unless your function names are different in the two cases, you're stuck. If you're willing to name the functions differently and create a macro to rename the simd version to the scalar version at compile time, you can run each test twice in the same executable – Mad Physicist Jul 02 '22 at 23:15
  • `What I see as my only option is to compile the functions twice` Do that. `Then I would need to link these two function variants into the same main` Yes. You have two completely separate programs, with different configuration. `I am using CMake` Write a `CMAKE_CROSSCOMPILING_EMULATOR` that runs an executable, do `add_library(your_lib_config_1` and `add_library(your_lib_config_2` etc. for each configuration you want to test, and create separate executables for every configuration combination. – KamilCuk Jul 02 '22 at 23:21
  • @Mad Physicist I see. This is certainly not ideal, but it is a viable option. Thank you very much! – Fabian Jul 02 '22 at 23:21
  • This can [easily] be done with a script that scans all source files looking for all functions that have `#if defined(USE_SIMD)` in them. It can then generate (e.g.) `-Dadd_array=add_array_simd` (or put `#define add_array add_array_simd` into a `.h` file) and force the given files to be compiled twice, once normally and once with `-DUSE_SIMD`. It then generates a `test.c` that calls all the versions of the functions. – Craig Estey Jul 02 '22 at 23:21
  • @CraigEstey. That sounds like a lot of error prone work. Just give simd functions a different name and alias them in the header – Mad Physicist Jul 02 '22 at 23:24
  • @MadPhysicist I would do: `void add_array_std() { ... } void add_array_simd() { ... }` possibly with `inline void add_array() { #ifdef USE_SIMD add_array_simd(); #else add_array_std(); #endif }` But, the amount of editing necessary is comparable. As to "error prone", I would say no because I've done such things a zillion times successfully. The editing could be aided with the use of `unifdef` to generate two versions. AFAICT, OP's route involves a similar amount of [script] work no matter which option is chosen. – Craig Estey Jul 02 '22 at 23:38
  • @CraigEstey. I would do `#ifdef USE_SIMD \n #define SUFFIX _simd \n #else \n #define SUFFIX \n #endif` in the private header, then name the function `void add_array##SUFFIX(...)` and in the public header, do `#ifdef USE_SIMD \n #define add_array add_array_simd \n #endif` – Mad Physicist Jul 03 '22 at 00:01
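The suffix idea from the comments can be sketched roughly as follows. Note that `##` only works inside a macro replacement list, and a suffix macro needs one extra level of expansion before pasting, so `add_array##SUFFIX` cannot be written directly in a declaration. The helper names `PASTE_`, `PASTE`, `VARIANT_SUFFIX` and `VARIANT` are hypothetical, and the `float` signature is an assumption, since the original parameter list is elided.

```c
#include <assert.h>
#include <stddef.h>

/* Two-level paste: the outer macro expands VARIANT_SUFFIX before ## runs. */
#define PASTE_(a, b) a##b
#define PASTE(a, b)  PASTE_(a, b)

#ifdef USE_SIMD
#define VARIANT_SUFFIX _simd
#else
#define VARIANT_SUFFIX _scalar
#endif
#define VARIANT(name) PASTE(name, VARIANT_SUFFIX)

/* Prototypes the test harness would see for the two builds of this file. */
void add_array_scalar(const float *a, const float *b, float *out, size_t n);
void add_array_simd(const float *a, const float *b, float *out, size_t n);

/* Expands to add_array_simd when compiled with -DUSE_SIMD,
 * and to add_array_scalar otherwise. */
void VARIANT(add_array)(const float *a, const float *b, float *out, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL && out != NULL));
#ifdef USE_SIMD
    /* SIMD code here ... */
    (void)a; (void)b; (void)out; (void)n;
#else
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Under this scheme each source file would be compiled twice, once with `-DUSE_SIMD` and once without, and both object files linked into the single test binary, where the harness calls `add_array_scalar` and `add_array_simd` on the same inputs. Production code could keep the plain `add_array` name through a `#define` in the public header, as suggested in the comment above.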

0 Answers