
I have a string literal in my program, and I'm trying to create an amateur checksum to ensure the string literal was not replaced in the portable executable.

To do this, I create a hash of the string literal and store it as an integer literal in the program. Now I have two literals, one for the string and one for the hash.

In my code, I implement the checksum with a function that hashes the string literal in the same way: I compute a new hash at runtime and check it against the hash literal.

The problem, of course, is that with compiler optimizations the runtime hash may be pre-computed; then I'm checking a hash literal against a hash literal, and the check will always return true.

So I'm looking for a trick to make the compiler think the string literal is a dynamic string that could be anything, so that it will not do the constant folding optimization on the runtime hash and my code will work correctly.

StayOnTarget
Thomas
  • Use a function to calculate the hash, supply the string as a parameter. – wimh Sep 06 '15 at 17:04
  • @Wimmel you think compilers don't optimise across calls? Haha. –  Sep 06 '15 at 17:05
  • @elyse and if the function is in a separate compilation unit? Or is the linker also able to optimize that? – wimh Sep 06 '15 at 17:07
  • @Wimmel: Heard about LTO? – too honest for this site Sep 06 '15 at 17:08
  • 1
    Let the hashing function take two hours to complete, the compiler will think twice before precomputing... [More seriously, unless the hash is really trivial, I don't believe in that scenario.] –  Sep 06 '15 at 17:10
  • 1
    @Wimmel: LLVM can even optimize through calls through function pointers sometimes nowadays, but if you do obscure enough trickery with the pointers you can probably confuse it. The volatile trick is the way to go to prevent optimizations. – Chris Beck Sep 06 '15 at 17:12
  • What makes you think that other people would even try to patch your executable? It is much more probable that they won't care and that very few people (and perhaps nobody) would use your program. – Basile Starynkevitch Sep 06 '15 at 17:15
  • 1
    NVM. After reading an answer, I now see what you mean by *I'm trying to create an amateur checksum to ensure the string literal was not replaced in the portable executable*. I wonder: if they can replace the string literal, why wouldn't they also replace the hash and fool your entire check? – Johannes Schaub - litb Sep 06 '15 at 17:33
  • If you are seriously concerned about security, you'd probably be better off with a third-party tool to secure your exe from modification. Any simple solutions you use in-language could probably be circumvented by anyone who seriously wants to. – Neil Kirk Sep 06 '15 at 18:04
  • @NeilKirk: what makes you believe that third-party tools won't be circumvented? – Basile Starynkevitch Sep 06 '15 at 18:12
  • 3
    @BasileStarynkevitch Nothing, but they are less likely to be than a homebrew solution by a non-expert. – Neil Kirk Sep 06 '15 at 18:23

1 Answer


You could declare the string as a const volatile array, e.g.

const volatile char myliteral[] = "some literal string";

You could also compute the hash at build time, e.g. with a step in your build procedure that extracts the relevant strings and computes their hashes separately.

Finally, if the string and its hash are in two different translation units (e.g. in file1.c and file2.c), the compiler needs link-time optimization to inline and precompute across them at build time. With current GCC (i.e. GCC 4.9 or 5) you must explicitly pass -flto at both compile time and link time to get link-time optimization, so unless you do that explicitly (e.g. with CC=gcc -flto -O2 in your Makefile), it won't happen.

BTW, you might checksum the entire executable, or an entire shared library, or some given object file. Details are OS-specific. On Linux see dlopen(3), dlsym(3), dladdr(3), dl_iterate_phdr(3), elf(5), proc(5).

Also, you could hash a random suffix of the initial literal (e.g. hash myliteral+random()%strlen(myliteral) at runtime) and compare the result against a constant array of such partial hashes. The compiler is very unlikely to precompute all of that!

I actually believe that this is not a real issue in practice: nobody would care about your executable, and nobody would spend time decompiling it.

BTW, you could generate a separate __timestamp.c file containing timestamp and checksum information (I am doing that in my bismon project as of summer 2018), and link it with your executable.

Basile Starynkevitch
  • 1
    Closely related, you can cast the result of the checksum function call itself to `volatile void` to ensure that it doesn't get optimized away: `static_cast<volatile void>(check_checksum(literal, hash));`. You could also compute the hash using constexpr functions, if you are in C++11 – Chris Beck Sep 06 '15 at 17:06
  • any resources/tools you know of where I can learn how to checksum entire executable or shared library for windows? – Thomas Sep 06 '15 at 17:16
  • I never used Windows, but (at least on Linux) during its build [GCC](http://gcc.gnu.org/) is computing its own checksum, so you could look inside its `Makefile`-s – Basile Starynkevitch Sep 06 '15 at 17:17
  • Is tampering with the executable the only way to test that the volatile method worked? – Thomas Sep 06 '15 at 17:31
  • Did I really just see const volatile? I think my brain just had an off by 1 error. I would probably just put the const in a separate compilation unit, but +1 for making me go read the standards. What next, a static extern function? – technosaurus Aug 20 '18 at 19:28
  • AFAIK `const volatile` is licit; intuitively the compiler won't touch the constant, but should expect external causes to change it (so should not cache it in registers for a long time) – Basile Starynkevitch Aug 20 '18 at 19:29