5

What are the main existing approaches to hiding the value of literals in code, so that they are not easily traced with just a hex dumper or a decompiler?

For example, instead of coding this:

    static final int MY_VALUE = 100;

We could have:

    static final int MY_VALUE = myFunction1();

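    private static int myFunction1(){   // static, so the static initializer can call it
        int i = 23;
        i += 8 << 3;                    // 23 + 64 = 87
        for(int j = 0; j < 3; j++){
            i -= (j<<1);                // subtracts 0, 2, 4 -> 81
        }
        return myFunction2(i);
    }

    private static int myFunction2(int i){
        return i + 19;                  // 81 + 19 = 100
    }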

That was just an example of what we're trying to do. (Yes, I know, the compiler may optimize it and precalculate the constant).

Disclaimer: I know this will not provide any additional security at all, but it makes the code more obscure (or interesting) to reverse-engineer. The purpose of this is just to force the attacker to debug the program and waste time on it. Keep in mind that we're doing it just for fun.

Mister Smith
  • Your example is sufficient for hiding the actual value. The other half of the solution is to strip the executable to remove symbols, making it harder to reverse engineer (but also harder to debug), or at least use function names that do not show that you are attempting to hide something. – Simon C Sep 29 '11 at 08:38
  • Your example code is complex but unconditional. Adding conditions on (apparently) external factors might be a useful additional obfuscation technique. Like set a global to zero deep inside some initialization code, then check whether it's even and if not, do a different (useless, never reached) calculation on `i`. – tripleee Oct 10 '11 at 08:42

3 Answers

3

Since you're trying to hide text, which will be visible in the simple dump of the program, you can use some kind of simple encryption to obfuscate your program and hide that text from prying eyes.

Detailed instructions:

  1. Visit ROT47.com and encode your text online. You can also use this web site for a more generic ROTn encoding.
  2. Replace contents of your string constants with the encoded text.
  3. Use the decoder in your code to transform the text back into its original form when you need it. The ROT13 Wikipedia article contains some notes about implementation, and there is a JavaScript implementation of ROTn on Stack Overflow. It is trivial to adapt it to whatever language you're using.

Why use ROT47, which is notoriously weak encryption?

In the end, your code will look something like this:

    decryptedData = decryptStr(MY_ENCRYPTED_CONSTANT)
    useDecrypted(decryptedData)

No matter how strong your cipher, anybody equipped with a debugger can set a breakpoint on useDecrypted() and recover the plaintext. So, the strength of the cipher does not matter. However, using something like ROT47 has several distinct advantages:

  1. You can encode your text online, no need to write a specialized program to encode your text.
  2. Decryption is very easy to implement, so you don't waste your time on something that does not add any value to your customers.
  3. Anybody reading your code (your coworker, or yourself after 5 years) will know immediately that this is not real security, but security by obscurity.
  4. Your text will still appear as gibberish to anyone just prying inside your compiled program, so mission accomplished.
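
As a rough illustration of the decoding step, here is a minimal Java sketch. `decryptStr` and `MY_ENCRYPTED_CONSTANT` are the names from the pseudocode above; the class name `Rot47` and the plaintext "Hello" are made up for the example. ROT47 rotates the 94 printable ASCII characters from '!' to '~' by 47 positions, so the same function both encodes and decodes.

```java
final class Rot47 {
    // ROT47-encoded form of the (hypothetical) plaintext "Hello".
    static final String MY_ENCRYPTED_CONSTANT = "w6==@";

    // Rotates every printable ASCII character ('!'..'~') by 47 positions
    // within that 94-character range; other characters pass through unchanged.
    static String rot47(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int k = 0; k < input.length(); k++) {
            char c = input.charAt(k);
            if (c >= '!' && c <= '~') {
                c = (char) ('!' + (c - '!' + 47) % 94);
            }
            out.append(c);
        }
        return out.toString();
    }

    // ROT47 is its own inverse, so decoding is just another rotation.
    static String decryptStr(String encoded) {
        return rot47(encoded);
    }

    public static void main(String[] args) {
        System.out.println(decryptStr(MY_ENCRYPTED_CONSTANT)); // prints Hello
    }
}
```

Only the encoded form appears in the compiled class; the plaintext exists in memory once `decryptStr` has run, which is exactly where a debugger breakpoint would find it.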
haimg
  • I was thinking of text rather than ints. Rot13 seems to me too standard. – Mister Smith Oct 05 '11 at 07:23
  • Since you cannot really hide these strings from someone equipped with a debugger, you're only hiding them from casual, curious eyes. So any weak encryption will do there. Don't want a standard Rot13? Add an extra XOR or something... – haimg Oct 05 '11 at 13:39
  • I think this is the only generic solution. Applying a transformation to text literals, such as RotN or similar. Same with ints. – Mister Smith Oct 10 '11 at 07:29
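
The XOR idea from the comments carries over to int literals. The following is only a sketch under assumptions: the class name `MaskedInt`, the mask value, and the helper `unmask` are all hypothetical. One detail worth noting is that the fields are deliberately not `final`: a `static final int` with a constant initializer is a compile-time constant in Java, and javac would fold the XOR back into the plain literal 100.

```java
final class MaskedInt {
    // Hypothetical mask. Not final on purpose, so that javac does not treat it
    // as a compile-time constant and fold the XOR below into the literal 100.
    private static int mask = 0x5A17C3E9;

    // 100 ^ 0x5A17C3E9, computed offline; only this masked form is stored.
    private static int maskedMyValue = 0x5A17C38D;

    // Initialized at class-load time, after the two fields above.
    static final int MY_VALUE = unmask();

    private static int unmask() {
        return maskedMyValue ^ mask;
    }

    public static void main(String[] args) {
        System.out.println(MY_VALUE); // prints 100
    }
}
```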
3

Run some Game of Life variant for a large number of iterations, and then make control flow decisions based on the final state vector.

If your program is meant to actually do something useful, you could have your desired branches planned ahead of time and choose bits of the state vector to suit ("I want a true here, bit 17 is on, so make that the condition...").
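
A minimal Java sketch of this idea, under simplifying assumptions: the class name `LifeBranch`, the 8x8 wrap-around grid, and the seed are all invented for illustration. The seed is a plain blinker (a period-2 oscillator) so that the final state is easy to check by hand; a real use would presumably pick a messier seed and record the final state offline before choosing which cells to branch on.

```java
final class LifeBranch {
    static final int SIZE = 8; // grid wraps around at the edges (torus)

    // One generation of Conway's Game of Life.
    static boolean[][] step(boolean[][] g) {
        boolean[][] next = new boolean[SIZE][SIZE];
        for (int r = 0; r < SIZE; r++) {
            for (int c = 0; c < SIZE; c++) {
                int n = 0; // live neighbours of (r, c)
                for (int dr = -1; dr <= 1; dr++) {
                    for (int dc = -1; dc <= 1; dc++) {
                        if (dr == 0 && dc == 0) continue;
                        if (g[(r + dr + SIZE) % SIZE][(c + dc + SIZE) % SIZE]) n++;
                    }
                }
                // Live cells survive with 2 or 3 neighbours; dead cells are born with 3.
                next[r][c] = g[r][c] ? (n == 2 || n == 3) : (n == 3);
            }
        }
        return next;
    }

    // Seeds a horizontal blinker in row 3 and runs the given number of generations.
    static boolean[][] run(int generations) {
        boolean[][] g = new boolean[SIZE][SIZE];
        g[3][2] = g[3][3] = g[3][4] = true;
        for (int i = 0; i < generations; i++) {
            g = step(g);
        }
        return g;
    }

    public static void main(String[] args) {
        boolean[][] state = run(1000);
        // After an even number of generations the blinker is horizontal again,
        // so cell (3, 2) is alive and this condition is always true.
        if (state[3][2]) {
            System.out.println("real branch");
        } else {
            System.out.println("decoy branch"); // never reached
        }
    }
}
```

The condition looks data-dependent to a decompiler, yet its outcome was fixed when the seed and generation count were chosen.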

phs
0

You could also use some part of the compiled code as data, then modify it a little. This would be hard to do in a program executed by a virtual machine, but is doable in languages like assembly or C.

mateusz.fiolka
  • No, only read some operation instruction or argument (hiding data in a program was the question). Self-modification of code would also help with obfuscation, but it's not related to the question. – mateusz.fiolka Sep 29 '11 at 08:44