How to obfuscate string constants?

Question

We have an application which contains sensitive information and I'm trying my best to secure it. The sensitive information includes:

The main algorithm
The keys for an encryption/decryption algorithm

I've been looking at obfuscating the code, but it doesn't seem to help much as I can still decompile it. However, my biggest concern is that the keys used for encryption of serial numbers etc. are clearly visible when you decompile the code, even when it's obfuscated.

Can anyone suggest how I can secure these strings?

I realise one option might be to remove any decryption from the application itself. While this may be possible in part, there are some features which have to use encryption/decryption, mainly to save a config file and to pass an 'authorisation' token to a DLL to perform a calculation.

You are asking for help building a security system without specifying the most important thing: **what is the threat you are attempting to secure the code against?** Describe the threat. That said, it sounds like you are going down a very wrong path here. If you don't want the user to have your precious secret algorithms **don't sell them a device that implements that algorithm**. Keep the algorithm on your servers. — Eric Lippert, May 16 '11 at 14:08

Mark Booth · Answer 1 · 2020-03-02T15:28:52.340

There are ways to do what you want, but it isn't cheap and it isn't easy.

Is it worth it?

When looking at whether to protect software, we first have to answer a number of questions:

1. How likely is this to happen?
2. What is the value to someone else of your algorithm and data?
3. What is the cost to them of buying a license to use your software?
4. What is the cost to them of replicating your algorithm and data?
5. What is the cost to them of reverse engineering your algorithm and data?
6. What is the cost to you of protecting your algorithm and data?

If these produce a significant economic imperative to protect your algorithm/data then you should look into doing it. For instance if the value of the service and cost to customers are both high, but the cost of reverse engineering your code is much lower than the cost of developing it themselves, then people may attempt it.

So, this leads on to your question

How do you secure your algorithm and data?

Discouragement

Obfuscation

The option you suggest, obfuscating the code, messes with the economics above: it tries to significantly increase the cost to them (5 above) without increasing the cost to you (6) very much. The Center for Encrypted Functionalities has done some interesting research on this. The problem is that, as with DVD encryption, it is doomed to failure: if there is enough of a differential between 3, 4 and 5, then eventually someone will do it.

Detection

Another option might be a form of steganography, which allows you to identify who decrypted your data and started distributing it. For instance, if you have 100 different float values as part of your data, and a 1-bit error in the LSB of each of those values wouldn't cause a problem with your application, encode a unique (to each customer) identifier into those bits. The problem is that if someone has access to multiple copies of your application data, it would be obvious that they differ, making it easier to identify the hidden message.
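A minimal Python sketch of the LSB idea, assuming the data are finite IEEE 754 doubles and the customer identifier needs no more bits than there are values. The names `embed_id` and `extract_id` are hypothetical, chosen only for illustration.

```python
import struct


def _float_to_bits(x: float) -> int:
    # Reinterpret an IEEE 754 double as a 64-bit unsigned integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def _bits_to_float(b: int) -> float:
    # Inverse of _float_to_bits.
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def embed_id(values, customer_id: int):
    """Return a copy of values with bit i of customer_id stored in the LSB of values[i]."""
    marked = []
    for i, v in enumerate(values):
        bit = (customer_id >> i) & 1
        # Clear the lowest mantissa bit, then set it to the identifier bit.
        marked.append(_bits_to_float((_float_to_bits(v) & ~1) | bit))
    return marked


def extract_id(values) -> int:
    """Recover the identifier by reading the lowest mantissa bit of each value."""
    result = 0
    for i, v in enumerate(values):
        result |= (_float_to_bits(v) & 1) << i
    return result
```

Each marked value differs from the original by at most one unit in the last place, so the identifier survives only if the data are stored and distributed bit-for-bit.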

Protection

SaaS - Software as a Service

A more secure option might be to provide the critical part of your software as a service, rather than include it in your application.

Conceptually, your application would collect up all of the data required to run your algorithm, package it up as a request to a server (controlled by you) in the cloud, your service would then calculate your results and pass it back to the client, which would display it.

This keeps all of your proprietary, confidential data and algorithms within a domain that you control completely, and removes any possibility of a client extracting either.
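The request/response split can be sketched as follows. This is an illustrative Python outline only: `_secret_algorithm`, `handle_request`, `build_request` and `parse_response` are hypothetical names, the sum-of-squares body is a placeholder for the real calculation, and the transport (HTTPS, authentication, error handling) is left out.

```python
import json


# --- Server side: runs only on infrastructure you control ---

def _secret_algorithm(samples):
    # Placeholder standing in for the proprietary calculation.
    return sum(s * s for s in samples)


def handle_request(body: str) -> str:
    # Decode the client's request, run the algorithm, return only the result.
    request = json.loads(body)
    result = _secret_algorithm(request["samples"])
    return json.dumps({"result": result})


# --- Client side: ships to the customer, contains no algorithm or keys ---

def build_request(samples) -> str:
    # Package the collected input data for the server.
    return json.dumps({"samples": list(samples)})


def parse_response(body: str):
    # Extract the computed result for display.
    return json.loads(body)["result"]
```

In this arrangement the client binary contains only the packaging and display code, so decompiling it reveals the request format but not the calculation.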

The obvious downside is that clients are tied into your service provision, are at the mercy of your servers and their internet connection. Unfortunately many people object to SaaS for exactly these reasons. On the plus side, they are always up to date with bug fixes, and your compute cluster is likely to be higher performance than the PC they are running the user interface on.

This would be a huge step to take though, and could have a huge cost (6 above), but it is one of the few ways to keep your algorithm and data completely secure.

Software Protection Dongles

Although traditional Software Protection Dongles would protect from software piracy, they wouldn't protect against algorithms and data in your code being extracted.

Newer Code Porting dongles (such as SenseLock†) appear to be able to do what you want though. With these devices, you take code out of your application and port it to the secure dongle processor. As with SaaS, your application would bundle up the data, pass it to the dongle (probably a USB device attached to your computer) and read back the results.

Unlike SaaS, data bandwidth would be unlikely to be an issue, but performance of your application may be limited by the performance of your SPD (software protection dongle).

† This was the first example I could find with a Google search.

Trusted platform

Another option, which may become viable in the future, is to use a Trusted Platform Module and Trusted Execution Technology to secure critical areas of the code. Whenever a customer installs your software, they would provide you with a fingerprint of their hardware and you would provide them with an unlock key for that specific system.

This key would then allow the code to be decrypted and executed within the trusted environment, where the encrypted code and data would be inaccessible outside of the trusted platform. If anything at all about the trusted environment changed, it would invalidate the key and that functionality would be lost.

For the customer this has the advantage that their data stays local, and they don't need to buy a new dongle to improve performance, but it has the potential to create an ongoing support requirement and the likelihood that your customers would become frustrated with the hoops they had to jump through to use software they have bought and paid for - losing you good will.

Conclusion

What you want to do is not simple or cheap. It could require a big investment in software, infrastructure or both. You need to know that it is worth the investment before you start along this road.

Do you lock your house door when you leave for work? Your argument is that it's useless to lock your house door because someone can knock it down and get in anyway. Whilst you can always break in, that's no reason to make the potential intruder's life easy. — Damien, Sep 14 '22 at 13:05
Leaving proprietary information in plaintext is like leaving your door unlocked. Obfuscation is like locking your door, but leaving the key under the door mat, or in the plant pot. Everything an intruder needs to gain access to your house without breaking in is right there. SaaS or TP on the other hand have the potential to make your door so secure that an intruder would need longer than the lifetime of the universe to bash it down with current technology! — Mark Booth, Sep 16 '22 at 10:02

Tom Gullen · Accepted Answer · 2011-05-16T13:43:40.510

All efforts will be futile if someone is motivated enough to break it. No one has managed to figure this out yet, even the biggest software companies.

I'm trying my best to secure it

I'm not saying this as a scathing criticism; you just need to be aware that what you're trying to achieve is currently assumed to be impossible.

Obfuscation is security through obscurity. It does have some benefit, as it will deter the most incompetent of hacker attempts, but largely it is wasted effort that could perhaps be better spent in other areas of development.

In answer to your original question, you are going to run into problems with intelligent compilers: they might automatically piece the string together in the compiled application, removing some of your obfuscation efforts as a compilation optimisation. It would be hard to maintain as well, so I would reconsider your risk analysis model and perhaps resign yourself to the fact that it can be cracked and, if it has any value, probably will be.

+1, also even if you obfuscate it in the assembly it will become trivial to pull out with a debugger unless you start jumping through even more hoops (all of which are normally simple to work around) — ShuggyCoUk, May 16 '11 at 15:30
Given recent developments in trusted execution technology, I don't believe that this answer is strictly speaking correct these days. I have updated [my answer](http://stackoverflow.com/a/6018904/42473) with details. — Mark Booth, Oct 01 '15 at 13:52

score 14 · Answer 3 · answered Oct 02 '13 at 05:56

I recently read a very simple solution to OP.

Simply declare your constants as readonly string, not const string. It's that simple. Apparently const variables get written to a stack area in the binary as plain text, whereas readonly strings get added to the constructor and written as a byte array instead of text.

I.e. If you search for it, you won't find it.

That was the question, right?

MyBad Studios

Agreed. I've used this technique when obfuscating, and you no longer see constants within the decompiled code. While the string is obviously still *somewhere*, it's a lot trickier to find, deterring casual hackers and nosy devs. Every little helps. – Andrew Stephens Jan 22 '16 at 12:05
This is an intredasting tidbit of knowledge. Take an upvote. – Krythic May 22 '16 at 04:10
This is a very simple tip with zero cons that I can think of. – rollsch Feb 24 '17 at 01:13

vgru · Answer 4 · 2011-05-16T13:53:02.840

Using a custom algorithm (security through obscurity?), combined with storing the key inside the application, is simply not secure.

If you are storing some kind of a password, then you can use a one-way hashing function to ensure that decrypted data is unavailable anywhere in your code.

If you need to use a symmetric encryption algorithm, use a well known and tested one, like AES-256. But the key obviously cannot be stored inside your code.

[Edit]

Since you mentioned encryption of serial numbers, I believe a one-way hashing function (like SHA-256) would really suit your needs better.

The idea is to hash your serial numbers during build time into their hashed representations, which cannot be reversed (SHA-256 is considered to be a pretty safe algorithm, compared to, say, MD5). During run time, you only need to apply the same hash function to the user input, and compare hashed values only. This way none of the actual serial numbers are available to the attacker.
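The build-time/run-time split described above might look like this in Python. It is a sketch under assumptions: the serial numbers are made-up placeholders, `hash_serial` and `is_valid_serial` are hypothetical names, and in a real build only the precomputed digests would be embedded, not the serials themselves.

```python
import hashlib
import hmac


def hash_serial(serial: str) -> str:
    # Normalise so that case and surrounding whitespace don't change the digest.
    normalised = serial.strip().upper().encode("utf-8")
    return hashlib.sha256(normalised).hexdigest()


# Build time: a build script would emit only these digests into the application.
# The plaintext serials appear here purely to keep the example self-contained.
VALID_SERIAL_HASHES = frozenset(
    hash_serial(s) for s in ("ABCD-1234-EFGH", "WXYZ-9876-QRST")
)


def is_valid_serial(user_input: str) -> bool:
    # Run time: hash the user's input and compare digests only.
    candidate = hash_serial(user_input)
    # compare_digest avoids leaking information through comparison timing.
    return any(hmac.compare_digest(candidate, h) for h in VALID_SERIAL_HASHES)
```

Note that if the space of possible serial numbers is small, an attacker can still enumerate candidates and hash each one, so this helps most when serials are long and random.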

score 5 · Answer 5 · answered May 16 '11 at 13:59

@Tom Gullen has given a proper answer.

I merely have some suggestions on how you can make it harder for users to access your keys and algorithm.

As for the algorithm: Do not compile your algorithm at compile time, but at runtime. To be able to do this you need to specify an interface which contains the methods for the algorithm. The interface is used to run it. Then add the source code for the algorithm as an encrypted string (embedded resource). Decrypt it at runtime and use CodeDom to compile it into a .NET class.

Keys: The usual way is to spread parts of your key across different places in the application. Store each part as byte[] instead of string to make it a bit harder to find them.
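One possible way to split a key into parts is sketched below in Python. Rather than storing plain slices of the key, this variant uses XOR shares, so that no single part reveals any key bytes; that choice is an assumption of this example, not something prescribed above. `split_key` and `join_key` are hypothetical names.

```python
import secrets
from functools import reduce


def _xor(a: bytes, b: bytes) -> bytes:
    # Byte-wise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, parts: int = 3) -> list:
    """Build time: split key into `parts` shares whose XOR equals the key."""
    if parts < 2:
        raise ValueError("need at least two shares")
    # All but the last share are random; the last one makes the XOR come out right.
    shares = [secrets.token_bytes(len(key)) for _ in range(parts - 1)]
    shares.append(reduce(_xor, shares, key))
    return shares


def join_key(shares) -> bytes:
    """Run time: recombine the shares into the original key."""
    return reduce(_xor, shares)
```

Each share would be stored as a separate byte array in a different part of the application. This only raises the effort of static inspection: once `join_key` runs, the full key exists in memory.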

If all your users have an internet connection: Fetch the algorithm source code and the keys using SSL instead.

Note that everything will be pieced together at runtime, so anyone with a bit more knowledge can inspect/debug your application to find everything.

You could also store the 'secret' assemblies themselves as encrypted embedded resources; you can then load them at runtime as byte arrays — aL3891, May 16 '11 at 20:27

score 1 · Answer 6 · answered May 16 '11 at 13:38

I don't think you can easily obfuscate string constants, so if possible, don't use them :) You can use assembly resources instead; those you can encrypt however you want.

aL3891

score 1 · Answer 7 · answered May 16 '11 at 13:46

Depends what you're trying to do but can you use asymmetric encryption? That way you only need to store public keys with no need to obfuscate them.

Tim Rogers

Asymmetric encryption would imply that the public key must ship with the product, and therefore must be the key used to decrypt the program. You could guarantee authenticity of your program this way, but you can't stop someone from decoding it. – riwalk Aug 31 '11 at 12:37