1

Will any encryption scheme safely allow me to encrypt the same integer repeatedly, with different random material prepended each time? It seems like the kind of operation that might get me in hot water.

I want to prevent spidering of items at my web application, but still have persistent item IDs/URLs so content links don't expire over time. My security requirements aren't really high for this, but I'd rather not do something totally moronic that obviously compromises the secret.

// performed on each ID before transmitting item search results to the client
public int64 encryptWithRandomPadding(int32 id) {
    int32 randomPadding = getNextRandomInt32();
    return encrypt(((int64)randomPadding << 32) + id), SECRET);
}

// performed on an encrypted/padded ID for which the client requests details
public int32 decryptAndRemoveRandomPadding(int64 idToDecrypt) {
    int64 idWithPadding = decrypt(idToDecrypt, SECRET);
    return (int32)idWithPadding;
}

static readonly string SECRET = "thesecret";

Generated IDs/URLs are permanent, the encrypted IDs are sparsely populated (less than 1 in uint32.Max are unique, and I could add another constant padding to reduce the likelyhood of a guess existing), and the client may run the same search and get the same results with different representative IDs each time. I think it meets my requirements, unless there's a blatant cryptographic issue.

Example:

 encrypt(rndA + item1)   -> tokenA
 encrypt(rndB + item1)   -> tokenB
 encrypt(rndC + item2)   -> tokenC
 encrypt(rndD + item175) -> tokenD

Here, there is no way to identify that tokenA and tokenB both point to identical items; this prevents a spider from removing duplicate search results without retrieving them (while retrieving increments the usage meter). Additionally, item2 may not exist.

Knowing that re-running a search will return the same int32 padded multiple ways with the same secret, can I do this safely with any popular crypto algorithms? Thanks, crypto experts!

note: this is a follow-up to a question that didn't work out as I'd hoped: Encrypt integer with a secret and shared salt

Community
  • 1
  • 1
shannon
  • 8,664
  • 5
  • 44
  • 74
  • If the IDs do not change or expire, anyone can simply remember the ones they've seen and you can't make them forget. You can assign IDs randomly out of a large set of items to make them hard to list, but that will not make it hard to remember them. You can, of course, limit how many IDs can be requested from one IP during any given time period. Does that help any? – Qsario Jul 29 '12 at 04:39
  • @qsario: Thank you. I'm actually ok with people remembering the ones they've seen. Because millions of URLs will represent the same item, and millions of URLs will be invalid, the URL doesn't help in spidering my content. I do actually plan, in addition, to implement a daily quota for inspecting items. – shannon Jul 29 '12 at 05:01

1 Answers1

2

If your encryption is secure, then random padding makes cracking neither easier nor harder. For a message this short, a single block long, either everything is compromised or nothing is. Even with a stream cipher, you'd still need the key to get any further; the point of good encryption is that you don't need extra randomness. Zero padding or other known messages at least a block long at the beginning are obviously to be avoided if possible, but that's not the issue here. It's pure noise, and once someone discovered that, they'd just skip ahead and start cracking from there.

Now, in a stream cipher, you can add all the randomness in the beginning and the later bytes will still be the same with the same key, don't forget that. This only actually does anything at all for a block cipher, otherwise you'd have to xor the random bits into the real value to get any use out of it.

However, you might be better off using a MAC as padding: with proper encryption, the encrypted mac won't give any information away, but it looks semi-randomish and you can use it to verify that there were no errors or malicious attacks during decryption. Any hash function you like can create the MAC, even a simple CRC-32, without giving anything away after encryption.

(A cryptographer might find a way to shave a bit or two off due to the relatedness, will tons of plaintexts if they knew beforehand how they were related, but that's still far beyond practicality.)

As you asked before, you can safely throw in an unecrypted salt in front of every message; a salt can only compromise an encrypted value if the implementation is broken or the key compromised, as long as the salt is properly mixed into the key, particularly if you can mix it into the expanded key schedule before decryption. Modern hash algorithms with lots of bits are really good at that, but even mixing into a regular input key will always have the same security as the key alone.

SilverbackNet
  • 2,076
  • 17
  • 29
  • The randomness isn't intended to aid the *encryption*, but instead to 1) prevent programmatic determination of whether an item in search results has already been seen 2) cause multiple seemingly random URLs represent the same item ID. These factors mean a spidering tool can not identify if the content at a URL has already been spidered without requesting the content (which will then be addressed by a daily quota). My concern is that the multiple representations of the same content with different padding in the block might seriously weaken something. – shannon Jul 29 '12 at 05:06
  • If I use any constant value (such as MAC address) as the only padding, the abuser with the spidering application can just inspect the URLs returned in the search results to determine if they've already been seen. – shannon Jul 29 '12 at 05:10
  • I believe it would only weaken the encryption if they could easily determine that they were actually the same. In this case, you could probably just return a new GUID each time and map each to a real ID in your table; no need for encryption at all. You're the only one who can map one to the other then. If you purge GUIDs every so often, you defeat spidering completely. – SilverbackNet Jul 29 '12 at 05:17
  • I agree, a GUID would be a great answer, but I'd also like to be able to send e-mails, maybe even bookmark the URL, thus my desire for a persistent ID. A user could intentionally find a search that only returned one result, and run the search repeatedly to collect encrypted variants of the ID. Or, maybe there's a much more severe weakness from returning the same 1000 results 100 times. I think the term is called http://en.wikipedia.org/wiki/Differential_cryptanalysis , and it looks like none of the currently popular algorithms have attacks that are accessible to a lay-person? – shannon Jul 29 '12 at 05:25
  • I was shooting to use DES, although it is aging, because I have out-of-the-box access to it in .NET and I can encrypt small block sizes (64 bits). I haven't found any articles that suggest that it is especially vulnerable to differential cryptanalysis, so I'll probably just proceed with it. I'll accept your answer based on your comment about plaintexts, which helped me on the Google path to Differential Cryptanalysis. Thanks! – shannon Jul 29 '12 at 05:42
  • DES is actually exceptionally strong against differential attacks, among older/simpler ciphers, thanks to the NSA. You should be in good shape. DES is a reasonable choice for generating relatively few ciphertexts, especially given the tiny text size, and EDE 3DES is currently accepted as a strong cipher. – SilverbackNet Jul 29 '12 at 06:22