Security: longer keys versus more available characters

Question

I apologize if this has been answered before, but I was not able to find anything. This question was inspired by a comment on another security-related question here on SO:

How to generate a random, long salt for use in hashing?

The specific comment is as follows (sixth comment of accepted answer):

...Second, and more importantly, this will only return hexadecimal characters - i.e. 0-9 and A-F. It will never return a letter higher than an F. You're reducing your output to just 16 possible characters when there could be - and almost certainly are - many other valid characters.

– AgentConundrum Oct 14 '12 at 17:19

This got me thinking. Say I had some arbitrary series of bytes, with each byte being randomly distributed over 2^(8). Let this key be A. Now suppose I transformed A into its hexadecimal string representation, key B (ex. 0xde 0xad 0xbe 0xef => "d e a d b e e f").

Some things are readily apparent:

len(B) = 2 len(A)
The symbols in B are limited to 2^(4) discrete values while the symbols in A range over 2^(8)
A and B represent the same 'quantities', just using different encoding.

My suspicion is that, in this example, the two keys will end up being equally as secure (otherwise every password cracking tool would just convert one representation to another for quicker attacks). External to this contrived example, however, I suspect there is an important security moral to take away from this; especially when selecting a source of randomness.

So, in short, which is more desirable from a security stand point: longer keys or keys whose values cover more discrete symbols?

I am really interested in the theory behind this, so an extra bonus gold star (or at least my undying admiration) to anyone who can also provide the math / proof behind their conclusion.

please note that 2^8 is a single byte-strength key. NOT 8-byte, as I think you go on to assume. I shall continue reading and attempt to provide an answer — Russell Uhl, Jun 18 '13 at 21:36
Indeed. What I had meant to convey is that each discrete byte of the arbitrary-length key covers 2^(8). The example 8-byte key above would actually exist in 2^(64). My apologies for the ambiguity. — MysteryMoose, Jun 18 '13 at 21:47
I gotcha. I'm working on an answer...it's rather long. I JUST finished taking a crypto class, so I actually know about this stuff now! — Russell Uhl, Jun 18 '13 at 21:52
Sweet action, thanks. I edited the question to make it less ambiguous. — MysteryMoose, Jun 18 '13 at 21:53

score 2 · Answer 1 · answered Jun 18 '13 at 21:36

If the number of different symbols usable in your password is x, and the length is y, then the number of different possible passwords (and therefore the strength against brute-force attacks) is x ** y. So you want to maximize x ** y. Both adding to x or adding to y will do that, Which one makes the greater total depends on the actual numbers involved and what your practical limits are.

But generally, increasing x gives only polynomial growth while adding to y gives exponential growth. So in the long run, length wins.

+1 for the growth models, which were really helpful. Thanks for the info! — MysteryMoose, Jun 18 '13 at 23:23

score 1 · Accepted Answer · answered Jun 18 '13 at 22:04

Let's start with a binary string of length 8. The possible combinations are all permutations from 00000000 and 11111111. This gives us a keyspace of 2^8, or 256 possible keys. Now let's look at option A:

A: Adding one additional bit. We now have a 9-bit string, so the possible values are between 000000000 and 111111111, which gives us a keyspace size of 2^9, or 512 keys. We also have option B, however.

B: Adding an additional value to the keyspace (NOT the keyspace size!): Now let's pretend we have a trinary system, where the accepted numbers are 0, 1, and 2. Still assuming a string of length 8, we have 3^8, or 6561 keys...clearly much higher.

However! Trinary does not exist!

Let's look at your example. Please be aware I will be clarifying some of it, which you may have been confused about. Begin with a 4-BYTE (or 32-bit) bitstring: 11011110 10101101 10111110 11101111 (this is, btw, the bitstring equivalent to 0xDEADBEEF)

Since our possible values for each digit are 0 or 1, the base of our exponent is 2. Since there are 32 bits, we have 2^32 as the strength of this key. Now let's look at your second key, DEADBEEF. Each "digit" can be a value from 0-9, or A-F. This gives us 16 values. We have 8 "digits", so our exponent is 16^8...which also equals 2^32! So those keys are equal in strength (also, because they are the same thing).

But we're talking about REAL passwords, not just those silly little binary things. Consider an alphabetical password with only lowercase letters of length 8: we have 26 possible characters, and 8 of them, so the strength is 26^8, or 208.8 billion (takes about a minute to brute force). Adding one character to the length yields 26^9, or 5.4 trillion combinations: 20 minutes or so. Let's go back to our 8-char string, but add a character: the space character. now we have 27^8, which is 282 billion....FAR LESS than adding an additional character!

The proper solution, of course, is to do both: for instance, 27^9 is 7.6 trillion combinations, or about half an hour of cracking. An 8-character password using upper case, lower case, numbers, special symbols, and the space character would take around 20 days to crack....still not nearly strong enough. Add another character, and it's 5 years.

As a reference, I usually make my passwords upwards of 16 characters, and they have at least one Cap, one space, one number, and one special character. Such a password at 16 characters would take several (hundred) trillion years to brute force.

Interesting. I had expected that 0xDEADBEEF and "DEADBEEF" would be equal, but I had expected adding a character to the search space would win out as the character is 'available' to each index in the key. Also didn't know the bit about trinary. Cool write up! — MysteryMoose, Jun 18 '13 at 23:21
@phobos51594: just want to make the clarification. 0xDEADBEEF is hex code, with a keyspace of 16 per character. The word "DEADBEEF", however, has a keyspace of 26 per character (A-Z). Also, trinary doesn't exist (nor is it a word :P). I merely used it as an example to demonstrate the differences in adding complexity. — Russell Uhl, Jun 18 '13 at 23:53

Security: longer keys versus more available characters

2 Answers2