What does it mean to hash the bytes of time_t?

Question

In my previous question I questioned the portability of srand(time(NULL)). This document entitled Using rand() provides a "way to use the result of time() portably as a seed for rand();". However, I don't understand what "just hash the bytes of a time_t" means nor what the code does.

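#include <limits.h>   /* UCHAR_MAX */
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* srand */
#include <time.h>     /* time, time_t */

unsigned time_seed(void)
{
   time_t now = time ( 0 );
   unsigned char *p = (unsigned char *)&now;   /* view the time_t as raw bytes */
   unsigned seed = 0;
   size_t i;

   /* fold each byte of the time_t into the seed */
   for ( i = 0; i < sizeof now; i++ )
     seed = seed * ( UCHAR_MAX + 2U ) + p[i];

   return seed;
}

/* call once at program start, e.g. at the top of main() */
srand ( time_seed() );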

Can someone provide an explanation?

Warning: It's not at all clear that this operation is valid. Unless a type is uniquely represented, hashing its representation is not useful. — Kerrek SB, Oct 05 '14 at 21:43

b4hand · Answer 1 · 2014-10-05T22:02:29.000

The point of hashing the result of time is to avoid a predictable seed value. It would be used in a situation where you might be alright with an insecure pseudo-random number generator like rand, but still not want clients to be able to predictably determine your pseudo-random sequence.

There are lots of ways to hash the time_t. One very simple, naive approach is to XOR it with the pid of the running process. This is a classic approach that Unix systems have used for a long time, although it doesn't add much in the way of security. You could also include other forms of real entropy from the system. Other alternatives would involve a proper hash function or some combination of other data points and hashes. Examples of hash functions include non-cryptographic ones such as Bernstein's hash and the Fowler–Noll–Vo hash, and cryptographic ones like MD5 or SHA-1. However, if you're going to use a cryptographic hash function, you should probably be using a cryptographically secure random number generator as well.
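As a rough illustration of the "hash plus pid" idea above, here is a minimal sketch that runs 32-bit FNV-1a over the bytes of a time_t and XORs in the process id. The names fnv1a_32 and seed_from_time_and_pid are hypothetical, and getpid() assumes a POSIX system. This is still not a secure seed, only a less obvious one.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>   /* getpid(): POSIX only, an assumption of this sketch */

/* 32-bit FNV-1a over an arbitrary byte buffer. */
static uint32_t fnv1a_32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;          /* FNV-1a 32-bit offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];                     /* mix in the next byte */
        h *= 16777619u;                /* multiply by the 32-bit FNV prime */
    }
    return h;
}

/* Hypothetical helper: combines the current time and the pid into one seed. */
static unsigned seed_from_time_and_pid(void)
{
    time_t now = time(NULL);
    uint32_t h = fnv1a_32(&now, sizeof now);

    return (unsigned)(h ^ (uint32_t)getpid());
}
```

It would be used by calling srand(seed_from_time_and_pid()) once at program start.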

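For what it is worth, since this is tagged with C++, you can use the built-in std::hash function provided in the standard library as long as you are using a compiler that supports C++11 (TR1 implementations provide it as std::tr1::hash). In GCC, the hash for byte sequences has been implemented using the FNV hash mentioned above, though this is an implementation detail that varies between library versions and between the types being hashed.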

If you have real entropy available there's no point using the `time`, just use the real entropy. Rolling your own crypto is a Bad Idea (TM). — M.M, Oct 05 '14 at 23:01
@MattMcNabb I fully agree. I definitely wasn't recommending the entropy seeded `rand` approach, since `rand` is a *very poor pseudo-random number generator*. I just felt compelled to point out that I've seen it done before, and that other people might find it useful. — b4hand, Oct 06 '14 at 03:16

score 0 · Answer 2 · answered Oct 05 '14 at 21:33

Hashing time_t means getting a nearly unique number, a "fingerprint", by applying a mathematical calculation (a hash function) to the bytes of the object that holds the current system time.

This way, each run of your program that starts at a different time will get a different value as a seed for srand.

Since the seed will be different, your sequence of random numbers from rand will also be different.

Gonen I

But the seed will *already* be different if the returned time is different. `rand` with a seed `x` is different from a seed `x+1`. – Jongware Oct 05 '14 at 23:54
True. The claim is that the time_t values are not portable since they are not defined by the ISO C standard, and that hashing enables getting a unique int from it. – Gonen I Oct 06 '14 at 04:26

score 0 · Answer 3 · edited May 23 '17 at 12:14

The article you link to argues that there is no guarantee that a time_t can be meaningfully cast to unsigned int: time_t is implementation-defined, so the cast is not strictly portable, yet srand needs an unsigned int as its seed. The article says:

The issue is that time_t is a restricted type, and may not be meaningfully converted to unsigned int.

The solution proposed is to apply a hash function to the bytes that make up the time_t value, so that no conversion from time_t is needed. A hash function maps an input sequence of bits to a value, usually in a specific range; in this case you want to map all the bytes that make up the time_t (sizeof(time_t) of them) to an unsigned int. Good hash functions have an avalanche effect, which means a small change in the input (the time changes by one second) has a significant effect on the output (on average, about half of the output bits flip).
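One rough way to check the avalanche property for any candidate hash is to hash two inputs one second apart and count how many output bits differ. The helper below is a hypothetical sketch of that bit count only; it makes no claim about how well any particular hash scores.

```c
/* Counts how many bit positions differ between two hash outputs. */
static unsigned bits_changed(unsigned a, unsigned b)
{
    unsigned diff = a ^ b;    /* a 1 bit marks each position that differs */
    unsigned count = 0;

    while (diff != 0) {
        count += diff & 1u;   /* add the lowest bit */
        diff >>= 1;           /* move on to the next bit */
    }
    return count;
}
```

For a hash with good avalanche behavior, the count for nearby inputs should hover around half the width of unsigned.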

As to what the code does, it implements a hash function that iterates through each byte of the time_t: at every step it multiplies the current seed by the constant UCHAR_MAX + 2U and adds the next byte, with the unsigned arithmetic wrapping modulo UINT_MAX + 1. Each step has the same multiply-and-add form as a linear congruential generator, with the byte taking the place of the increment. Notice that p is a pointer to unsigned char that is used as the base of an array of bytes aliasing the storage for the time_t. Normally you shouldn't access an object through a pointer to a different type in C++ because of the strict aliasing rule, but there is an exception to the rule that allows you to inspect any object as an array of char or unsigned char.
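To make the arithmetic concrete, the same loop can be pulled out into a function that takes a fixed byte array instead of the current time, so the result is reproducible. The name hash_bytes is hypothetical; the loop body is the one from the question.

```c
#include <limits.h>
#include <stddef.h>

/* Same multiply-and-add loop as time_seed(), over caller-supplied bytes. */
static unsigned hash_bytes(const unsigned char *p, size_t n)
{
    unsigned seed = 0;
    size_t i;

    /* Evaluates the bytes as digits of a number in base UCHAR_MAX + 2,
       with unsigned wraparound if the value grows too large. */
    for (i = 0; i < n; i++)
        seed = seed * (UCHAR_MAX + 2U) + p[i];

    return seed;
}
```

For the bytes {1, 2, 3}, assuming 8-bit bytes (so the base is 257) and an unsigned of at least 32 bits, this computes (1 * 257 + 2) * 257 + 3 = 66566. The first byte ends up multiplied by the highest power of the base, so every byte of the time_t contributes to the final seed.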

I'm pretty certain the article is wrong with regards to `time_t`. `time_t` is guaranteed to be an alias of one of the primitive integral types as stated here: http://www.cplusplus.com/reference/ctime/time_t/ . As such, it can always be converted (possibly with some loss of information) to an `unsigned int`. — b4hand, Oct 05 '14 at 22:06
@b4hand I agree the article is being overly conservative... though the link you reference does say "Portable programs should not use values of this type directly, but always rely on calls to elements of the standard library to translate them to portable types." — amdn, Oct 05 '14 at 22:43
If you care about the *behavior* of the conversion across platforms or the result to be the same, then it is not portable, but just relying on the conversion to exist is perfectly portable and thus makes the code itself valid. — b4hand, Oct 06 '14 at 02:28
