
I am working on a project that requires distributing files across different servers. For the distribution scheme I chose the SHA1 algorithm, taking the last 64 bits (out of the 160-bit hash) to identify the file.

I am not sure whether it is my fault, but I cannot get a stable integer value from the last 64 bits of the hash.

What I tried is this:

   char *plaintext = "file";
   size_t len = strlen(plaintext);
   char hash[41];

   /*hash contains the hash of the file as char* */
   plaintext_to_sha1(hash, plaintext, len);

   /*get last 64 bits of the hash*/
   uint64_t  value = (uint64_t)(hash + (24 * sizeof(char)));
   printk(LOG_LEVEL "value: %llu\n", value);

The content of `value` is sometimes different between runs, and I do not understand what I am doing wrong. I am trying to take the last 64 bits by casting to an integer the hash offset by 24 bytes.

Any suggestions are appreciated.

Ion Morozan
  • You are setting `value` to `hash + 24`, i.e., a memory address. – ntoskrnl May 23 '14 at 19:49
  • I tried also using memcpy like 'memcpy(last, hash+24, 16)' still no correct value. 'last' is a 'char[16]' – Ion Morozan May 23 '14 at 19:58
  • First of all, normally the *first* bytes are taken from a hash value if a limited amount of hash bits are required. This is not insecure or anything, it's just going against convention. Second, be aware that SHA-1 is more or less considered to have a minimum hash length. Because of the birthday problem, you may get a clash if you store a lot (billions) of files. – Maarten Bodewes May 24 '14 at 00:04

2 Answers


Use the following:

memcpy(&value,hash+12,8);
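
Note that the offset of 12 assumes `hash` holds the raw 20-byte digest rather than a 40-character hex string. Also, a plain memcpy produces a value that depends on the machine's byte order, so it could differ between servers of different endianness. A minimal sketch that assembles the bytes explicitly instead (the helper name `last64_from_digest` is hypothetical, not something from the question):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: build a uint64_t from the last 8 bytes
 * (offsets 12..19) of a raw 20-byte SHA-1 digest, treating
 * digest[12] as the most significant byte. The result does not
 * depend on the host's endianness. */
static uint64_t last64_from_digest(const unsigned char *digest)
{
    uint64_t value = 0;
    size_t i;

    assert(digest != NULL);

    for (i = 12; i < 20; i++)
        value = (value << 8) | digest[i];

    return value;
}
```

Calling `last64_from_digest(digest)` on the same 20 bytes should give the same number on every host, which is what a distribution scheme needs.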
Arshan
  • Thank you for your answer! Why not `memcpy(&value, hash+24, 16);` ? – Ion Morozan May 23 '14 at 20:13
  • Because the hash is 20 bytes in total, isn't it? So I am copying the 8 bytes (64 bits) that follow the first 12 bytes (96 bits) of the hash. – Arshan May 23 '14 at 20:23
  • `hash` is a 40 bytes char array which stores `hash("file")` – Ion Morozan May 23 '14 at 20:26
  • According to http://tools.ietf.org/html/rfc3174, I was assuming a hash size of 160 bits (20 bytes). – Arshan May 23 '14 at 20:33
  • You are right that the size is 20 bytes, but as I see in this example (http://stackoverflow.com/questions/16861332/how-to-compute-sha1-of-an-array-in-linux-kernel) the result is held in a 40-byte array of chars. – Ion Morozan May 23 '14 at 20:38
  • Hmmm, I think that the hash is 20 bytes, but after converting it to a hex char string using sprintf(), it is stored in a 40-byte array... So if you want to copy the last 64 bits of the hash from the hash string, then memcpy(last,hash+24,16) should work. – Arshan May 23 '14 at 20:58

I think memcpy does not give the correct answer here: since the hash is stored as a hex string, memcpy copies the ASCII characters rather than the number they encode. I solved the same problem with strtoull, which converts a string to a 64-bit unsigned integer. Make sure your compiler supports it first.

In your case, you can use it like this:

unsigned long long int value = 0;
value = strtoull(hex_hash + 24, NULL, 16);

where hex_hash is the 40-character hex representation of the hash. Printing with %llu gives the correct value, whereas memcpy returned a different one.

You can check the correctness here.
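
For completeness, here is a minimal user-space sketch of the same idea wrapped in a function. The name `last64_from_hex` is hypothetical, and it assumes the string is exactly 40 hex characters and NUL-terminated. In kernel code (the question uses printk) strtoull is, as far as I know, not available; kstrtoull(const char *s, unsigned int base, unsigned long long *res) would be the closest equivalent there, but I have not tested that variant.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse the last 16 hex characters of a
 * 40-character, NUL-terminated SHA-1 hex string into a uint64_t. */
static uint64_t last64_from_hex(const char *hex_hash)
{
    /* Assumption: the caller passes a full 40-character hex digest. */
    assert(hex_hash != NULL && strlen(hex_hash) == 40);

    /* hex_hash + 24 points at the final 16 hex digits (64 bits);
     * parsing stops at the terminating NUL. */
    return (uint64_t)strtoull(hex_hash + 24, NULL, 16);
}
```

For example, with the well-known SHA-1 of "abc", a9993e364706816aba3e25717850c26c9cd0d89d, the function parses the trailing digits 7850c26c9cd0d89d. Because the digits are interpreted as a number, the result is the same regardless of the host's byte order.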

andreea.sandu