For a project I am working on, I would like to add a progress indicator to the OpenSSL SHA256 hashing function. I will be working with large files (1-10+ GB) and would like to see how much of the file has been processed versus how much is left. I have followed the Stack Overflow questions here and here, and the SHA256 hash generation works correctly (it only lacks a progress indicator).

I am fairly new to C, so this will be a learning experience for me. I thought I might be able to use fread or SHA256_Update somehow, but I am having trouble understanding how to find out how much of the file has been read so far.


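#include <stdio.h>
#include <stdlib.h>
#include <openssl/sha.h>

/* Defined further down in this question. */
void sha256_hash_string(unsigned char hash[SHA256_DIGEST_LENGTH], char outputBuffer[65]);

char _genSHA256Hash(char *path, char outputBuffer[65])
{
  FILE *file = fopen(path, "rb");
  if(!file) return -1;

  unsigned char hash[SHA256_DIGEST_LENGTH];
  SHA256_CTX sha256;
  SHA256_Init(&sha256);
  const int bufSize = 32768;
  char *buffer = malloc(bufSize);
  size_t bytesRead = 0; /* fread returns size_t */
  if(!buffer)
  {
      fclose(file); /* do not leak the file handle if the allocation fails */
      return -1;
  }
  while((bytesRead = fread(buffer, 1, bufSize, file)))
  {
      SHA256_Update(&sha256, buffer, bytesRead);
  }
  SHA256_Final(hash, &sha256);

  sha256_hash_string(hash, outputBuffer);

  /* Release the file handle and the read buffer. */
  fclose(file);
  free(buffer);

  return 0;
}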

EDIT: I have added the sha256_hash_string function:

void sha256_hash_string (unsigned char hash[SHA256_DIGEST_LENGTH], char outputBuffer[65]) {
    int i = 0;

    for(i = 0; i < SHA256_DIGEST_LENGTH; i++)
    {
        sprintf(outputBuffer + (i * 2), "%02x", hash[i]);
    }

    outputBuffer[64] = 0;
}

Thank you,

Tim

cyberboxster
  • You'll need a `stat` or some such to tell you how big the file is, then do the math for delta yourself while the loop iterates. – WhozCraig Jan 03 '15 at 03:32

1 Answer

You already have information about how much of the file has been read, you just need to account for it:

size_t bytesRead = 0;
unsigned long long totRead = 0; /* a 32-bit int would overflow on files larger than 2 GB */
while((bytesRead = fread(buffer, 1, bufSize, file))) {
    totRead += bytesRead;
    ...

All you need now is the total size of the file. You can get this with:

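/* requires #include <sys/stat.h> */
struct stat sb;
if (stat(path, &sb) == -1) {
    perror("stat");
    return -1; /* the function already uses -1 for errors and 0 for success */
}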

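totRead keeps a running total, whereas bytesRead is overwritten each time through the loop. Then, in your loop, 100.0 * totRead / sb.st_size gives the percentage of your hashing progress. The floating-point constant matters: with integer operands, totRead / sb.st_size truncates to 0 until the last chunk has been read.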
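Putting the pieces together, here is a minimal sketch of the read loop with a progress line. The helper names percent_done and read_with_progress are hypothetical, not part of OpenSSL or of the question's code. The SHA256_Update call is left as a comment so the sketch compiles without OpenSSL, and printing to stderr with a carriage return is just one possible way to display progress.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical helper: percentage (0..100) of `total` covered by `done`. */
static double percent_done(unsigned long long done, unsigned long long total)
{
    if (total == 0) return 100.0; /* empty file: nothing left to read */
    return 100.0 * (double)done / (double)total;
}

/* Hypothetical helper: reads `path` in chunks and prints progress to stderr.
   Returns the number of bytes read, or -1 on error. */
static long long read_with_progress(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) == -1) {
        perror("stat");
        return -1;
    }
    unsigned long long total = (unsigned long long)sb.st_size;

    FILE *file = fopen(path, "rb");
    if (!file) {
        perror("fopen");
        return -1;
    }

    enum { BUF_SIZE = 32768 };
    unsigned char *buffer = malloc(BUF_SIZE);
    if (!buffer) {
        fclose(file);
        return -1;
    }

    unsigned long long totRead = 0; /* running total across all chunks */
    size_t bytesRead;               /* size of the current chunk only */
    while ((bytesRead = fread(buffer, 1, BUF_SIZE, file)) > 0) {
        /* In the real function, hash the chunk here:
           SHA256_Update(&sha256, buffer, bytesRead); */
        totRead += bytesRead;
        fprintf(stderr, "\r%5.1f%% (%llu of %llu bytes)",
                percent_done(totRead, total), totRead, total);
    }
    fputc('\n', stderr);

    int failed = ferror(file); /* distinguish a read error from end of file */
    free(buffer);
    fclose(file);
    return failed ? -1 : (long long)totRead;
}
```

Calling read_with_progress("bigfile.bin") (the file name is only an example) rewrites a single progress line on stderr as the file is consumed. With a 32 KB buffer and a multi-gigabyte file this prints a very large number of updates, so you may prefer to print only when the whole-number percentage changes.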

dho
  • It returns char because I call the function like so: "char buffer[65]; _genSHA256Hash("file", buffer); printf("%s\n", buffer);" I guess I could have made the function void though and just print the value inside the function itself and not return anything. – cyberboxster Jan 03 '15 at 20:43
  • The function modifies the data inside the array in the buffer passed to it, which is not the same as the return type. It's idiomatic to return `int` in these cases -- `char` is weird because it's basically guaranteed that nobody is going to call this function and interpret its result as a character. – dho Jan 03 '15 at 20:57
  • I see what you are saying. I think I got momentarily confused. I will change the _genSHA256Hash function to return void or int. – cyberboxster Jan 03 '15 at 21:01
  • It appears that when I try to print the size of the file read versus left, bytesRead is always at value 32768. I added the following statement to the while loop: 'printf("percent done: %d of %zu\n", bytesRead, sb.st_size);' I can upload a screenshot somewhere for you to look at, but it looks like bytesRead is always the integer 32768 and then it suddenly stops and breaks out of the loop. Thanks, Tim – cyberboxster Jan 03 '15 at 21:23
  • Here is a photo: http://pbrd.co/1tIrH2J and here is a pastebin: http://pastebin.com/qiNz6uNh – cyberboxster Jan 03 '15 at 21:39
  • Sorry, I'm an idiot. bytesRead is going to be your fread buffer size each time, which is bufSize, which is 32768. You will need to keep a running counter. I'll update my response. – dho Jan 03 '15 at 22:11