6

This question is about how to create an SHA-1 hash from an array of data in C using the OpenSSL library.

It returns an array of 20 bytes, containing the hash. Is there some standard way of representing that data in string form, not binary?

If so, is there a function in OpenSSL itself to convert to said string format?

If not, how should it be done? Of course, I can dream up my own encoding, use base64 or what not, but is there some canonical format?

Prof. Falken

4 Answers

15

Hashes are usually represented as a sequence of hexadecimal digits (two per byte). You can easily produce such a string using an ostringstream with the right manipulators:

#include <string>
#include <sstream>
#include <iomanip>

std::string GetHexRepresentation(const unsigned char *Bytes, size_t Length) {
    std::ostringstream os;
    os.fill('0');
    os<<std::hex;
    for(const unsigned char *ptr = Bytes; ptr < Bytes+Length; ++ptr) {
        os<<std::setw(2)<<(unsigned int)*ptr;
    }
    return os.str();
}

Arguably, this can also be done more efficiently (and, to my eyes today, more clearly) "by hand":

#include <string>

std::string GetHexRepresentation(const unsigned char *Bytes, size_t Length) {
    std::string ret(Length*2, '\0');
    const char *digits = "0123456789abcdef";
    for(size_t i = 0; i < Length; ++i) {
        ret[i*2]   = digits[(Bytes[i]>>4) & 0xf];
        ret[i*2+1] = digits[ Bytes[i]     & 0xf];
    }
    return ret;
}

or with good old sprintf, probably the easiest-to-read method of all:

#include <stdio.h>
#include <string>

std::string GetHexRepresentation(const unsigned char *Bytes, size_t Length) {
    std::string ret;
    ret.reserve(Length * 2);
    for(const unsigned char *ptr = Bytes; ptr < Bytes+Length; ++ptr) {
        char buf[3];
        sprintf(buf, "%02x", (*ptr)&0xff);
        ret += buf;
    }
    return ret;
}
Matteo Italia
  • 1
    Thanks, and even a pretty C++ example for the C++ programmers out there. Nice work! – Prof. Falken Oct 19 '10 at 14:07
  • 1
    In the crypto context, base64 is a far more common textual representation for binary data (such as hashes) than hexadecimal. – Chris Becke Oct 22 '10 at 12:03
  • 2
    For general and long binary data yes, but I've seen short data and hashes (which are usually short enough) almost always expressed in hexadecimal. – Matteo Italia Oct 22 '10 at 12:10
3

The standard way of representing hashes is as hexadecimal strings.
In C, you can use printf("%02x", byte) to get a hex representation of each byte.

Here is an example for MD5, which should be easy to adapt for SHA:

http://en.literateprograms.org/MD5_sum_(C,_OpenSSL)
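The "%02x" approach described above can be sketched in plain C roughly as follows. This is only a minimal sketch: bytes_to_hex is a hypothetical helper name (not an OpenSSL function), and the caller is assumed to supply a buffer of at least 2*len + 1 chars.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of OpenSSL): writes len bytes as lowercase
 * hex digits into out and NUL-terminates the result.
 * out_size must be at least 2*len + 1.
 * Returns 0 on success, -1 if the buffer is too small. */
int bytes_to_hex(const unsigned char *bytes, size_t len,
                 char *out, size_t out_size)
{
    size_t i;

    if (out_size < len * 2 + 1)
        return -1;

    for (i = 0; i < len; ++i) {
        /* Two hex digits per byte; snprintf also writes a NUL that the
         * next iteration overwrites. */
        snprintf(out + i * 2, 3, "%02x", (unsigned int)bytes[i]);
    }
    out[len * 2] = '\0';
    return 0;
}
```

For a 20-byte SHA-1 digest you would pass a char[41] buffer; the digest of "abc", for instance, should come out as a9993e364706816aba3e25717850c26c9cd0d89d.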

Arc
2

Here is an example for C:

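#include <stdio.h>

// Converts a 20-byte SHA-1 digest to a 40-character uppercase hex string.
// hashstr must have room for at least 41 chars (40 digits plus the terminating NUL).
void convertSHA1BinaryToCharStr(const unsigned char * const hashbin, char * const hashstr) {
  for(int i = 0; i<20; ++i)
  {
    sprintf(&hashstr[i*2], "%02X", hashbin[i]);
  }
  hashstr[40]=0;
}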

//Example call.  hashbin is the 20byte hash array.
char hashstr[41];
convertSHA1BinaryToCharStr(hashbin, hashstr);
printf("%s\n", hashstr);
James
0

Privacy Enhanced Mail (PEM) seems to have set the standard for storing text representations of crypto data. PEM stores the actual binary data in Base64, with a text header and footer around it.
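If you do want Base64 rather than hex, the encoding step alone might look something like the sketch below. It covers only the Base64 part (standard alphabet with '=' padding), not PEM's header and footer lines, and bytes_to_base64 is a made-up helper name for illustration, not an OpenSSL function.

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenSSL): encodes len bytes as standard
 * Base64 with '=' padding and NUL-terminates the result.
 * out_size must be at least 4 * ceil(len / 3) + 1.
 * Returns 0 on success, -1 if the buffer is too small. */
int bytes_to_base64(const unsigned char *in, size_t len,
                    char *out, size_t out_size)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t needed = 4 * ((len + 2) / 3) + 1;
    size_t i = 0, o = 0;
    unsigned long v;

    if (out_size < needed)
        return -1;

    /* Full 3-byte groups become 4 output characters each. */
    while (i + 2 < len) {
        v = ((unsigned long)in[i] << 16) |
            ((unsigned long)in[i + 1] << 8) |
            (unsigned long)in[i + 2];
        out[o++] = tbl[(v >> 18) & 0x3f];
        out[o++] = tbl[(v >> 12) & 0x3f];
        out[o++] = tbl[(v >> 6) & 0x3f];
        out[o++] = tbl[v & 0x3f];
        i += 3;
    }

    /* One or two leftover bytes are zero-extended and padded with '='. */
    if (len - i == 1) {
        v = (unsigned long)in[i] << 16;
        out[o++] = tbl[(v >> 18) & 0x3f];
        out[o++] = tbl[(v >> 12) & 0x3f];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {
        v = ((unsigned long)in[i] << 16) | ((unsigned long)in[i + 1] << 8);
        out[o++] = tbl[(v >> 18) & 0x3f];
        out[o++] = tbl[(v >> 12) & 0x3f];
        out[o++] = tbl[(v >> 6) & 0x3f];
        out[o++] = '=';
    }

    out[o] = '\0';
    return 0;
}
```

A 20-byte SHA-1 digest encodes to 28 characters this way (27 significant ones plus one '='), compared with 40 characters in hex, so a char[29] buffer is enough.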

Chris Becke
  • 1
    PEM defines standard formats for storing keys and certificates in files, and within a certificate a hash will indeed be base64-encoded, but a hash on its own is usually just displayed in hex. If the hash is a key fingerprint, it is commonly displayed as pairs of hex digits separated by colons (see e.g. RFC 4716). – dajames Oct 30 '10 at 18:09
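The colon-separated fingerprint style mentioned in the comment above could be produced along these lines. Again this is just a sketch under stated assumptions: bytes_to_colon_hex is a hypothetical name, and the lowercase digits are a choice made here for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of OpenSSL): writes len bytes as lowercase
 * hex pairs separated by colons, e.g. "de:ad:00:0f".
 * For len > 0, out_size must be at least 3*len
 * (2 digits per byte, len - 1 colons, 1 terminating NUL).
 * Returns 0 on success, -1 if the buffer is too small. */
int bytes_to_colon_hex(const unsigned char *bytes, size_t len,
                       char *out, size_t out_size)
{
    size_t i;

    if (len == 0) {
        if (out_size < 1)
            return -1;
        out[0] = '\0';
        return 0;
    }
    if (out_size < len * 3)
        return -1;

    for (i = 0; i < len; ++i) {
        snprintf(out + i * 3, 3, "%02x", (unsigned int)bytes[i]);
        /* Replace snprintf's NUL with a colon, except after the last pair. */
        out[i * 3 + 2] = (i + 1 < len) ? ':' : '\0';
    }
    return 0;
}
```

A 20-byte SHA-1 digest needs a char[60] buffer in this format (59 visible characters plus the NUL).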