
I'm trying to write a hash function that accepts any data type. Inside the function I handle the data as a byte slice, but I can't figure out how to convert an arbitrary interface{} to a byte slice.

I tried the encoding/binary package, but it seems to depend on the type of data passed in: binary.Write() requires the caller to specify the byte order of the value being written.

All datatype sizes are some multiple of a byte (even the bool), so this should be simple in theory.

The code in question is below:

package bloom

import (
    "bytes"
    "encoding/gob"
)

// adapted from http://bretmulvey.com/hash/7.html
func ComputeHash(key interface{}) (uint, error) {
    var buf bytes.Buffer
    enc := gob.NewEncoder(&buf)
    err := enc.Encode(key)
    if err != nil {
        return 0, err
    }
    data := buf.Bytes()

    var a, b, c uint
    a, b = 0x9e3779b9, 0x9e3779b9
    c = 0
    i := 0

    for i = 0; i < len(data)-12; {
        a += uint(data[i+1] | data[i+2] << 8 | data[i+3] << 16 | data[i+4] << 24)
        i += 4
        b += uint(data[i+1] | data[i+2] << 8 | data[i+3] << 16 | data[i+4] << 24)
        i += 4
        c += uint(data[i+1] | data[i+2] << 8 | data[i+3] << 16 | data[i+4] << 24)

        a, b, c = mix(a, b, c)
    }

    c += uint(len(data))

    if i < len(data) {
        a += uint(data[i])
        i++
    }
    if i < len(data) {
        a += uint(data[i] << 8)
        i++
    }
    if i < len(data) {
        a += uint(data[i] << 16)
        i++
    }
    if i < len(data) {
        a += uint(data[i] << 24)
        i++
    }


    if i < len(data) {
        b += uint(data[i])
        i++
    }
    if i < len(data) {
        b += uint(data[i] << 8)
        i++
    }
    if i < len(data) {
        b += uint(data[i] << 16)
        i++
    }
    if i < len(data) {
        b += uint(data[i] << 24)
        i++
    }

    if i < len(data) {
        c += uint(data[i] << 8)
        i++
    }
    if i < len(data) {
        c += uint(data[i] << 16)
        i++
    }
    if i < len(data) {
        c += uint(data[i] << 24)
        i++
    }

    a, b, c = mix(a, b, c)
    return c, nil
}

func mix(a, b, c uint) (uint, uint, uint){
    a -= b; a -= c; a ^= (c>>13);
    b -= c; b -= a; b ^= (a<<8);
    c -= a; c -= b; c ^= (b>>13);
    a -= b; a -= c; a ^= (c>>12);
    b -= c; b -= a; b ^= (a<<16);
    c -= a; c -= b; c ^= (b>>5);
    a -= b; a -= c; a ^= (c>>3);
    b -= c; b -= a; b ^= (a<<10);
    c -= a; c -= b; c ^= (b>>15);

    return a, b, c
}
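
A side note on the shifts in the code above, separate from the conversion question: data[i] is a byte, so an expression like data[i] << 8 is evaluated in 8 bits and yields 0 before the uint conversion happens. The sketch below shows the difference; narrowShift and wideShift are names invented for this illustration.

```go
package main

import "fmt"

// narrowShift shifts first and converts afterwards, so the shift is
// performed on an 8-bit value and every bit is pushed out.
func narrowShift(b byte) uint {
	return uint(b << 8)
}

// wideShift converts to uint first, so the shifted bits are kept.
func wideShift(b byte) uint {
	return uint(b) << 8
}

func main() {
	fmt.Println(narrowShift(0xAB)) // 0
	fmt.Println(wideShift(0xAB))   // 43776
}
```

Converting each byte before shifting, as in uint(data[i]) << 8, keeps the higher bytes in the sum.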
Nate Brennand
  • How about pkg "encoding/gob"? Can you use it? – nvcnvn Apr 11 '14 at 05:10
  • @nvcnvn, seems to be working. I tried it earlier, but now I realize there's a weakness in the hash on small values (0-62 are identical?). I changed the range I was working with and it now seems to work. Thanks! – Nate Brennand Apr 11 '14 at 05:26
  • Fixed the errors in the hash fn; the updated code can be found here: https://gist.github.com/natebrennand/10442587 – Nate Brennand Apr 11 '14 at 05:39

2 Answers


Other problems in my code led me away from the gob package earlier, but it turns out it was the right approach, as @nvcnvn suggested. The relevant code is below:

package bloom

import (
    "bytes"
    "encoding/gob"
)

func GetBytes(key interface{}) ([]byte, error) {
    var buf bytes.Buffer
    enc := gob.NewEncoder(&buf)
    err := enc.Encode(key)
    if err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}
Nate Brennand
  • Feel free to accept your own answer as the answer to your question :) – photoionized Apr 11 '14 at 05:54
  • It would be nice to have an example showing the usage of this method. For many Go beginners like me, it would really help a lot! – Rudi Strydom Nov 24 '15 at 17:59
  • @RudiStrydom convert a map to bytes to save space? That is what I found this useful for. – Bryce Wayne Jul 25 '21 at 05:38
  • At the time, I was trying to write a bloom filter that worked with arbitrary data structures. So by converting any struct/map/slice to `[]byte`, I could then process the byte slice into a hash. https://gist.github.com/natebrennand/10442587 – Nate Brennand Jul 26 '21 at 15:53

Another way to convert interface{} to []byte is to use the fmt package.

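// Convert the variable `key` from interface{} to []byte.
// No type assertion is needed: key is already an interface{}, and
// key.(interface{}) would panic if key were nil.
byteKey := []byte(fmt.Sprintf("%v", key))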

fmt.Sprintf formats the interface value as a string.
The []byte conversion turns that string into a byte slice.

※ Note ※ This method does not work if the interface{} value is a pointer. See @PassKit's comment below.
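
A short sketch of the fmt approach and its limits, assuming a hypothetical wrapper named toBytes. Because %v produces a printable form rather than a binary encoding, values of different types can yield the same bytes, and a pointer to a basic type is formatted as an address.

```go
package main

import "fmt"

// toBytes is a hypothetical wrapper around the fmt-based conversion.
func toBytes(key interface{}) []byte {
	return []byte(fmt.Sprintf("%v", key))
}

func main() {
	fmt.Println(string(toBytes(42)))   // 42
	fmt.Println(string(toBytes("42"))) // 42, the same bytes as the int above

	n := 42
	// For *int, %v prints the address, which changes between runs
	// and says nothing about the value pointed to.
	fmt.Println(string(toBytes(&n)))
}
```

If distinct types must never collide, the %#v verb or the gob-based answer above may be a better fit.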

Midori