-1

I'm working on a hashing function for a map[string]interface{}.

Most hashing libraries require a []byte as input to compute the hash.

I tried serializing the map with json.Marshal. For simple maps it works correctly, but when I add some complexity and shuffle the items, json.Marshal no longer gives me a consistent byte array output:

package main

import (
    "encoding/json"
    "fmt"
)

func main() {
    data := map[string]interface{}{
        "id":    "124",
        "name":  "name",
        "count": 123456,
        "sites": []map[string]interface{}{
            {
                "name":  "123445",
                "count": 234324,
                "id":    "wersfs",
            },
            {
                "id":    "sadcacasca",
                "name":  "sdvcscds",
                "count": 22,
            },
        },
        "list": []int{5, 324, 123, 123, 123, 14, 34, 52, 3},
    }

    data1 := map[string]interface{}{
        "name": "name",
        "id":   "124",
        "sites": []map[string]interface{}{
            {
                "id":    "sadcacasca",
                "count": 22,
                "name":  "sdvcscds",
            },
            {
                "count": 234324,
                "name":  "123445",
                "id":    "wersfs",
            },
        },
        "count": 123456,
        "list":  []int{123, 14, 34, 52, 3, 5, 324, 123, 123},
    }

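    // Errors are ignored for brevity; these values always marshal cleanly.
    jsonStr, _ := json.Marshal(data)
    jsonStr1, _ := json.Marshal(data1)
    // Print the JSON text rather than the raw byte values.
    fmt.Println(string(jsonStr))
    fmt.Println(string(jsonStr1))

    // Compare the whole outputs at once; indexing byte by byte
    // would panic if the second output were shorter.
    if string(jsonStr) != string(jsonStr1) {
        fmt.Println("Byte arrays not equal")
    }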

}

This is what I have tried, and it fails to give me consistent output.
I was also thinking of writing a function that sorts the map and its values, but I got stuck on how to sort this field:

"sites": []map[string]interface{}

I tried json.Marshal and also sorting the map, but got stuck.

  • `encoding/json` marshals maps by sorting the keys. Slices however are not sorted! You must sort slices if you want the same output, but you do not have to do that with maps (you can't even sort maps as they are unordered in Go). See [How to iterate maps in insertion order?](https://stackoverflow.com/questions/28930416/how-to-iterate-maps-in-insertion-order/28931555#28931555) – icza Nov 08 '22 at 11:42

2 Answers

1

Your data structures are not equivalent. According to JSON rules, arrays are ordered, so [123, 14, 34, 52, 3, 5, 324, 123, 123] is not the same as [5, 324, 123, 123, 123, 14, 34, 52, 3]. No wonder the hashes are different. If you need different arrays with the same elements to produce the same hash, you need to canonicalize the arrays before hashing, e.g. by sorting them.
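To see the difference between maps and arrays in isolation, here is a minimal sketch. The helper name mustJSON is my own and only there for illustration. It shows that key order in a map does not affect the output of json.Marshal, while element order in a slice does:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mustJSON is a hypothetical helper: it marshals v and panics on error.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Same keys written in a different order: the output is identical,
	// because encoding/json sorts map keys when marshaling.
	m1 := map[string]any{"id": "124", "name": "name"}
	m2 := map[string]any{"name": "name", "id": "124"}
	fmt.Println(mustJSON(m1) == mustJSON(m2)) // true

	// Same elements in a different order: the output differs,
	// because slice order is preserved.
	s1 := []int{5, 324, 123}
	s2 := []int{123, 5, 324}
	fmt.Println(mustJSON(s1) == mustJSON(s2)) // false
}
```

So only the slices need extra work before hashing.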

Here is how it could be done: https://go.dev/play/p/OHq7jsX_cNw

Before serializing, it recursively walks down the maps and arrays and sorts every array in place:

// Prepares data by sorting arrays in place
func prepare(data map[string]any) map[string]any {
    for _, value := range data {
        switch v := value.(type) {
        case []int:
            prepareIntArray(v)
        case []string:
            prepareStringArray(v)
        case []map[string]any:
            prepareMapArrayById(v)
            for _, obj := range v {
                prepare(obj)
            }
        case map[string]any:
            prepare(v)
        }
    }
    return data
}

// Sorts int array in place
func prepareIntArray(a []int) {
    sort.Ints(a)
}

// Sorts string array in place
func prepareStringArray(a []string) {
    sort.Strings(a)
}

// Sorts an array of objects by "id" fields
func prepareMapArrayById(mapSlice []map[string]any) {
    sort.Slice(mapSlice, func(i, j int) bool {
        return getId(mapSlice[i]) < getId(mapSlice[j])
    })
}

// Extracts "id" field from JSON object. Returns empty string if there is no "id" or it is not a string.
func getId(v map[string]any) string {
    // A missing key or a failed type assertion both yield the zero value "".
    id, _ := v["id"].(string)
    return id
}
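
For completeness, here is a hedged sketch of how the sorted data could then be turned into a hash. The names canonicalize and hashMap are hypothetical, and canonicalize is a compact stand-in that handles the same slice types as prepare. SHA-256 is just one possible choice of hash function. It assumes the data is built from Go literals like in the question:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// canonicalize (hypothetical name) sorts the slices it recognizes, in place.
func canonicalize(data map[string]any) {
	for _, value := range data {
		switch v := value.(type) {
		case []int:
			sort.Ints(v)
		case []string:
			sort.Strings(v)
		case []map[string]any:
			// Order objects by their string "id"; a missing id sorts as "".
			sort.Slice(v, func(i, j int) bool {
				a, _ := v[i]["id"].(string)
				b, _ := v[j]["id"].(string)
				return a < b
			})
			for _, obj := range v {
				canonicalize(obj)
			}
		case map[string]any:
			canonicalize(v)
		}
	}
}

// hashMap (hypothetical name) returns the hex-encoded SHA-256 of the
// canonical JSON form. Note that it reorders the slices inside data.
func hashMap(data map[string]any) (string, error) {
	canonicalize(data)
	b, err := json.Marshal(data) // map keys are sorted by encoding/json
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	a := map[string]any{
		"list":  []int{5, 324, 123},
		"sites": []map[string]any{{"id": "b", "count": 1}, {"id": "a", "count": 2}},
	}
	b := map[string]any{
		"sites": []map[string]any{{"count": 2, "id": "a"}, {"id": "b", "count": 1}},
		"list":  []int{123, 5, 324},
	}
	ha, _ := hashMap(a)
	hb, _ := hashMap(b)
	fmt.Println(ha == hb) // true
}
```

One caveat: values produced by json.Unmarshal into a map[string]any arrive as []interface{} and float64, so neither this sketch nor the type switch in prepare would sort them without extra cases.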
Pak Uula
-1

Since both marshaled outputs are basically string representations of the same map in different sequences, they become equal if you sort their characters.

Following this logic, if you sort both jsonStr and jsonStr1, the sorted []byte values will be exactly equal, and you can then use them to compute your hash value.

Check my solution here

  • This is a wrong idea. Sorting `abc` will be the same as `cba`, yet they are not equal. – icza Nov 08 '22 at 20:25
  • I didn't get it. If `sort(bac)=abc` and `sort(cba)=abc`, then isn't `sort(bac)=sort(cba)`? Maybe I am missing something here; can you explain why this would be the wrong approach? – Anurag Kumar Nov 10 '22 at 17:15
  • He is saying that although two equal maps with different orders will be the same when their `[]byte` arrays are sorted, so will two different maps that just happen to contain the same characters. So `{"b": "a"}` and `{"a": "b"}` would become equal, even though they are different. – Marko Jul 11 '23 at 13:29
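
The collision described in the comments above can be reproduced with a minimal sketch. The helper sortedBytes is hypothetical and is not taken from the answer's linked solution. It only illustrates that sorting the bytes discards which characters belong to keys and which to values:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedBytes is a hypothetical helper: it returns the bytes of s in sorted order.
func sortedBytes(s string) string {
	b := []byte(s) // copy, so the input string is not affected
	sort.Slice(b, func(i, j int) bool { return b[i] < b[j] })
	return string(b)
}

func main() {
	x := `{"b":"a"}`
	y := `{"a":"b"}`
	fmt.Println(x == y)                           // false: different maps
	fmt.Println(sortedBytes(x) == sortedBytes(y)) // true: same bytes once sorted
}
```

Any hash computed from the sorted bytes would therefore treat these two different maps as identical.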