0

I'm looking for advice on building a better hash value to reduce collisions for an array expected to hold 100,000 vertices.

The hash property is part of a struct designed to hold vertex data, each vertex would hold 8 float values. The float values can be negative and extend 15 spaces passed the decimal point.

I made two computed hashValue properties

First Version

The first version creates a String from the 8 properties of the Vertex struct, and returns the hash value of the string.

var hashValue: Int {
        return "\(self.x),\(self.y),\(self.z),\(self.nx),\(self.ny),\(self.nz),\(self.s),\(self.t),".hashValue
    }

Second Version

The second version loops through all the properties in Vertex struct. Multiplies to move the decimal place 15 spaces to the right, then makes the value absolute before finally changing the value's type from Float to Int and adding to the hash total.

var hashValue: Int {
        let array = [self.x, self.y, self.z, self.nx, self.ny, self.nz, self.s, self.t]

        var hash = 0
        for number in array {
            let absolute = abs(number * 1e15)
            let int = Int(absolute)
            hash += int
        }

        return hash
    }

Both versions ended up producing similar hash collision counts. I was hoping to reduce it further.

All the examples I've seen regarding hash values have been structs with two properties like this.

var hashValue: Int {
        return x.hashValue ^ y.hashValue
    }

I'm not sure if my approach to creating the hash value for a struct with 8 properties is the correct way to go. Any advice would be appreciated, thanks for reading.

Just wanted to clarify, this hash value is for the vertices stored in the array, not the array itself. Sorry for any confusion.

Phi
  • 157
  • 1
  • 8
  • That's a truly awful hashing function. Why not work from [better examples](https://en.wikipedia.org/wiki/List_of_hash_functions)? Developing one that's quick, efficient, and produces the desired level of pseudo-randomness is not a trivial task. – tadman Nov 22 '16 at 21:18
  • thanks for the link tadman, I'll check that out. I'm really new to programming, I just started a few months ago, so forgive me if I come off as clueless. – Phi Nov 22 '16 at 21:24
  • wow that link you shared has a lot of types of functions, I'm feeling a bit lost, do you have any other resource recommendations for a beginner? – Phi Nov 22 '16 at 21:28
  • FYI - a hash value isn't used by an array. You should provide more information in your question about your supposed issue with collisions. – rmaddy Nov 22 '16 at 21:39
  • See also [this question on hash functions](http://stackoverflow.com/questions/34595/what-is-a-good-hash-function) with recommendations. – tadman Nov 22 '16 at 21:42
  • Basically it would be two sets of vertex arrays, one is unique vertices the other contains the duplicates. I need to compare vertices from both arrays to find matches, each match creates a new integer for indexing. – Phi Nov 22 '16 at 21:43
  • Here's a link to the post where this question originates from. [link] (http://stackoverflow.com/questions/40733050/inefficient-parsing-of-vertex-and-index-data-looking-for-more-efficient-methods/40734161?noredirect=1#comment68695685_40734161) – Phi Nov 22 '16 at 21:43
  • thank you tadman appreciate it! – Phi Nov 22 '16 at 21:45

0 Answers0