I need to store a sequence of bytes (a hash) in a case class.

My first approach was to use an Array[Byte], but that breaks case class equality, because the generated equals compares the array field by reference. The following test fails.

"compare arrays" in {
   case class CaseClassWithHash(id: Int, hash: Array[Byte])
   CaseClassWithHash(0, Array[Byte](192.toByte, 168.toByte)) == CaseClassWithHash(0, Array[Byte](192.toByte, 168.toByte)) shouldBe true
}
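
A minimal sketch of why the test above fails, using only the standard library: Scala arrays are plain JVM arrays, so == on them compares references, while sameElements and java.util.Arrays.equals compare contents. The object name ArrayEqualityDemo is only illustrative.

```scala
object ArrayEqualityDemo {
  // Two distinct arrays holding the same bytes
  val a: Array[Byte] = Array[Byte](192.toByte, 168.toByte)
  val b: Array[Byte] = Array[Byte](192.toByte, 168.toByte)

  def main(args: Array[String]): Unit = {
    // == on arrays is reference equality, so this prints false
    println(a == b)
    // Both of these compare the contents element by element and print true
    println(a.sameElements(b))
    println(java.util.Arrays.equals(a, b))
  }
}
```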

So my question is: what is the best way to represent the array of bytes, given the following constraints?

  • I'm not going to manipulate it.
  • I need == to work on the case class (see the previous unit test).
  • Memory usage is critical.
  • It is always going to be 32 bytes (SHA-256).

P.S. Case Class equality for Arrays is not an answer to my question. I'm asking for the right replacement for an Array to represent a SHA-256 value, and of course overriding the equals method is not the way.

angelcervera

1 Answer

Changing Array to Seq seems to work for me. I'm not sure about memory usage but Seq is pretty reliable and quick.

case class CaseClassWithHash(id: Int, hash: Seq[Byte])

println(CaseClassWithHash(0, Seq[Byte](192.toByte, 168.toByte)) == CaseClassWithHash(0, Seq[Byte](192.toByte, 168.toByte)))
// true

As the linked potential duplicate suggests (link points here), if you insist on using Arrays then you will need to define a custom equals method.
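
A minimal sketch of what such a custom equals could look like, assuming you keep the Array[Byte] field for its compact memory layout. It delegates to java.util.Arrays.equals and java.util.Arrays.hashCode, and hashCode is overridden as well so that equal instances hash the same. The exact way of combining id and the array hash is just one reasonable choice, not the only one.

```scala
import java.util.Arrays

case class CaseClassWithHash(id: Int, hash: Array[Byte]) {
  // Compare the array by content instead of by reference
  override def equals(other: Any): Boolean = other match {
    case that: CaseClassWithHash => id == that.id && Arrays.equals(hash, that.hash)
    case _                       => false
  }

  // Keep hashCode consistent with the content-based equals
  override def hashCode(): Int = 31 * id + Arrays.hashCode(hash)
}
```

With this definition, two instances built from separate arrays holding the same bytes compare equal with ==, and they can also be used as keys in hash-based collections.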

James Whiteley
  • That question is different. I don't want to use an Array but I don't know which is the right replacement. – angelcervera Nov 05 '18 at 13:22
  • Pretty sure that Seq is not the best bet for this case. The default implementation is an immutable List. I think it is a linked list, so it will probably use more than double the memory. Maybe Vector is a better solution. – angelcervera Nov 05 '18 at 13:25
  • And of course, I don't want to override the equals method. I want to replace the type. – angelcervera Nov 05 '18 at 13:32
  • If memory is crucial, then Vector is the solution. It starts "branching" after 32 elements and its memory consumption is low. See [benchmark](http://www.lihaoyi.com/post/BenchmarkingScalaCollections.html). – Yevhenii Popadiuk Nov 05 '18 at 13:33
  • @YevheniiPopadiuk Vectors are brilliant and I did have them in my original draft of this answer as an alternative. Not sure why I took that part out. – James Whiteley Nov 05 '18 at 13:44
  • I heard somewhere that Seq automatically converts itself to either a Vector or a List under the hood, depending on which is more efficient for the task at hand. I could be misremembering, though. – James Whiteley Nov 05 '18 at 13:46