I am building a program based on a rather complex mathematical algorithm. In it I want to account for vectors that have missing values, i.e. NaN. Until now I have implemented those with two vectors, both breeze DenseVector[Double]: a vector location, which contains the actual values, and a vector evidence, where a 1.0 denotes that the value is there and a 0.0 that it isn't. With that I can do things like this:
val ones = DenseVector.ones[Double](one.evidence.length)
val derivedLocation = one.evidence :* one.location :+ ((ones :- one.evidence) :* two.evidence :* two.location)
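To spell out what the expression above computes per element, here is a minimal sketch in plain Scala. It uses Array[Double] instead of breeze so that it runs without dependencies; the names MaskedCombine and combine and the sample values are made up for illustration and are not part of my actual code.

```scala
// Plain-Scala sketch of the masked combination (no breeze dependency).
// MaskedCombine, combine and the sample values are hypothetical.
object MaskedCombine {
  // Take one's value where one has evidence; otherwise take two's value
  // where two has evidence; otherwise the result is 0.0.
  def combine(loc1: Array[Double], ev1: Array[Double],
              loc2: Array[Double], ev2: Array[Double]): Array[Double] =
    Array.tabulate(loc1.length) { i =>
      ev1(i) * loc1(i) + (1.0 - ev1(i)) * ev2(i) * loc2(i)
    }

  def main(args: Array[String]): Unit = {
    val result = combine(
      Array(1.0, 0.0, 0.0), Array(1.0, 0.0, 0.0),  // one: location, evidence
      Array(5.0, 6.0, 0.0), Array(1.0, 1.0, 0.0))  // two: location, evidence
    println(result.mkString(", "))  // 1.0, 6.0, 0.0
  }
}
```

Note that this only works because missing entries in location are stored as an ordinary number such as 0.0; with NaN stored there, the multiplication by a 0.0 evidence entry would no longer zero them out.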
Another example would be:
val firstnewvector = myothervector(evidence :== 1.0)
val secondnewvector = myothervector(evidence :== 0.0)
But I also have other cases where I need 0 as the result, not NaN:
def gradientAt: DenseVector[Double] =
(one.location - two.location) :* evidence :* othervalue
For the sake of argument this example has been simplified. I am thinking about dropping evidence and using NaN wherever no concrete value is present, but I am not sure whether that is a good idea. I think the lines above might already be more difficult to implement that way, wouldn't they? Also, I am not sure about performance. If I am not mistaken, DenseVector is backed by an Array of Java primitives, which prevents slow auto-boxing. Using Double.NaN might require classes instead of primitives, which might slow the whole program down a little and cost more memory. Is that right? (Speed and memory are an issue in general.)
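To make both concerns concrete, here is a small sketch of what I could check, again with plain Array[Double] standing in for DenseVector. The names NaNSketch, gradientAt (with these parameters) and isPrimitiveArray are hypothetical, and the NaN-masking shown is just one possible way to get 0 back, not something breeze provides.

```scala
// Sketch only: NaNSketch, its method signatures and the sample values are
// hypothetical; plain arrays stand in for breeze's DenseVector.
object NaNSketch {
  // NaN-encoded version of the gradient: a missing entry on either side
  // yields 0.0 instead of propagating NaN.
  def gradientAt(one: Array[Double], two: Array[Double], otherValue: Double): Array[Double] =
    one.zip(two).map { case (a, b) =>
      val d = (a - b) * otherValue
      if (d.isNaN) 0.0 else d
    }

  // True when the array's element type is a JVM primitive (no boxing).
  def isPrimitiveArray(xs: Array[Double]): Boolean =
    xs.getClass.getComponentType.isPrimitive

  def main(args: Array[String]): Unit = {
    val one = Array(1.0, Double.NaN)
    val two = Array(0.5, 2.0)
    println((Double.NaN * 0.0).isNaN)                  // true: multiplying by 0 does not clear NaN
    println(gradientAt(one, two, 2.0).mkString(", "))  // 1.0, 0.0
    println(isPrimitiveArray(one))                     // true
  }
}
```

Double.NaN is an ordinary IEEE 754 value of the primitive double type, so storing it in an Array[Double] should not by itself force boxing. The explicit isNaN check is the price on the code side, and it would also hide a NaN that comes from a genuine numerical problem rather than from a missing value.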
So: in my case, is it a good idea to use Double.NaN, or should I keep evidence, considering 1) nice code and 2) performance (memory and speed)?