1

I have a Kotlin Vector class that includes a function to calculate the dot product of two Vectors:

class Vector(val values: Array<Double>) {

    fun dot(v: Vector): Double {
        require(this.values.size == v.values.size)
        var product = 0.0
        for (i in this.values.indices) {
            product += this.values[i] * v.values[i]
        }
        return product
    }
}

I'd like to express the dot product of two vectors in a functional style. Fold would initialize, but I don't see how to make it work with two arrays.

Does anyone have a suggestion?

duffymo
  • 305,152
  • 44
  • 369
  • 561

3 Answers3

3

I would zip the the two arrays, and then use sumByDouble:

fun dot(v: Vector): Double = values
    .apply { require(size == v.values.size) }
    .zip(v.values)
    .sumByDouble { (a, b) -> a * b }

Sidenote: If you're on the JVM, you could consider using DoubleArray instead of Array<Double>. The former will be represented as a double[] on the JVM, rather than an array of boxed Doubles.

Adam Millerchip
  • 20,844
  • 5
  • 51
  • 74
3

The accepted answer is accurate, but it's not that efficient: zipping the arrays together allocates a new list, copies both source arrays, and produces a Pair for each data element, all of which are disposed soon after. This more than doubles memory consumption.

I don't think your code above is really that bad, although I'd probably implement it as an extension function on DoubleArray:

infix fun DoubleArray.dot(other: DoubleArray): Double {
  var out = 0.0
  for (i in 0 until size) out += this[i] * other[i]
  return out
}

This is purely functional in that it has no side effects, and always returns the same result given the same inputs. Your consumers will never know your little for loop secret.

You can do it with foldIndexed though, for that slick, functional Kotlin styling:

infix fun DoubleArray.dot(other: DoubleArray) =
   foldIndexed(0.0) { i, acc, cur -> acc + cur * other[i] }

I find this slightly harder to read though (all those curs and accs, plus an odd floating 0.0); and under the hood, foldIndexed is just using a for loop anyway.

Mark McKenna
  • 2,857
  • 1
  • 17
  • 17
  • 1
    You're right about the performance. I tested on a 1M entry array, and your loop and `foldedIndex` are about 18x faster (~90ms vs ~5ms). The loop answer is exactly what the OP said they didn't want to do, but +1 for `foldIndexed`. – Adam Millerchip Nov 08 '20 at 07:05
  • 1
    PS I fixed a bug in your loop condition, a good example of why sticking to the built in functions is better :) – Adam Millerchip Nov 08 '20 at 07:53
  • Thanks @AdamMillerchip for the profiling and correction! Honestly, I'm surprised they weren't both wrong. In my own code (I got here because I was looking up dot product) I went with `foldIndexed` so fortunately I didn't retain my own bug... – Mark McKenna Nov 09 '20 at 13:57
  • Thanks to another anonymous commenter who pointed out that in my first block I had accidentally added the vector constituents, instead of multiplying them. This is now corrected. – Mark McKenna Feb 10 '21 at 15:14
2

You can zip them with a transform function and then sum the total:

return values.asSequence().zip(v.values.asSequence()) { a, b -> a * b }.sum()

or

return values.zip(v.values) { a, b -> a * b }.sum()

or

return values.zip(v.values, Double::times).sum()
Tenfour04
  • 83,111
  • 11
  • 94
  • 154