
Here is an entropy calculation based on Jeff Atwood's answer to "How to calculate the entropy of a file?", which in turn is based on http://en.wikipedia.org/wiki/Entropy_(information_theory):

object MeasureEntropy extends App {

  val s = "measure measure here measure measure measure"

  def entropyValue(s: String) = {

    val m = s.split(" ").toList.groupBy((word: String) => word).mapValues(_.length.toDouble)
    var result: Double = 0.0;
    val len = s.split(" ").length;

    m map {
      case (key, value: Double) =>
        {
          var frequency: Double = value / len;
          result -= frequency * (scala.math.log(frequency) / scala.math.log(2));
        }
    }

    result;
  }

  println(entropyValue(s))
}

I'd like to improve this by removing the mutable state, namely:

var result: Double = 0.0;

How can I combine the result into a single calculation over the map function?

Community
blue-sky

3 Answers


Use foldLeft, or in this case /:, which is an alternative operator spelling of it:

(0d /: m) {case (result, (key,value)) => 
  val frequency = value / len
  result - frequency * (scala.math.log(frequency) / scala.math.log(2))
}

Docs: http://www.scala-lang.org/files/archive/api/current/index.html#scala.collection.immutable.Map@/:B(op:(B,A)=>B):B
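
For comparison, here is a minimal sketch of the same fold written with an explicit foldLeft call inside a complete function. The /: operator has been deprecated since Scala 2.13 in favor of foldLeft, so this spelling may be the safer one on newer compilers. The object name FoldLeftEntropy and the value name counts are illustrative, not from the question, and the word counts are built with map rather than mapValues, on the assumption that a strict Map is wanted.

```scala
object FoldLeftEntropy {

  // Shannon entropy (in bits) of the word distribution of a space-separated string.
  def entropyValue(s: String): Double = {
    val words = s.split(" ").toList
    val len = words.length

    // word -> number of occurrences, as a Double to avoid integer division below
    val counts: Map[String, Double] =
      words.groupBy(identity).map { case (word, group) => word -> group.length.toDouble }

    // The accumulator starts at 0.0 and each entry subtracts p * log2(p).
    counts.foldLeft(0.0) { case (result, (_, value)) =>
      val frequency = value / len
      result - frequency * (math.log(frequency) / math.log(2))
    }
  }
}
```

The accumulator takes the place of the var result from the question: each step returns a new total instead of mutating a shared one.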

Alvaro Carrasco

A simple sum will do the trick:

m.map {
  case (key, value: Double) =>
    val frequency: Double = value / len
    -frequency * (scala.math.log(frequency) / scala.math.log(2))
}.sum
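
As a rough sketch of how the sum-based approach might look as a complete program: the object name SumEntropy and the log2 helper are hypothetical names introduced here for illustration. It iterates over the counts only (via values), which yields a plain Iterable[Double], so equal terms are kept and all of them are summed.

```scala
object SumEntropy {

  // Base-2 logarithm helper (hypothetical name, not part of the original post).
  private def log2(x: Double): Double = math.log(x) / math.log(2)

  def entropyValue(s: String): Double = {
    val words = s.split(" ").toList
    val len = words.length

    // word -> number of occurrences
    val counts: Map[String, Double] =
      words.groupBy(identity).map { case (word, group) => word -> group.length.toDouble }

    // Turn each count into its -p * log2(p) term, then add the terms up.
    counts.values.map { value =>
      val frequency = value / len
      -frequency * log2(frequency)
    }.sum
  }

  def main(args: Array[String]): Unit =
    println(entropyValue("measure measure here measure measure measure"))
}
```

For the sample string from the question (five occurrences of "measure" and one of "here"), this should print a value of roughly 0.65.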
pedrofurla

It can be written using foldLeft, as shown below.

  def entropyValue(s: String) = {
    val m = s.split(" ").toList.groupBy((word: String) => word).mapValues(_.length.toDouble)
    val len = s.split(" ").length
    m.foldLeft(0.0)((r, t) => r - ((t._2 / len) * (scala.math.log(t._2 / len) / scala.math.log(2))))
  }
Jegan