8

Starting from a sorted sequence of values, my goal is to assign a rank to each value, using identical ranks for equal values (aka ties):

Input: Vector(1, 1, 3, 3, 3, 5, 6)

Output: Vector((0,1), (0,1), (1,3), (1,3), (1,3), (2,5), (3,6))

A few type aliases for readability:

type Rank = Int
type Value = Int
type RankValuePair = (Rank, Value)

An imperative implementation using a mutable rank variable could look like this:

var rank = 0
val ranked1: Vector[RankValuePair] = for ((value, index) <- values.zipWithIndex) yield {
  if ((index > 0) && (values(index - 1) != value)) rank += 1
  (rank, value)
}

// ranked1: Vector((0,1), (0,1), (1,3), (1,3), (1,3), (2,5), (3,6))

To hone my FP skills, I was trying to come up with a functional implementation:

val ranked2: Vector[RankValuePair] = values.sliding(2).foldLeft((0 , Vector.empty[RankValuePair])) {
  case ((rank: Rank, rankedValues: Vector[RankValuePair]), Vector(currentValue, nextValue)) =>
    val newRank = if (nextValue > currentValue) rank + 1 else rank
    val newRankedValues =  rankedValues :+ (rank, currentValue)
    (newRank, newRankedValues)
}._2

// ranked2: Vector((0,1), (0,1), (1,3), (1,3), (1,3), (2,5))

It is less readable, and – more importantly – is missing the last value (due to using sliding(2) on an odd number of values).

How could this be fixed and improved?

Rahel Lüthy
  • 6,837
  • 3
  • 36
  • 51
  • 1
    Why do you need it to be a vector ? Can it not be a list ? – sarveshseri Nov 02 '16 at 05:48
  • 2
    It shouldn't matter whether it's a `List` or a `Vector`. I just chose `Vector` in the example because it is a [good default](http://stackoverflow.com/questions/6928327/when-should-i-choose-vector-in-scala) (index-based access as used in the imperative implementation would be very inefficient with `List`). – Rahel Lüthy Nov 02 '16 at 05:51
  • @netzwerg Oh yeah it matters in some cases: http://stackoverflow.com/a/10199441/371804 – Haspemulator Nov 02 '16 at 09:24
  • @Haspemulator Ah, I see, tx! – Rahel Lüthy Nov 02 '16 at 10:36

6 Answers6

11

This works well for me:

// scala
val vs = Vector(1, 1, 3, 3, 3, 5, 6)
val rank = vs.distinct.zipWithIndex.toMap
val result = vs.map(i => (rank(i), i))

The same in Java 8 using Javaslang:

// java(slang)
Vector<Integer>                  vs = Vector(1, 1, 3, 3, 3, 5, 6);
Function<Integer, Integer>       rank = vs.distinct().zipWithIndex().toMap(t -> t);
Vector<Tuple2<Integer, Integer>> result = vs.map(i -> Tuple(rank.apply(i), i));

The output of both variants is

Vector((0, 1), (0, 1), (1, 3), (1, 3), (1, 3), (2, 5), (3, 6))

*) Disclosure: I'm the creator of Javaslang

Daniel Dietrich
  • 2,262
  • 20
  • 25
5

This is nice and concise but it assumes that your Values don't go negative. (Actually it just assumes that they can never start with -1.)

val vs: Vector[Value] = Vector(1, 1, 3, 3, 3, 5, 6)

val rvps: Vector[RankValuePair] =
  vs.scanLeft((-1,-1)){ case ((r,p), v) =>
    if (p == v) (r, v) else (r + 1, v)
  }.tail

edit

Modification that makes no assumptions, as suggested by @Kolmar.

vs.scanLeft((0,vs.headOption.getOrElse(0))){ case ((r,p), v) =>
  if (p == v) (r, v) else (r + 1, v)
}.tail
jwvh
  • 50,871
  • 7
  • 38
  • 64
1

Here's an approach with recursion, pattern matching and guards.

The interesting part is where the head and head of the tail (h and ht respectively) are de-constructed from the list and an if checks if they are equal. The logic for each case adjusts the rank and proceeds on the remaining part of the list.

def rank(xs: Vector[Value]): List[RankValuePair] = {
  def rankR(xs: List[Value], acc: List[RankValuePair], rank: Rank): List[RankValuePair] = xs match{
    case Nil => acc.reverse
    case h :: Nil => rankR(Nil, (rank, h) :: acc, rank)
    case h :: ht :: t if (h == ht) => rankR(xs.tail, (rank, h) :: acc, rank)
    case h :: ht :: t if (h != ht) => rankR(xs.tail, (rank, h) :: acc, rank + 1)
  }
  rankR(xs.toList, List[RankValuePair](), 0)
}

Output:

scala> rank(xs)
res14: List[RankValuePair] = List((0,1), (0,1), (1,3), (1,3), (1,3), (2,5), (3,6))
Brian
  • 20,195
  • 6
  • 34
  • 55
1

This is a modification of the solution by @jwvh, that doesn't make any assumptions about the values:

val vs = Vector(1, 1, 3, 3, 3, 5, 6)
vs.sliding(2).scanLeft(0, vs.head) {
  case ((rank, _), Seq(a, b)) => (if (a != b) rank + 1 else rank, b)
}.toVector

Note, that it would throw if vs is empty, so you'd have to use vs.headOption getOrElse 0, or check if the input is empty beforehand: if (vs.isEmpty) Vector.empty else ...

Kolmar
  • 14,086
  • 1
  • 22
  • 25
  • Nicely done, but this will throw if `vs` is empty. Mine won't. (Trade-offs. There are always trade-offs.) – jwvh Nov 02 '16 at 08:00
  • @jwvh It's only the `head` that throws there, and in that case we can actually use any default value, because the result will be an empty `Vector` anyway. (Or just check for an empty `Vector` before) – Kolmar Nov 02 '16 at 08:15
  • Ah, `headOption OrElse`, I should have thought of that. Kudos. – jwvh Nov 02 '16 at 08:23
0
import scala.annotation.tailrec

type Rank = Int
// defined type alias Rank
type Value = Int
// defined type alias Value
type RankValuePair = (Rank, Value)
// defined type alias RankValuePair

def rankThem(values: List[Value]): List[RankValuePair] = {

  // Assumes that the "values" are sorted
  @tailrec
  def _rankThem(currentRank: Rank, currentValue: Value, ranked: List[RankValuePair], values: List[Value]): List[RankValuePair] = values match {
    case value :: tail if value == currentValue => _rankThem(currentRank, value, (currentRank, value) +: ranked, tail)
    case value :: tail if value > currentValue => _rankThem(currentRank + 1, value, (currentRank + 1, value) +: ranked, tail)
    case Nil => ranked.reverse
  }

  _rankThem(0, Int.MinValue, List.empty[RankValuePair], values.sorted)
}
// rankThem: rankThem[](val values: List[Value]) => List[RankValuePair]

val valueList = List(1, 1, 3, 3, 5, 6)
// valueList: List[Int] = List(1, 1, 3, 3, 5, 6)

val rankValueList = rankThem(valueList)[RankedValuePair], values: Vector[Value])
// rankValueList: List[RankValuePair] = List((1,1), (1,1), (2,3), (2,3), (3,5), (4,6))
sarveshseri
  • 13,738
  • 28
  • 47
0
val list = List(1, 1, 3, 3, 5, 6)
val result = list
    .groupBy(identity)
    .mapValues(_.size)
    .toArray
    .sortBy(_._1)
    .zipWithIndex
    .flatMap(tuple => List.fill(tuple._1._2)((tuple._2, tuple._1._1)))
result: Array[(Int, Int)] = Array((0,1), (0,1), (1,3), (1,3), (2,5), (3,6))

The idea is using groupBy to find identical elements and find their occurrences and then sort and then flatMap. Time complexity I would say is O(nlogn), groupBy is O(n), sort is O(nlogn), fl

zhang rui
  • 81
  • 1
  • 4