I am struggling to build this computation pipeline builder in Scala. I want a class that has two methods, map and reduce, that receive anonymous functions in a "fluent interface". These functions will be composed, so I want to type-check all of them, also having their input type inferred from the previous method call... See this related question of mine (it's all part of the puzzle).

All my questions oversimplify the problem, but answers have been helpful, and I think I am almost arriving there.

I have managed to make everything work as long as I use a special method to register a mapper function that has a KeyVal output. But I want to use the same map name for all mapper functions, and to simplify the architecture in general. For that I decided to try the type class pattern, which lets me do different things depending on the type of the function passed to my builder method. Keep in mind that part of my problem is that when I give the mapper method a function that outputs a KeyVal[K,V] (pretty much a tuple), I need to store K and V as type parameters of my builder class, so they can be used to type-check and infer the types in the reducer method later on.

This is my builder class

case class PipelineBuilder[A, V](commandSequence: List[MRBuildCommand]) {

  trait Foo[XA, XB, +XV] {
    def constructPB(xs: XA => XB): PipelineBuilder[XB, XV]
  }

  implicit def fooAny[XA, XB]: Foo[XA, XB, Nothing] = new Foo[XA, XB, Nothing] {
    def constructPB(ff: XA => XB) = PipelineBuilder[XB, Nothing](MapBuildCommand(ff) :: commandSequence)
  }

  implicit def fooKV[XA, XK, XV]: Foo[XA, KeyVal[XK,XV], XV] = new Foo[XA, KeyVal[XK,XV], XV] {
    def constructPB(ff: XA => KeyVal[XK,XV]) = PipelineBuilder[KeyVal[XK,XV], XV](MapBuildCommand(ff) :: commandSequence)
  }

  def innermymap[AA, FB, FV](ff: AA => FB)(implicit mapper: Foo[AA, FB, FV]) = mapper.constructPB(ff)

  def mymap[FB](ff: A => FB) = innermymap(ff)


  def rreduce[K](newFun: (V, V) => V)(implicit ev: KeyVal[K, V] =:= A) =
    PipelineBuilder[A,V](RedBuildCommand[K, V](newFun) :: commandSequence)

  def output(dest: MRWorker) = constructPipeline(dest)
  //...

}

And this is how the class is used in the main program

object PipelineLab extends App {

  val mapredPipeline = PipelineBuilder[String, Nothing](List())
    .mymap { s: String => s.toLowerCase }
    .mymap { s: String => KeyVal(s, 1) }
    .rreduce(_ + _)
    .output(OutputWorker)
  // ...
}

Note that the s: String annotation shouldn't be necessary, because of the type parameter A of the class. The same goes for V in rreduce.

I have already managed to use the type class pattern in the following simple example. If I output a tuple of something, it does something different... Here it is.

object TypeClassLab extends App {

  trait FuncAdapter[A, B] {
    def runfunc(x: A, f: A => B): B
  }

  implicit def myfunplain[X, A]: FuncAdapter[X, A] = new FuncAdapter[X, A] {
    def runfunc(x: X, f: X => A): A = {
      println("Function of something to plain, non-tuple type")
      f(x)
    }
  }

  implicit def myfuntuple[X, AA, AB]: FuncAdapter[X, (AA, AB)] = new FuncAdapter[X, (AA, AB)] {
    def runfunc(x: X, f: X => (AA, AB)): (AA, AB) = {
      println("Function from String to tuple")
      f(x)
    }
  }

  def ffuunn[A, B](x: A)(f: A => B)(implicit fa: FuncAdapter[A, B]) = {
    fa.runfunc(x, f)
  }

  println(ffuunn("obaoba") { s => s.length })
  println(ffuunn("obaobaobaobaoba") { s => s.length })
  println(ffuunn("obaoba") { s => (s.length, s.reverse) })
  println(ffuunn("obaobaobaobaoba") { s => (s.length, s.reverse) })
}
//OUTPUT:
//Function of something to plain, non-tuple type
//6
//Function of something to plain, non-tuple type
//15
//Function from String to tuple
//(6,aboabo)
//Function from String to tuple
//(15,aboaboaboaboabo)

Works like a charm. But then I can't adapt it to my real problem... Right now it seems the compiler is not looking for the more specific fooKV implicit, and instead always picks fooAny, and that causes an error when I try to run rreduce, because it is expecting a V <: Nothing. How do I make it work?

dividebyzero

1 Answer

I'm not sure I fully understand your question.

As far as choosing fooAny vs fooKV, the instance of Foo must be known and passed appropriately from the site where the types are known. This would be the place where mymap is called. Foo is not passed as a parameter though.

def mymap[FB](ff: A => FB) = innermymap(ff)

You are requiring it to be known when innermymap(ff) is called. At this point, the type information is lost: inside mymap, FB is just an abstract type parameter, so the only instance of Foo that applies is fooAny.
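To make the point concrete, here is a minimal sketch, separate from the code in the question. The names `Describe`, `lost`, and `kept` are hypothetical and exist only for this illustration. `lost` resolves the implicit inside a generic method, where `B` is abstract; `kept` takes it as an implicit parameter, so it is resolved at the call site where `B` is concrete.

```scala
object ImplicitSiteDemo {
  final case class KeyVal[K, V](key: K, value: V)

  // Hypothetical type class, used only to observe which instance is selected.
  trait Describe[B] { def name: String }

  trait LowPriorityDescribe {
    // Catch-all instance, analogous to fooAny in the question.
    implicit def anyDescribe[B]: Describe[B] =
      new Describe[B] { val name = "plain" }
  }

  object Describe extends LowPriorityDescribe {
    // Specific instance, analogous to fooKV in the question.
    implicit def kvDescribe[K, V]: Describe[KeyVal[K, V]] =
      new Describe[KeyVal[K, V]] { val name = "keyval" }
  }

  // The implicit is resolved here, inside the method body, where B is abstract.
  // Only the catch-all instance can match an unknown B.
  def lost[B](f: String => B): String = implicitly[Describe[B]].name

  // The implicit is a parameter, so it is resolved by the caller,
  // after B has been inferred from the function argument.
  def kept[B](f: String => B)(implicit d: Describe[B]): String = d.name

  val viaLost: String = lost(s => KeyVal(s, 1)) // "plain"
  val viaKept: String = kept(s => KeyVal(s, 1)) // "keyval"
}
```

Both calls pass the same function, yet only `kept` sees the `KeyVal` instance, which mirrors what happens between `mymap` and `innermymap`.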

This is actually an example of why a definition like fooAny should not exist. You are defining a valid relationship between any XA and any XB, even if these are in fact just Any. The existence of this definition is causing your code to type check when it should not. This will most likely happen again.
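One possible shape for the builder, offered as a sketch under assumptions rather than as the asker's actual design: the implicit parameter sits directly on the public `map` method, and the type class lives in its own companion object instead of inside the builder. `MapOut`, `Builder`, `Command`, and the untyped command classes are hypothetical stand-ins for `Foo`, `PipelineBuilder`, and `MRBuildCommand`. Note that this sketch keeps a catch-all instance, but demotes it to a low-priority trait; that is a design assumption that departs from the advice above to remove it entirely.

```scala
object PipelineSketch {
  final case class KeyVal[K, V](key: K, value: V)

  // Hypothetical, untyped stand-ins for the build commands in the question.
  sealed trait Command
  final case class MapCommand(f: Any => Any) extends Command
  final case class ReduceCommand(f: (Any, Any) => Any) extends Command

  // Type class relating a mapper's output type B to the value type V
  // that a later reduce step should see.
  trait MapOut[B, V]

  trait LowPriorityMapOut {
    // Fallback for non-KeyVal outputs: no reducible value type.
    implicit def plain[B]: MapOut[B, Nothing] = new MapOut[B, Nothing] {}
  }

  object MapOut extends LowPriorityMapOut {
    // Preferred whenever the mapper returns a KeyVal.
    implicit def keyVal[K, V]: MapOut[KeyVal[K, V], V] =
      new MapOut[KeyVal[K, V], V] {}
  }

  final case class Builder[A, V](commands: List[Command]) {
    // The implicit is part of the public signature, so it is resolved
    // at the call site, where B is known.
    def map[B, V2](f: A => B)(implicit out: MapOut[B, V2]): Builder[B, V2] =
      Builder[B, V2](MapCommand(f.asInstanceOf[Any => Any]) :: commands)

    // Only compiles when the current element type is a KeyVal[K, V].
    def reduce[K](f: (V, V) => V)(implicit ev: A =:= KeyVal[K, V]): Builder[A, V] =
      Builder[A, V](ReduceCommand(f.asInstanceOf[(Any, Any) => Any]) :: commands)
  }

  // No parameter type annotations are needed on the lambdas.
  val pipeline: Builder[KeyVal[String, Int], Int] =
    Builder[String, Nothing](Nil)
      .map { s => s.toLowerCase }
      .map { s => KeyVal(s, 1) }
      .reduce(_ + _)
}
```

With this arrangement, calling `reduce` directly after a mapper that returns a plain `String` should be rejected by the compiler, because no `String =:= KeyVal[K, Nothing]` evidence exists.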

drstevens
  • Well that makes a lot of sense. Is the type unavailable due to erasure? Should I move the whole type class to the main program then, or is it possible to leave it inside `PipelineBuilder` by using `TypeTags` or something? – dividebyzero May 05 '15 at 20:36
  • Scala's inference unfortunately cannot determine the type of a generic parameter that is defined by the generic parameter of the caller, i.e. generic inference will only work one level deep for type classes. – Arne Claassen May 06 '15 at 00:24
  • @dividebyzero Again, I don't fully understand what you're trying to do, but if you specifically want `fooKV` to be chosen, you need to replace `mymap` with `innermymap`. The additional type parameter and the implicit parameter for `Foo` in the `innermymap` signature need to be available when you are building your pipeline, at the call site where the specific type is known. You are pretending they don't exist and then requiring them. The first thing you should do when trying to figure this out is to remove the `fooAny` definition. – drstevens May 06 '15 at 14:34
  • OK, now I'm importing the `implicit`s in the correct scope, and I think I'm on the right track. One interesting detail is that now I have to send the object as a parameter to the `constructPB` method from the adapter, but I don't suppose there is a better solution to that... Thanks! – dividebyzero May 08 '15 at 00:44