I was wondering if we can print the definition of a function in Scala. A function is treated as an object in Scala.

For example:

scala> val splitFunction = (value : String) => { value.split(" ")}

splitFunction: String => Array[String] = <function1>

Above, the Scala interactive shell indicates that splitFunction takes a String parameter and returns an Array of Strings. What does <function1> actually indicate here?

Is it possible to print or retrieve the definition of splitFunction?

We can achieve the same in Python; see the question "Is it possible to print a function as a string in Python?"

Update: In Apache Spark, the RDD lineage (DAG) stores information about the parent RDD and the transformation applied at each stage. I am interested in fetching the definition of a function (including lambdas and anonymous functions) passed as an argument to transformations such as flatMap or map.

For example: File - DebugTest.scala

val dataRDD = sc.textFile( "README.md" )
val splitFunction = (value : String) => {value.split(" ")}
val mapRDD = dataRDD.map(splitFunction )

println(mapRDD.toDebugString)

Output:

(1) MapPartitionsRDD[2] at map at DebugTest.scala:43 []
 |  README.md MapPartitionsRDD[1] at textFile at DebugTest.scala:41 []
 |  README.md HadoopRDD[0] at textFile at DebugTest.scala:41 []

From the above output, I can see which transformations are performed, but I cannot retrieve the definition of the splitFunction passed as an argument to the "map" transformation. Is there any way to retrieve or print it?

aagora

1 Answer

No. (In general)

The reason it says <function1> is because there is no good string representation of a function to be given, so we just say that it's some function that takes one argument.
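
To make that concrete, here is a minimal sketch (the object name FunctionOneDemo is made up for illustration). It shows that the literal from the question is an ordinary Function1 value. The exact text printed by toString is an implementation detail: older Scala versions print <function1>, while newer ones that compile lambdas differently will, as far as I know, typically print a generated class name instead.

```scala
object FunctionOneDemo {
  // The same function literal as in the question.
  val splitFunction: String => Array[String] =
    (value: String) => value.split(" ")

  // String => Array[String] is shorthand for Function1[String, Array[String]],
  // so the value can be assigned to that type without any conversion.
  val asTrait: Function1[String, Array[String]] = splitFunction

  def main(args: Array[String]): Unit = {
    // f(x) is sugar for f.apply(x).
    println(asTrait.apply("a b c").mkString(","))
    // Version-dependent placeholder text; it never contains the source code.
    println(splitFunction.toString)
  }
}
```

Either way, the printed text only tells you "this is some one-argument function", not what it does.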

The reason you can't get the definition of a function is that the Scala compiler emits JVM bytecode and does not keep the source text alongside it. The bytecode for your function is already rather unreadable (comments mine, of course):

aload_1           // Load the second argument (of reference type) onto the stack (the first is this function object)
checkcast #mm     // Where mm is an index into the class constant pool, representing a reference to the class String (cast inserted because this is a generic method)
ldc #nn           // Where nn is an index representing the string " "
invokevirtual #oo // Where oo is another index, this time representing String::split
areturn           // Return whatever's left on the stack

A "show function implementation" function would have to 1) retrieve the implementation from the JVM (I believe there's an API for that, but it's meant for debuggers), and then 2) decompile the code to Scala/Java. It's not impossible, per se, but I don't think anyone's done it (and really, why would you do it?).

Now, it would be possible for every Scala anonymous function to just store its code and override toString to output it, but, again, there's simply no reason to do it. Even if you want the implementation for debugging purposes, chances are you have the source code and can use the line numbers in the class file to jump to it. And if you want the code stored, the compiled form is already in the class file.
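
If you only need this for a handful of functions, you can do the "store its code" part by hand. The following is a rough sketch, not an established library: DescribedFunction and DescribedFunctionDemo are hypothetical names, and the source string is typed in manually, so nothing checks that it matches the actual function.

```scala
// Hypothetical helper, not part of the Scala standard library:
// wraps a function together with a hand-written description of it.
final class DescribedFunction[A, R](source: String, f: A => R)
    extends (A => R) with Serializable {
  // Delegate the actual work to the wrapped function.
  override def apply(arg: A): R = f(arg)
  // Report the stored text instead of the default placeholder.
  override def toString(): String = source
}

object DescribedFunctionDemo {
  // The source text is written by hand; nothing verifies it against the code.
  val splitFunction = new DescribedFunction[String, Array[String]](
    "(value: String) => value.split(\" \")",
    (value: String) => value.split(" ")
  )

  def main(args: Array[String]): Unit = {
    println(splitFunction)                // prints the stored source text
    println(splitFunction("a b").length)  // still usable as a normal function
  }
}
```

Because the wrapper is itself a String => Array[String], it can be passed anywhere a plain function is expected. Whether it helps with the Spark case depends on whether you can get back to the function object from the RDD, which I haven't checked.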

If you really want it, it is possible, in theory, to define an (opt-in) macro (or even a compiler plugin):

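import scala.language.experimental.macros
import scala.reflect.macros.whitebox // import the package so whitebox.Context resolves

// Implementation left as a stub; define it before the macro def that refers to it
// (in the REPL, a forward reference gives "not found: value stringF_impl")
def stringF_impl(c: whitebox.Context)(f: c.Tree): c.Tree = ???
def stringF(f: Any): Any = macro stringF_impl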

That turns

stringF { arg => body } // BTW: say bye to type inference

into

new Function1[A, R] {
  override def apply(arg: A) = body
  override def toString() = "{ arg => body }" // This part is not there normally
}

But, again, I haven't heard of anyone doing it, and there's just no strong reason to try.

HTNW
  • Thank you for the detailed response. Appreciate it. I have updated my question with the goal and additional information. Let me know if you have any solutions in mind to achieve the same. – aagora Oct 11 '17 at 03:14
  • I strongly disagree with your reasoning. Scala isn't "compiled". Scala is a programming language. There is no such thing as a "compiled programming language". Programming languages aren't compiled or interpreted. Programming languages just *are*. Compilation and interpretation are traits of a compiler or interpreter (duh), not the language. Every language can be compiled and every language can be interpreted. The fact that Scala happens to have some compiled implementations has nothing to do with whether or not you can get access to the source code of functions. You can store the source code … – Jörg W Mittag Oct 16 '17 at 19:06
  • … of a function regardless of whether you use an interpreter or a compiler to store the source code. If the Scala Language Specification required functions to have a `toSource` method, then the compiler would simply generate an anonymous class with a `toSource` method which contains the source code of the function as a static string constant. Case in point: all currently existing mainstream ECMAScript implementations are compiled, yet all of them can print the source code of functions, simply because the ECMA-262 specification says that they have to. – Jörg W Mittag Oct 16 '17 at 19:08
  • The reason why Scala cannot print the source code of a function is that it doesn't store it. The reason it doesn't store it is because it doesn't have to. This has nothing to do with compilation or interpretation. – Jörg W Mittag Oct 16 '17 at 19:09
  • I got this error when trying the stringF method: `<console>:13: error: not found: value stringF_impl`, reported at `def stringF(f: Any): Any = macro stringF_impl` – whatsnext Oct 29 '18 at 23:02