I cannot explain this, but Spark seems to add a hidden (implicit?) parameter to the constructor. Here is the code I tried in spark-shell (in the regular Scala shell the parameter list would be empty):

scala> class A {}
defined class A

scala> classOf[A].getConstructors()(0).getAnnotatedParameterTypes
res0: Array[java.lang.reflect.AnnotatedType] = Array(sun.reflect.annotation.AnnotatedTypeFactory$AnnotatedTypeBaseImpl@5ed65e4b)

Because of this parameter, I cannot pass my custom InputFormat class to Spark's hadoopFile function. Any hints on what is going on here, or at least on how I can create a class with a parameterless constructor?

abufct
  • Yes, thanks! That was 2.5 years ago, so I cannot say why the behavior was different in the vanilla Scala shell, though. :-) – abufct Dec 01 '20 at 22:42
  • Right. In May 2018 this [could be](https://en.wikipedia.org/wiki/Scala_(programming_language)#Versions) a version 2.12.x. See update regarding 2.12.6. – Dmytro Mitin Dec 02 '20 at 04:42

1 Answer

The behavior seems to be the same in the ordinary Scala REPL:

$ scala
Welcome to Scala 2.13.3 (Java HotSpot(TM) 64-Bit GraalVM EE 19.3.0, Java 1.8.0_231).
Type in expressions for evaluation. Or try :help.

scala> class A {}
class A

scala> classOf[A].getConstructors()(0).getAnnotatedParameterTypes
val res0: Array[java.lang.reflect.AnnotatedType] = Array(sun.reflect.annotation.AnnotatedTypeFactory$AnnotatedTypeBaseImpl@383864d5)

scala> classOf[A].getConstructors()(0).getParameters
val res1: Array[java.lang.reflect.Parameter] = Array(final $iw $outer)

The REPL makes the class nested: each line entered in the REPL is wrapped in an outer class that is then instantiated. This adds an instance of the outer class as a parameter to the constructor ($outer is the name of the parameter, $iw is the outer class). You can reproduce this behavior as follows:

class X {
  class A {}
}

object App {
  def main(args: Array[String]): Unit = {
    val x = new X

    println(classOf[x.A].getConstructors()(0).getAnnotatedParameterTypes.mkString(","))
    // sun.reflect.annotation.AnnotatedTypeFactory$AnnotatedTypeBaseImpl@2f7c7260

    println(classOf[x.A].getConstructors()(0).getParameters.mkString(","))
    // final X $outer
  }
}

If you run the REPL with the compiler option -Xprint:typer switched on (for example scala -Xprint:typer or spark-shell -Xprint:typer), you will see:

$ scala -Xprint:typer
Welcome to Scala 2.13.3 (Java HotSpot(TM) 64-Bit GraalVM EE 19.3.0, Java 1.8.0_231).
Type in expressions for evaluation. Or try :help.

scala> class A
[[syntax trees at end of                     typer]] // <console>
package $line3 {
  sealed class $read extends AnyRef with Serializable {
    def <init>(): $line3.$read = {
      $read.super.<init>();
      ()
    };
    sealed class $iw extends AnyRef with java.io.Serializable {
      def <init>(): $iw = {
        $iw.super.<init>();
        ()
      };
      class A extends scala.AnyRef {
        def <init>(): A = {
          A.super.<init>();
          ()
        }
      }
    };
    private[this] val $iw: $iw = new $read.this.$iw();
    <stable> <accessor> def $iw: $iw = $read.this.$iw
  };
  object $read extends scala.AnyRef with java.io.Serializable {
    def <init>(): type = {
      $read.super.<init>();
      ()
    };
    private[this] val INSTANCE: $line3.$read = new $read();
    <stable> <accessor> def INSTANCE: $line3.$read = $read.this.INSTANCE;
    <synthetic> private def writeReplace(): Object = new scala.runtime.ModuleSerializationProxy(classOf[$line3.$read$])
  }
}

class A

So the value for this additional constructor parameter $outer can be obtained as $line3.$read.INSTANCE.$iw:

scala> classOf[A].getConstructors()(0).newInstance($line3.$read.INSTANCE.$iw)

...

val res0: Object = A@282ffbf5
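
The same reflective call also works in compiled code, where the outer instance is an ordinary value rather than a REPL wrapper. Here is a minimal sketch, reusing the X/A shape from the reproduction earlier in this answer; the names OuterInstanceDemo and instantiate are made up for illustration:

```scala
object OuterInstanceDemo {
  class X {
    class A
  }

  // Instantiates an inner class reflectively, passing the enclosing
  // instance as the single ($outer) constructor argument.
  def instantiate(cls: Class[_], outer: AnyRef): Any =
    cls.getConstructors()(0).newInstance(outer)

  def main(args: Array[String]): Unit = {
    val x = new X
    val a = instantiate(classOf[X#A], x)
    println(a.isInstanceOf[X#A])
  }
}
```

Passing no argument here would fail with an IllegalArgumentException, because the only public constructor expects the outer instance.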

Be careful: the encoding can change between Scala versions. For example, spark-shell from Spark 3.0.1 (pre-built for Hadoop 3.2) uses Scala 2.12.10, and there you need $lineXXX.$read.INSTANCE.$iw.$iw instead of $lineXXX.$read.INSTANCE.$iw:

$ spark-shell -Xprint:typer
20/11/25 16:32:16 WARN Utils: Your hostname, dmitin-HP-Pavilion-Laptop resolves to a loopback address: 127.0.1.1; using 192.168.0.103 instead (on interface wlo1)
20/11/25 16:32:16 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
20/11/25 16:32:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://192.168.0.103:4040
Spark context available as 'sc' (master = local[*], app id = local-1606314741512).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.1
      /_/
         
Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit GraalVM EE 19.3.0, Java 1.8.0_231)
Type in expressions to have them evaluated.
Type :help for more information.

scala> class A
[[syntax trees at end of                     typer]] // <console>
package $line14 {
  sealed class $read extends AnyRef with java.io.Serializable {
    def <init>(): $line14.$read = {
      $read.super.<init>();
      ()
    };
    sealed class $iw extends AnyRef with java.io.Serializable {
      def <init>(): $read.this.$iw = {
        $iw.super.<init>();
        ()
      };
      sealed class $iw extends AnyRef with java.io.Serializable {
        def <init>(): $iw = {
          $iw.super.<init>();
          ()
        };
        class A extends scala.AnyRef {
          def <init>(): A = {
            A.super.<init>();
            ()
          }
        }
      };
      private[this] val $iw: $iw = new $iw.this.$iw();
      <stable> <accessor> def $iw: $iw = $iw.this.$iw
    };
    private[this] val $iw: $read.this.$iw = new $read.this.$iw();
    <stable> <accessor> def $iw: $read.this.$iw = $read.this.$iw
  };
  object $read extends scala.AnyRef with Serializable {
    def <init>(): $line14.$read.type = {
      $read.super.<init>();
      ()
    };
    private[this] val INSTANCE: $line14.$read = new $read();
    <stable> <accessor> def INSTANCE: $line14.$read = $read.this.INSTANCE;
    <synthetic> private def readResolve(): Object = $line14.$read
  }
}

defined class A

scala> classOf[A].getConstructors()(0).newInstance($line14.$read.INSTANCE.$iw.$iw)

...

res0: Any = A@6621ab0c
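
Since the path to the wrapper instance differs between versions, it may help to ask the constructor which outer type it expects instead of guessing. Here is a minimal sketch for compiled code; ConstructorShape and constructorParamTypes are hypothetical names, not part of any library:

```scala
object ConstructorShape {
  class X {
    class A
  }

  // Fully qualified names of the parameter types of the first public constructor.
  def constructorParamTypes(cls: Class[_]): List[String] =
    cls.getConstructors()(0).getParameterTypes.map(_.getName).toList

  def main(args: Array[String]): Unit = {
    // For an inner class, the single parameter type is the enclosing class.
    println(constructorParamTypes(classOf[X#A]))
  }
}
```

In a REPL session, the same call on classOf[A] should report the wrapper class that has to be supplied, though the exact name depends on the Scala version.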

In Scala 2.12.6, scala -Xprint:typer produces:

$ ./scala -Xprint:typer
Welcome to Scala 2.12.6 (Java HotSpot(TM) 64-Bit GraalVM EE 19.3.0, Java 1.8.0_231).
Type in expressions for evaluation. Or try :help.

scala> class A
[[syntax trees at end of                     typer]] // <console>
package $line3 {
  object $read extends scala.AnyRef {
    def <init>(): $line3.$read.type = {
      $read.super.<init>();
      ()
    };
    object $iw extends scala.AnyRef {
      def <init>(): type = {
        $iw.super.<init>();
        ()
      };
      object $iw extends scala.AnyRef {
        def <init>(): type = {
          $iw.super.<init>();
          ()
        };
        class A extends scala.AnyRef {
          def <init>(): A = {
            A.super.<init>();
            ()
          }
        }
      }
    }
  }
}

defined class A

So now the class A is nested inside an object ($line3.$read.$iw.$iw) rather than a class, and in that case no additional parameter is added to the constructor of A:

object X {
  class A {}
}

object App {
  def main(args: Array[String]): Unit = {
    val x = X

    println(classOf[x.A].getConstructors()(0).getAnnotatedParameterTypes.toList)
    // List()

    println(classOf[x.A].getConstructors()(0).getParameters.toList)
    // List()
  }
}
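
Code that instantiates a class reflectively usually looks up a no-argument constructor, which is presumably why the hidden parameter breaks the InputFormat case in the question. Here is a minimal sketch of a check for that; NoArgCheck and hasNoArgConstructor are hypothetical names used only for illustration:

```scala
import scala.util.Try

object NoArgCheck {
  class Outer {
    class Inner // nested in a class: constructor takes $outer
  }

  object Holder {
    class Nested // nested in an object: no extra parameter
  }

  // True if the class declares a constructor with no parameters.
  def hasNoArgConstructor(cls: Class[_]): Boolean =
    Try(cls.getDeclaredConstructor()).isSuccess

  def main(args: Array[String]): Unit = {
    println(hasNoArgConstructor(classOf[Outer#Inner]))   // false
    println(hasNoArgConstructor(classOf[Holder.Nested])) // true
  }
}
```

Running such a check on a class before handing it to a framework shows quickly whether it was compiled as an inner class.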
Dmytro Mitin