Recently I have been reading the Spark source code. When I reached the class "org.apache.spark.deploy.SparkSubmit", I got confused by the keyword "self" and the operator "=>". Can anyone explain them to me?

override def main(args: Array[String]): Unit = {
val submit = new SparkSubmit() {
  self =>

  override protected def parseArguments(args: Array[String]): SparkSubmitArguments = {
    new SparkSubmitArguments(args) {
      override protected def logInfo(msg: => String): Unit = self.logInfo(msg)

      override protected def logWarning(msg: => String): Unit = self.logWarning(msg)
    }
  }

  override protected def logInfo(msg: => String): Unit = printMessage(msg)

  override protected def logWarning(msg: => String): Unit = printMessage(s"Warning: $msg")

  override def doSubmit(args: Array[String]): Unit = {
    try {
      super.doSubmit(args)
    } catch {
      case e: SparkUserAppException =>
        exitFn(e.exitCode)
      case e: SparkException =>
        printErrorAndExit(e.getMessage())
    }
  }

}

BTW: this question is different from the "duplicate" one. The two look very similar, but I am asking about the "self =>" that appears right after "new SomeClass() {", not about the "some name =>" form in a Scala class definition that the "duplicate" covers. It is not the same question.

  • Stable identifier for the `this` of anon class implementing `SparkSubmit` – cchantep Aug 26 '18 at 09:22
  • @cchantep so if "self" were not there, would there be any difference? – Cam Heo JianQiao Aug 26 '18 at 09:24
  • This is incorrectly marked as a duplicate. The other question is about declaring self types, whereas there is no self type in this question. – Tim Aug 26 '18 at 09:35
  • This is the answer to the question: Using `self =>` allows the identity of this class to be used where `this` is not available. Specifically, inside the `new SparkSubmitArguments` constructor the `this` pointer refers to the new class not the outer class. By declaring `self` as an alias for the outer class it can be used in the inner class. – Tim Aug 26 '18 at 09:39
  • @Tim do you know how to contest the incorrect duplicate marking? – Cam Heo JianQiao Aug 26 '18 at 09:42
  • @CamHeoJianQiao I looked into this and there is no specific mechanism for challenging this, but I have flagged this question as "in need of moderator attention" and explained my reasoning. Since you are the author you can edit the question to point out that this is not a duplicate of the other question. – Tim Aug 26 '18 at 09:49
  • @Tim ok, I will do it later – Cam Heo JianQiao Aug 26 '18 at 09:51
  • @Tim: I have removed the duplicate, but I don't see how this is not a self-type declaration. Can you point me to the section in the Scala Language Specification where this construct is specified? – Jörg W Mittag Aug 26 '18 at 10:55
  • @CamHeoJianQiao: If you want a question reopened, you can vote to reopen it, by clicking on "reopen" (provided you have enough reputation). – Jörg W Mittag Aug 26 '18 at 10:56
  • @Tim: https://scala-lang.org/files/archive/spec/2.13/05-classes-and-objects.html#templates – Jörg W Mittag Aug 26 '18 at 11:05
  • @JörgWMittag It is a self type declaration. But the type is optional, and the question you referenced was about the case where a type *is* supplied whereas this question is about the case where a type *is not* supplied. The specification even describes this as a "different form of self type annotation". Since it is asking about a different form of the annotation, it is a different question and therefore not a duplicate. – Tim Aug 26 '18 at 13:02

2 Answers

The statement

self =>

is called a "self type annotation". It creates an alias named `self` that refers to the instance of the class being constructed. The alias can be used in places where `this` no longer refers to that instance. In particular, it can be used inside a nested class, where `this` refers to the nested class and a reference to the outer class is not automatically available.

In your case, `self` is used here:

new SparkSubmitArguments(args) {
  override protected def logInfo(msg: => String): Unit = self.logInfo(msg)

  override protected def logWarning(msg: => String): Unit = self.logWarning(msg)
}

This makes the new instance of `SparkSubmitArguments` use the `logInfo` and `logWarning` methods of the outer, containing class. You can't use `this` at this point in the code because it would refer to the inner class, not the outer class. (If you did use `this` here, each method would call itself and recurse without end.)
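To see the effect in isolation, here is a minimal sketch of the same pattern. `Submit`, `Args`, and `SelfAliasDemo` are hypothetical stand-ins invented for this example, not Spark classes, and the `seen` buffer exists only to make the call path observable.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for SparkSubmitArguments.
class Args {
  protected def logInfo(msg: => String): Unit = println(s"Args: $msg")
  def parse(): Unit = logInfo("parsing")
}

// Hypothetical stand-in for SparkSubmit.
class Submit {
  protected def logInfo(msg: => String): Unit = println(s"Submit: $msg")
  def parseArguments(): Args = new Args
}

object SelfAliasDemo {
  // Returns the messages that reached the outer logger.
  def run(): List[String] = {
    val seen = ListBuffer.empty[String]
    val submit = new Submit { self =>
      override protected def logInfo(msg: => String): Unit = {
        seen += s"outer: $msg"
        ()
      }

      override def parseArguments(): Args = new Args {
        // Writing this.logInfo(msg) here would call this same method again.
        // self.logInfo(msg) reaches the enclosing anonymous Submit instead.
        override protected def logInfo(msg: => String): Unit = self.logInfo(msg)
      }
    }
    submit.parseArguments().parse()
    seen.toList
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Under these assumptions, `SelfAliasDemo.run()` returns `List("outer: parsing")`: the message logged by the inner object ends up in the outer object's logger.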

Tim

It's an alias for `this`. It is used to disambiguate self-references in inner classes.

When you use `this` in the scope of an inner class, it refers to the inner class's instance. If you need a reference to the outer class, you need an alias:

  class Foo { self => 
    val x = 1
    new AnyRef { 
      val x = 2
      println(this.x) // 2
      println(self.x) // 1
    }
  }
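As a side note, when the outer class has a name, Scala also accepts the qualified form `Foo.this`. An anonymous class, like the one in the question, has no name to qualify with, so there the alias does that job. The sketch below compares the three forms. The `Probe` trait and the method names are made up for this illustration.

```scala
// Hypothetical trait, used only so the inner members are visible from outside.
trait Probe {
  def own: Int
  def viaAlias: Int
  def viaQualifiedThis: Int
}

class Foo { self =>
  val x = 1

  def probe: Probe = new Probe {
    val x = 2
    def own: Int = this.x                  // the inner x
    def viaAlias: Int = self.x             // the outer x, through the alias
    def viaQualifiedThis: Int = Foo.this.x // the outer x, through the class name
  }
}

object QualifiedThisDemo {
  def main(args: Array[String]): Unit = {
    val p = new Foo().probe
    println((p.own, p.viaAlias, p.viaQualifiedThis))
  }
}
```

Here `own` yields 2, while `viaAlias` and `viaQualifiedThis` both yield 1.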
Dima
  • What is the result if the "self =>" is deleted? – Cam Heo JianQiao Aug 26 '18 at 11:33
  • Then lines like this `self.logInfo(msg)` will not compile, because `self` would be undefined. – Dima Aug 26 '18 at 11:38
  • You can easily find out answers to questions like this ("what would happen if ...") yourself BTW. ;) – Dima Aug 26 '18 at 11:44
  • The self type annotation is not useless. Without it you would not be able to call the outer implementation of `logInfo` and `logWarning` from the inner class. If you replace `self` with `this` as you suggest then the `logInfo` and `logWarning` in the inner class would be calling themselves. – Tim Aug 26 '18 at 13:22
  • @Tim, oh, you are right! I didn't notice that there are two inner classes there – Dima Aug 26 '18 at 13:29
  • Simple, clear, concise. If only all Stack Overflow answers could be like this. – Ted Aug 18 '20 at 01:59