4

The Play framework has play.api.libs.Files.TemporaryFile, which holds a reference to a File and deletes it in TemporaryFile#finalize().

case class TemporaryFile(file: File) {

  def clean(): Boolean = {
    file.delete()
  }

  def moveTo(to: File, replace: Boolean = false) {
    Files.moveFile(file, to, replace = replace)
  }

  override def finalize {
    clean()
  }

}

I know that there are some issues with this, for example, you could fill up the entire disk without the JVM feeling the need to GC.


But here I am asking about the "correctness" of a program, i.e. assume a program with no limit on disk space.

def foo() {
    val tempFile = TemporaryFile(new File("/tmp/foo"))

    val inputStream = new FileInputStream(tempFile.file) // last use
    try {
        println(inputStream.read())
    } finally {
        inputStream.close()
    }
}

Could /tmp/foo be deleted before I've read from the file? I don't use tempFile after // last use, so could it be finalized immediately after that?

Or what if it is passed as an argument to a function?

def foo() {
  val tempFile = TemporaryFile(new File("/tmp/foo"))
  bar(tempFile)
}

def bar(tempFile: TemporaryFile) {
  val inputStream = new FileInputStream(tempFile.file) // last use
  try {
      println(inputStream.read())
  } finally {
      inputStream.close()
  }
}

If in the example above, tempFile may be removed before I am done using it, what is the correct use of TemporaryFile so that this does not happen?

Paul Draper
  • Note that there is no relation between reading from the file and your `File` instance other than the name to identify the file system resource. – Sotirios Delimanolis Sep 07 '14 at 21:48
  • @SotiriosDelimanolis, correct. – Paul Draper Sep 07 '14 at 21:57
  • 1
    Good question. And none of the answers address the issues that could happen when processing the `File` reference in an async Future block while the TemporaryFile object has already been collected... I suspect this to produce a bug on my app! – Sebastien Lorber Jan 13 '15 at 15:52

2 Answers

5

Java objects are eligible for garbage collection once you no longer have a strong reference to the object. This does not depend on whether you "use" the object or not.

In this example,

def foo() {
    val tempFile = TemporaryFile(new File("/tmp/foo"))

    val inputStream = new FileInputStream(tempFile.file) // last use
    try {
        println(inputStream.read())
    } finally {
        inputStream.close()
    }
}

tempFile is not eligible for garbage collection, and therefore finalization, until foo() has returned. It's possible that objects that use members from tempFile may use it, and keep it ineligible longer than the last use inside foo().
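
If the exact point at which the object becomes eligible matters (the comments below dispute the statement above), Java 9 and later provide `java.lang.ref.Reference.reachabilityFence`, which keeps its argument strongly reachable up to the call. The following is only a sketch: `FenceSketch` and `TempFileHolder` are hypothetical names standing in for Play's `TemporaryFile`, and the stand-in has no finalizer, so it only shows where the fence would go.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.lang.ref.Reference;
import java.nio.file.Files;

public class FenceSketch {
    // Hypothetical stand-in for Play's TemporaryFile; not the real class.
    static final class TempFileHolder {
        final File file;
        TempFileHolder(File file) { this.file = file; }
        boolean clean() { return file.delete(); }
    }

    // Reads the first byte while keeping `holder` strongly reachable
    // until the stream has been closed.
    static int readFirstByte(TempFileHolder holder) throws IOException {
        try (FileInputStream in = new FileInputStream(holder.file)) {
            return in.read();
        } finally {
            // Runs after the stream is closed (Java 9+ API).
            Reference.reachabilityFence(holder);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("fence", ".tmp");
        Files.write(f.toPath(), new byte[] {42});
        TempFileHolder holder = new TempFileHolder(f);
        System.out.println(readFirstByte(holder));
        holder.clean(); // explicit cleanup of the demo file
    }
}
```

The fence sits in `finally` so that it executes after the try-with-resources block has closed the stream, whether or not the read throws.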

In this example,

def foo() {
  val tempFile = TemporaryFile(new File("/tmp/foo"))
  bar(tempFile)
}

def bar(tempFile: TemporaryFile) {
  val inputStream = new FileInputStream(tempFile.file) // last use
  try {
      println(inputStream.read())
  } finally {
      inputStream.close()
  }
}

The result is the same.

In a minor variant (Java, I don't know Scala syntax well),

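import java.util.ArrayList;
import java.util.List;

class Foo {
    // List is an interface, so instantiate a concrete ArrayList
    List<Object> objects = new ArrayList<>();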
    void foo(Object o) { 
        objects.add(o); 
    }
}

// ...

Foo f = new Foo(); 
f.foo(new Object()); // The object we just created is not eligible for garbage 
                     // collection until the `Foo f` is not used, because
                     // it holds a strong reference to the object. 
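
Holding a strong reference keeps the object alive, but for the temporary-file case it may be simpler not to depend on finalization timing at all. A minimal sketch, assuming a hypothetical `ScopedTempFile` wrapper (this is not part of Play) that deletes the file deterministically via try-with-resources:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical wrapper: owns a path and deletes it when closed.
public class ScopedTempFile implements AutoCloseable {
    private final Path path;

    public ScopedTempFile(Path path) { this.path = path; }

    public Path path() { return path; }

    // Deletes the file when the enclosing try block exits.
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(path);
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("scoped", ".tmp");
        Files.write(p, new byte[] {42});
        // Resources close in reverse order: the stream first, then the file is deleted.
        try (ScopedTempFile tmp = new ScopedTempFile(p);
             InputStream in = Files.newInputStream(tmp.path())) {
            System.out.println(in.read()); // first byte of the file
        }
        System.out.println(Files.exists(p)); // file is gone once the block exits
    }
}
```

Because the wrapper is declared before the stream, the stream is always closed before the file is deleted, regardless of when the garbage collector runs.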
jdphenix
  • @Dici http://stackoverflow.com/questions/9809074/java-difference-between-strong-soft-weak-phantom-reference – Sotirios Delimanolis Sep 07 '14 at 21:54
  • @Dici I mean a reference that is your standard fare, no frills reference. A weak reference is one that does _not_ stop the GC from doing its business. See http://docs.oracle.com/javase/8/docs/api/java/lang/ref/WeakReference.html – jdphenix Sep 07 '14 at 21:54
  • 1
    There are 4 kinds of references in Java: strong, soft, weak and phantom. Further reading: https://www.rallydev.com/community/engineering/java-references-strong-soft-weak-phantom. Basically, a strong reference prevents an object from being GC'ed. – Michal Gruca Sep 07 '14 at 21:59
  • 1
    _tempFile is not eligible for garbage collection, and therefore finalization, until foo() has returned_ is wrong. Since `tempFile` is not used after the access to its `file` field, the object it references immediately becomes eligible after that access. – Sotirios Delimanolis Sep 07 '14 at 22:03
  • @SotiriosDelimanolis I don't understand, `tempFile` is reachable until the method returns. – jdphenix Sep 07 '14 at 22:07
  • No, careful with the definition of _reachable_. [`A reachable object is any object that can be accessed in any potential continuing computation from any live thread.`](http://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.6.1) After `tempFile.file`, the object referenced by `tempFile` is no longer reachable by any computation in the code above. – Sotirios Delimanolis Sep 07 '14 at 22:09
  • The javadoc says that `A newly-created object is strongly reachable by the thread that created it.`. Isn't `tempFile` a newly created object during all the execution of foo ? – Dici Sep 07 '14 at 22:18
  • Which javadoc? The term _newly created_ simply means that it's a new object. It's not a concept that spans over time. – Sotirios Delimanolis Sep 07 '14 at 22:28
  • In your quotation, the tricky word is **potential**. You can tell just by looking that `tempFile` won't be used again after some point, but does the GC know it too? I think it would make sense to say that any local variable will **potentially** be involved in a computation since it is accessible to the programmer. Note that I'm careful with my assertions since I don't know much about it, I'm just discussing my intuitive point of view. – Dici Sep 07 '14 at 22:35
  • @dici Please reply with `@username`. That javadoc entry is simply stating that [Threads are GC roots](http://stackoverflow.com/questions/6366211/what-are-the-roots). – Sotirios Delimanolis Sep 07 '14 at 22:41
  • @Dici Yes, the GC knows it too. The VM has access to the byte code and can tell that the only reference to that object is `tempFile` and `tempFile` is not used in the code after a specific line. If execution has passed that line, then it becomes eligible. – Sotirios Delimanolis Sep 07 '14 at 22:43
  • @SotiriosDelimanolis Interesting - you are of course speaking of the difference between the programmer _changing_ the method (say, appending a use of `tempFile`) versus the runtime behavior of the GC per specifications. The OP's question is about the latter. Thanks for the info. – jdphenix Sep 07 '14 at 22:49
2

A local variable won't be finalized before the execution exits its scope. The GC doesn't finalize the objects that you don't use, it finalizes the objects you cannot access anymore, so in your example with tempFile, it won't happen before the foo() call is finished.

Dici
  • Scope has very little to do with GC. Scope defines where you can use the name of some entity. An object referenced by a local variable can be GC'ed even if the method (or other scope) containing the local variable hasn't ended. – Sotirios Delimanolis Sep 07 '14 at 22:01
  • I meant that a local variable cannot die **before** the execution exits its scope, because in that case it is clearly accessible. However, I do know that an object created locally can survive after the method containing it has ended. Is it clearer, or should I correct my answer? – Dici Sep 07 '14 at 22:05
  • I'm saying the objects referenced by local variables **can** die before execution exits their (local variable) scope. [Scope](http://docs.oracle.com/javase/specs/jls/se8/html/jls-6.html#jls-6.3) only deals with the validity of the use of a name in a program. – Sotirios Delimanolis Sep 07 '14 at 22:07