What is the fastest way to check whether a DataFrame (Scala) is empty? I use `DF.limit(1).rdd.isEmpty`, which is faster than `DF.rdd.isEmpty`, but still not ideal. Is there a better way to do this?
- Does this answer your question? [How to check if spark dataframe is empty?](https://stackoverflow.com/questions/32707620/how-to-check-if-spark-dataframe-is-empty) – user3370741 Aug 03 '21 at 16:49
1 Answer
I usually wrap a call to `first` in a `Try`:

```scala
import scala.util.Try

val t = Try(df.first)
```

From there you can match on `Success` or `Failure` to control the logic:

```scala
import scala.util.{Success, Failure}

t match {
  case Success(row) =>
    // dataframe is non-empty; do stuff with its first row
  case Failure(e) =>
    // dataframe is empty; do other stuff
    // e.getMessage will return the exception message
}
```
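
As a side note beyond the original answer: since Spark 2.4 the Dataset API has a built-in `isEmpty`, and on older versions `df.head(1).isEmpty` avoids the exception-driven control flow entirely. A minimal sketch, assuming a `SparkSession` and a DataFrame `df` already exist:

```scala
// Sketch: emptiness checks that avoid catching an exception.
// Assumes an existing DataFrame `df`.

// Spark 2.4+: built-in emptiness check on Dataset/DataFrame
val isEmptyBuiltin = df.isEmpty

// Older Spark versions: fetch at most one row to the driver
// and test the resulting local array
val isEmptyHead = df.head(1).isEmpty
```

Both variants only ever move a single row to the driver, so they stay cheap even on large DataFrames.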

Ton Torres
- Oh, I'm sorry. I tested `df.first`; if the DataFrame is empty, it throws `java.util.NoSuchElementException: next on empty iterator` – yjxyjx May 03 '16 at 12:00
- Oops, my mistake; I meant `Try` instead of `Option`. I've updated my answer. – Ton Torres May 05 '16 at 00:31
- Looking at the source code of `limit` and `head`, it looks like `head` calls `limit(1)`, so if there is any difference at all, I doubt it would be significant in this instance. Still, `df.head` is cleaner and easier to understand (for me, at least). – Ton Torres May 06 '16 at 00:19