In my code, I have a sequence of DataFrames, and I want to filter out the ones that are empty. I'm doing something like:
Seq(df1, df2).filter(df => df.count() > 0)
However, this is extremely slow: it takes around 7 minutes for just 2 DataFrames of roughly 100k rows each.
My question: why is Spark's count() so slow, and is there a workaround?
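
For reference, here is a minimal sketch of one alternative emptiness check. It assumes the only thing needed is "does at least one row exist", so it asks for a single row with head(1) instead of counting every row. The object name NonEmptyFrames, the helper nonEmpty, and the sample data are hypothetical and only there to make the snippet runnable; on Spark 2.4 and later, !df.isEmpty should be equivalent.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NonEmptyFrames {
  // Keep only the DataFrames that contain at least one row.
  // head(1) fetches at most one row, so Spark does not have to count them all.
  def nonEmpty(dfs: Seq[DataFrame]): Seq[DataFrame] =
    dfs.filter(df => df.head(1).nonEmpty)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the question).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("non-empty-check")
      .getOrCreate()
    import spark.implicits._

    val df1 = Seq(1, 2, 3).toDF("n")            // three rows
    val df2 = spark.emptyDataset[Int].toDF("n") // no rows

    val kept = nonEmpty(Seq(df1, df2))
    println(kept.size) // prints 1

    spark.stop()
  }
}
```

This may not help much if the time is really going into computing the DataFrames themselves: transformations are lazy, so any action, count() or head(1), triggers whatever joins, shuffles, or reads sit upstream. In that case, caching a DataFrame that is reused afterwards (df.cache()) at least avoids computing it a second time.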