1

I've been trying to check whether a result Dataset in Spark is empty or has data. I tried the following:

1.

dataset.rdd().isEmpty();

2.

try {
    // Note: head(1) returns an empty array for an empty Dataset; it does not throw.
    dataset.head(1);
} catch (Exception e) {
    status = "No data";
}

3.

try {
    dataset.first();
} catch (Exception e) {
    status = "No data";
}

4.

dataset.limit(1).count() > 0;

All of these take a long time to complete when the dataset is comparatively large. I need an efficient solution for this.

Alper t. Turker
Garry Steve
  • Yeah, @philantrovert, but that's what I said in the question description: I have tried all of those, and everything takes a lot of time when the dataset is not empty. – Garry Steve Jun 08 '18 at 06:16
  • These are the options you get. If `dataset` has complex wide dependencies, then taking even a single element will be expensive. – Alper t. Turker Jun 08 '18 at 09:35
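
To illustrate the point in the last comment, here is a rough analogy in plain Java. It uses java.util.stream, not Spark, and the class and method names are made up for this sketch. A sort stands in for a wide dependency: like a shuffle, it has to see every input element before it can emit the first output, so asking for a single element still evaluates the whole source.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical analogy: plain Java streams, not the Spark API.
public class WideDependencyAnalogy {

    // Counts how many source elements are evaluated before the first result is available.
    static int pulledBeforeFirst(List<Integer> source, boolean sortFirst) {
        AtomicInteger pulled = new AtomicInteger();
        Stream<Integer> pipeline = source.stream().peek(x -> pulled.incrementAndGet());
        if (sortFirst) {
            // Stateful barrier: must consume all input before emitting anything.
            pipeline = pipeline.sorted();
        }
        pipeline.findFirst(); // only one element is requested
        return pulled.get();
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(5, 3, 8, 1, 9, 2);
        System.out.println(pulledBeforeFirst(source, false)); // prints 1
        System.out.println(pulledBeforeFirst(source, true));  // prints 6
    }
}
```

If the same holds for the lineage behind `dataset`, all four checks in the question pay that upstream cost no matter how the emptiness test itself is written.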

0 Answers