I went through the link "What's the difference between RDD and DataFrame in Spark?"
Is it mandatory to create an RDD before doing any operations, or can we start working with a DataFrame directly? Is there any advantage of an RDD over a DataFrame?
Can we run pandas and NumPy DataFrame functionality on Spark? For example, np.where from NumPy and df.groupby('').agg() from pandas.
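
To make the question concrete, here is a minimal sketch of the kind of pandas/NumPy code I mean. The data, the column names (dept, salary, band) and the threshold are made up for illustration; this is plain pandas, not Spark code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names and values are made up for illustration.
df = pd.DataFrame({
    "dept": ["a", "a", "b", "b"],
    "salary": [100, 200, 300, 400],
})

# np.where: element-wise conditional that labels each row by a threshold.
df["band"] = np.where(df["salary"] > 150, "high", "low")

# groupby + agg: compute several aggregates per group.
summary = df.groupby("dept")["salary"].agg(["sum", "mean"])
print(summary)
```

I would like to know whether operations like these two can be expressed on a Spark DataFrame.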