I'm running into the following error message, which seems to be related to a null value:
An error occurred while calling o4013.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 275.0 failed 4 times, most recent failure: Lost task 0.3 in stage 275.0 (TID 415, w-----pp.net, executor 1): scala.MatchError: null (of class org.json.JSONObject$Null)
What I'm doing is first gathering data from my DB. It's an object, hence the long select:
myData = results.select("music.metadata.artist.*")
then:
print(myData.select("*").show())
Based on that error, I'm assuming there is some null data coming in, so to remove it I tried placing the following line before the show() call:
myData.na.drop()
However that doesn't help and I continue getting the same error.
Other than that, how can I see precisely what data is coming in when I set myData?
Otherwise, am I actually on the right track based on that error message?
Any help/ideas would be appreciated.
Thanks.