
I am having an issue with the PySpark SQL module. I created a partitioned table and saved it as parquet files into a Hive table by running a Spark job after multiple transformations.

The data loads successfully into Hive, and I am able to query it there. But when I try to query the same data from Spark, it says the file path doesn't exist:

java.io.FileNotFoundException: File hdfs://localhost:8020/data/path/of/partition partition=15f244ee8f48a2f98539d9d319d49d9c does not exist

The partition mentioned in the error above belongs to the old partition-column data, which no longer exists.

I have re-run the Spark job, which populates a new partition value. I searched for solutions, but all I can find are reports that this was not an issue in Spark 1.4 and is an issue in 1.6.

Can someone please suggest a solution for this problem?
