I have a Hive table partitioned on date and hour columns. When I load the data, it creates 24 files. I want to merge these 24 files into one file. Can anyone suggest a solution?
Viewed 1,064 times
Please see this answer: https://stackoverflow.com/a/45266244/2700344 – leftjoin Dec 22 '17 at 11:50
1 Answer
2
Well, if you want a single file when inserting data into your partitioned table, you can define your Hive table as follows:
- partitioned on date
- bucketed on any one column, with only 1 bucket.
Since the bucket count is 1, all the data in each partition will end up in one file after the insert.
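A minimal sketch of such a table definition, wrapped in a shell function that calls `hive -e`. The table name `events_single_file` and the columns `id`, `payload`, and `dt` are hypothetical; `dt` is used because `date` is a reserved word in Hive. On Hive versions before 2.0 you would probably also need `SET hive.enforce.bucketing=true;` before the insert.

```shell
# Sketch only: table and column names are hypothetical placeholders.
create_single_bucket_table() {
  # One partition column plus a single bucket, so each partition
  # is written as one bucket file on insert.
  hive -e "
    CREATE TABLE IF NOT EXISTS events_single_file (
      id BIGINT,
      payload STRING
    )
    PARTITIONED BY (dt STRING)
    CLUSTERED BY (id) INTO 1 BUCKETS
    STORED AS TEXTFILE;
  "
}
```

Calling `create_single_bucket_table` once creates the table; after that, loads go through a regular `INSERT ... PARTITION (dt=...)` statement.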
Another way is to merge the files with HDFS commands, as below. Note that the redirect writes the merged output to a file on the local filesystem, not to HDFS:
hadoop fs -cat hive_table_data_folder/p* > new_file_name
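If the merged file should end up back in HDFS, something like the following sketch may work. It uses `hadoop fs -getmerge` to concatenate the files of one directory into a local file, then `hadoop fs -put` to upload the result. The function name `merge_partition` and all paths are hypothetical. Plain concatenation is only safe for formats such as delimited text; it would not produce a valid ORC or Parquet file.

```shell
# Sketch only: function name and paths are hypothetical placeholders.
merge_partition() {
  src_dir="$1"     # HDFS directory holding the small files
  local_file="$2"  # temporary merged file on the local filesystem
  dest_path="$3"   # HDFS destination for the merged file

  # -getmerge concatenates the files under src_dir into one local file.
  hadoop fs -getmerge "$src_dir" "$local_file" || return 1
  # -put copies the merged local file back into HDFS.
  hadoop fs -put "$local_file" "$dest_path" || return 1
}
```

For example, `merge_partition hive_table_data_folder/some_partition /tmp/merged.txt some_target_folder/merged.txt`, where all three arguments are placeholders for your own paths.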

Tanner Babcock

tata tejasvi