I'm new to Hive and would like to know which table properties can improve the performance of INSERT OVERWRITE on a Hive managed table. Can someone help with that?
Some suggestions:
Switch off automatic statistics gathering:
set hive.stats.autogather=false;
Remove the partition folders or the table folder in advance if possible, or use the PURGE option: https://stackoverflow.com/a/39623927/2700344
If you are using S3 and the table is stored as ORC, disable block padding:
ALTER TABLE your_table SET TBLPROPERTIES ("orc.block.padding"="false", "orc.block.padding.tolerance"="1.0");
Use vectorization (see ConfigurationProperties-Vectorization in the Hive documentation) and Tez:
set hive.execution.engine=tez;
Optimize the query itself.
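As a rough sketch, the suggestions above could be combined in one session like this. The names `your_table` (from the example above), `source_table`, `col1`, `col2` and the partition column `dt` are hypothetical placeholders, and the two `hive.vectorized.*` settings are an assumption about which vectorization properties you want to enable; check them against your Hive version, and note that vectorized execution mainly benefits columnar formats such as ORC.

```sql
-- Session settings taken from the suggestions above
SET hive.stats.autogather=false;
SET hive.execution.engine=tez;

-- Assumption: vectorization is switched on with these two properties
SET hive.vectorized.execution.enabled=true;
SET hive.vectorized.execution.reduce.enabled=true;

-- Hypothetical partition spec: drop the old partition up front;
-- PURGE deletes the data directly instead of moving it to the trash
ALTER TABLE your_table DROP IF EXISTS PARTITION (dt='2020-01-01') PURGE;

-- The insert itself; source_table, col1 and col2 are placeholders
INSERT OVERWRITE TABLE your_table PARTITION (dt='2020-01-01')
SELECT col1, col2
FROM source_table
WHERE dt='2020-01-01';

-- With auto-gathering off, statistics can be computed later if needed
ANALYZE TABLE your_table PARTITION (dt='2020-01-01') COMPUTE STATISTICS;
```

Whether dropping the partition first actually helps depends on where the table is stored, so it is worth timing the insert with and without that step.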

leftjoin