I'm new to Hive and would like to know which table properties can improve the performance of INSERT OVERWRITE on a Hive managed table. Can someone help with that?

leftjoin
Shadab Hussain

1 Answer

Some suggestions:

  1. Switch off automatic statistics gathering:

    set hive.stats.autogather=false;

  2. Remove the partition folders or the table folder in advance if possible, or use the PURGE option so that old files are deleted instead of being moved to the trash: https://stackoverflow.com/a/39623927/2700344

  3. If you are using S3 and the table is ORC, disable block padding:

    ALTER TABLE your_table SET TBLPROPERTIES ("orc.block.padding"="false", "orc.block.padding.tolerance"="1.0");

  4. Use vectorization (see ConfigurationProperties-Vectorization in the Hive documentation) and Tez:

    set hive.execution.engine=tez;

  5. Optimize the query itself.
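
Put together, the suggestions above might look roughly like the sketch below. Some of it is assumption rather than part of the original answer: the two `hive.vectorized.execution.*` settings are the standard vectorization switches, `auto.purge` is one way to get the PURGE behaviour for INSERT OVERWRITE (available in newer Hive versions only), and `source_table`, `col1`, `col2` and `load_date` are hypothetical names used purely for illustration.

```sql
-- Session-level settings (suggestions 1 and 4)
SET hive.stats.autogather=false;                    -- skip statistics collection during the insert
SET hive.execution.engine=tez;                      -- run on Tez
SET hive.vectorized.execution.enabled=true;         -- assumption: standard vectorization switch
SET hive.vectorized.execution.reduce.enabled=true;  -- assumption: vectorize the reduce side as well

-- Table-level properties (suggestions 2 and 3)
-- Assumption: auto.purge makes INSERT OVERWRITE delete old data
-- directly instead of moving it to the trash.
ALTER TABLE your_table SET TBLPROPERTIES ("auto.purge"="true");
ALTER TABLE your_table SET TBLPROPERTIES ("orc.block.padding"="false", "orc.block.padding.tolerance"="1.0");

-- Hypothetical overwrite of a single partition
INSERT OVERWRITE TABLE your_table PARTITION (load_date='2020-01-01')
SELECT col1, col2
FROM source_table
WHERE load_date='2020-01-01';
```

With statistics auto-gathering switched off, the table statistics are not refreshed by the insert. If later queries rely on them, they can be computed separately with ANALYZE TABLE ... COMPUTE STATISTICS.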

leftjoin