I am trying to improve Hive query speed using the techniques below. These config changes increase speed, and I want to use them for all the queries I execute. But I would like some input on whether these settings could have an adverse effect if applied across all queries.
set hive.vectorized.execution.enabled = true;
set hive.vectorized.execution.reduce.enabled = true;
Vectorized query execution improves the performance of operations like scans, aggregations, filters and joins by processing them in batches of 1024 rows instead of one row at a time. Introduced in Hive 0.13, this feature can significantly improve query execution time.
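One way to check whether a given query actually runs vectorized is to look at its plan. This is only a minimal sketch: it assumes the tweets table used further down is stored as ORC (the format vectorization originally required), and the lang column is hypothetical.

```sql
set hive.vectorized.execution.enabled = true;
set hive.vectorized.execution.reduce.enabled = true;

-- Stages that were vectorized should be marked
-- "Execution mode: vectorized" in the plan output.
explain
select lang, count(*)   -- lang is a hypothetical column
from tweets
group by lang;
```

If the marker is missing, the query most likely fell back to row-at-a-time execution, so the setting is not helping that query.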
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
analyze table tweets compute statistics for columns;
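Enable cost-based optimization (CBO). The settings above let the optimizer use table, partition and column statistics, which the analyze statement collects.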
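Since the CBO is only as good as the statistics it reads, it may help to collect both basic and column statistics and then check what was recorded. A rough sketch, assuming tweets is an unpartitioned table:

```sql
-- Basic table-level statistics (row count, number of files, size)
analyze table tweets compute statistics;

-- Column-level statistics consumed by the cost-based optimizer
analyze table tweets compute statistics for columns;

-- Inspect what the metastore recorded (e.g. numRows, rawDataSize)
describe formatted tweets;
```

Because hive.compute.query.using.stats lets Hive answer queries such as count(*) from these stored statistics, it is probably worth re-running the analyze statements after large data loads.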
set hive.execution.engine=tez;
Use the Tez execution engine instead of MapReduce.
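Values changed with set apply only to the current session, so a query that misbehaves under these settings can opt out without touching the defaults. A minimal sketch of such a comparison, where the lang column and the filter value are hypothetical:

```sql
-- Fall back for one problem query in this session only
set hive.execution.engine=mr;
set hive.vectorized.execution.enabled = false;

select count(*)
from tweets
where lang = 'en';   -- hypothetical column and value

-- Switch back for the rest of the session
set hive.execution.engine=tez;
set hive.vectorized.execution.enabled = true;
```

Running the same query both ways and comparing runtime and results is one simple way to see whether a setting hurts a particular workload.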