Continuing from the question: What is the difference between partitioning and bucketing a table in Hive?
Suppose we have partitioned the employee table by the salary column. If we run a SELECT query on this table with a WHERE clause condition on the salary column, the query runs fast, because Hive reads only the partition(s) matching the condition instead of scanning the whole table.
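To make the partitioned case concrete, here is a minimal sketch of what I have in mind. The table name `employee_partitioned`, the columns `id` and `name`, the `INT` type for salary, and the value 50000 are all just illustrative assumptions, not my real schema.

```sql
-- Hypothetical table: salary is the partition column, so it is declared
-- in PARTITIONED BY rather than in the regular column list.
CREATE TABLE employee_partitioned (
  id   INT,
  name STRING
)
PARTITIONED BY (salary INT);

-- The filter is on the partition column, so only the salary=50000
-- partition directory needs to be read.
SELECT id, name, salary
FROM employee_partitioned
WHERE salary = 50000;
```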
However, suppose that instead of partitioning, we bucket the same table on the salary column with a fixed number of buckets. If we then run the same query, how would it benefit from the buckets? Can anyone please explain?
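For reference, the bucketed version I am asking about would look roughly like this. Again, the table name `employee_bucketed`, the other columns, and the bucket count of 8 are only assumptions for the sake of the example.

```sql
-- Hypothetical table: salary is a regular column here, and rows are
-- distributed into a fixed number of buckets based on its value.
CREATE TABLE employee_bucketed (
  id     INT,
  name   STRING,
  salary INT
)
CLUSTERED BY (salary) INTO 8 BUCKETS;

-- Same query as in the partitioned case; the question is what,
-- if anything, bucketing saves here.
SELECT id, name, salary
FROM employee_bucketed
WHERE salary = 50000;
```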