After reading about query optimization techniques I came to know about the below techniques.
1. Indexing - bitmap and BTree
2. Partitioning
3. Bucketing
I got the difference between partitioning and bucketing, and when to use them but I'm still confused how indexes actually work. Where is the metadata for index is stored? Is it the namenode which is storing it? I.e., actually while creating partitions or buckets we can see multiple directories in hdfs which explains the query performance optimization but how to visualize indexes? Are they really used in real life despite partitioning and bucketing being in the picture?
Please help me for the above queries and is there's any dedicated page for hadoop and hive developers community?