How indexing works in a database, referring to the answer from Xenph Yan:
Creating an index on a field in a table creates another data structure which holds the field value, and a pointer to the record it relates to. This index structure is then sorted, allowing Binary Searches to be performed on it.
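To make sure I read that description correctly, here is a minimal sketch of such an index in Python. The table contents and the names `records`, `build_index` and `lookup` are made up for illustration, and a real database would typically use a B-tree rather than a sorted list, so treat this only as a model of the idea:

```python
import bisect

# Hypothetical table: each record is (row_id, name).
# The position in this list plays the role of the "pointer" to the record.
records = [
    (0, "mallory"),
    (1, "alice"),
    (2, "trent"),
    (3, "bob"),
    (4, "alice"),
]

def build_index(rows, column):
    """Return a sorted list of (field value, record position) pairs."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))

def lookup(index, value):
    """Binary-search the index and return the positions of all matching records."""
    # (value,) sorts before every (value, pos) pair, so this finds the first match.
    i = bisect.bisect_left(index, (value,))
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(index[i][1])
        i += 1
    return matches

name_index = build_index(records, 1)
print(lookup(name_index, "alice"))
```

The table itself stays in its original order; only the separate index structure is sorted.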
The way I understand ORC indexing is that ORC keeps statistics (min, max, sum) for every group of 10,000 rows (by default), and when I query the data it looks at those statistics to decide whether it needs to read that chunk of rows or not.
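A simplified model of how I picture this, in plain Python. This is not the ORC reader or its API; `build_stats`, `scan_equals` and the tiny stride of 4 are assumptions made only to keep the example small:

```python
ROW_INDEX_STRIDE = 4  # ORC's default is 10,000 rows; kept tiny here for illustration

def build_stats(values, stride=ROW_INDEX_STRIDE):
    """Record (start, end, min, max) for each consecutive chunk of rows."""
    stats = []
    for start in range(0, len(values), stride):
        chunk = values[start:start + stride]
        stats.append((start, start + len(chunk), min(chunk), max(chunk)))
    return stats

def scan_equals(values, stats, target):
    """Return matching row positions and the number of chunks actually read."""
    hits, chunks_read = [], 0
    for start, end, lo, hi in stats:
        if target < lo or target > hi:
            continue  # the statistics prove this chunk cannot contain the target
        chunks_read += 1
        hits.extend(i for i in range(start, end) if values[i] == target)
    return hits, chunks_read

unsorted_col = [7, 1, 9, 3, 8, 2, 6, 4, 5, 0, 9, 1]
sorted_col = sorted(unsorted_col)

print(scan_equals(unsorted_col, build_stats(unsorted_col), 3))
print(scan_equals(sorted_col, build_stats(sorted_col), 3))
```

In this model the statistics never reorder the rows. They only allow a chunk to be skipped when its min/max range excludes the value, which happens far more often when the column is already sorted (one chunk read instead of three in the example).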
So is it correct that ORC indexing does not sort the data?
I have a large table with 69 columns of very unstructured data, and I would like to be able to run ad-hoc queries on any column. To do so, I would like to be able to sort every column (or at least most of them) through an index. There is no single 'key' column in the data that gets queried most often.