
When should I use bucketing: when the column has low cardinality or high cardinality?

Techopedia explains Cardinality

High cardinality columns are those with very unique or uncommon data values. For example, in a database table that stores bank account numbers, the “Account Number” column should have very high cardinality.

Low cardinality columns are those with very few unique values. In a customer table, a low cardinality column would be the “Gender” column. This column will likely only have “M” and “F” as the range of values to choose from, and all the thousands or millions of records in the table can only pick one of these two values for this column.
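
To make the two definitions concrete, here is a minimal Python sketch that counts the distinct values in each column. The sample rows and the `cardinality` helper are made up for illustration and are not part of the Techopedia text.

```python
# Hypothetical sample rows: (account_number, gender)
rows = [
    ("ACC-1001", "M"),
    ("ACC-1002", "F"),
    ("ACC-1003", "F"),
    ("ACC-1004", "M"),
    ("ACC-1005", "F"),
]


def cardinality(rows, column_index):
    """Return the number of distinct values in the given column."""
    return len({row[column_index] for row in rows})


# Every account number is different, so this column has high cardinality.
account_cardinality = cardinality(rows, 0)

# Only "M" and "F" occur, so this column has low cardinality.
gender_cardinality = cardinality(rows, 1)

print(account_cardinality, gender_cardinality)
```

On a real table the same comparison would be the distinct count of a column against the total row count.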

Cœur
Jack
  • did you see this https://stackoverflow.com/questions/19128940/what-is-the-difference-between-partitioning-and-bucketing-a-table-in-hive? – hlagos Jun 15 '17 at 03:44
  • Possible duplicate of [Hive - Bucketing and Partitioning](https://stackoverflow.com/questions/34096470/hive-bucketing-and-partitioning) – Cœur Apr 15 '18 at 10:37

0 Answers