10

I have a metric table that I expect to be very large. It has a polymorphic association so that it can belongs_to other models that want to record some metric. I typically index association columns like this to speed up association loading. I've heard people talking about joint-indexing this association. This looks like:

add_index :comments, [:commentable_type, :commentable_id]

But I've also heard counsel against creating indexes of low-cardinality, because the payoff of the index doesn't offset the overhead of maintaining it. Since the _type half of my polymorphic association will probably only have 4-5 values across the millions of rows, I'm inclined to only index on the _id portion of the polymorphic association. I will probably create some additional joint-indexes using the _id column and some other unmentioned integer and datetime columns, but I won't include the _type in these indexes either.

Is this what you would do/recommend?

jmaxyz
  • 928
  • 1
  • 8
  • 19

2 Answers2

5

Ultimately, this is worth benchmarking before and after adding the index, on a realistic data set - realistic in size and data.

However, you're not creating an index on a field with just a few values. The index is on the combination of the two fields, which is likely to have a lot of different value combinations. The index on the combined fields is a smart idea.

Jaime Bellmyer
  • 23,051
  • 7
  • 53
  • 50
3

It is best practice to put the most selective field first when creating an index on multiple fields. Since you only have 4-5 values of commentable_type, you would be better off doing:

add_index :comments, [:commentable_id, :commentable_type]
Felix
  • 674
  • 5
  • 16
  • 1
    can you clarify what you mean by "most selective"? – GMA Feb 22 '15 at 16:02
  • @GMA it's very probable that you'll have tons of each type records, but number of types will be pretty limited, meaning there are only handful of rows with same commentable_id value, while thousands and thousands of rows with same commentable_type, which means that commentable_id is more selective – Dr.Strangelove Mar 09 '18 at 12:59