2

Problem statement

I've a big table users composed of 2 different kinds of users, A and B. At the moment, they share some fields, while other fields are A-only or B-only.

Now, normally with a proper normalization on this table, we can end up having 1 shared table with 2 related tables with the A-fields and B-fields. However, my goal is to reduce the space but also keeping the same performances if possible.

My solution

My solution would be using a JSON type to store those fields that are not shared between the 2 different user types but at the same time creating generated and indexed virtual columns for those fields that need to be indexed. It should save space because MySQL won't allocate space for not used fields and at the same time it keeps more or less the same performances thanks to the virtual columns + indexes. Moreover, these fields won't be used as foreign keys so the virtual column is not a limitation for me.

Question

Looking around it seems that people don't suggest using the JSON type to avoid wasting space with not used fields. Instead, using normal fields with separate tables is still strongly suggested. Do you think I'm approaching the wrong way using JSON instead of the actual fields? if yes, why?

Mike Sherrill 'Cat Recall'
  • 91,602
  • 17
  • 122
  • 185
Joseph
  • 1,029
  • 13
  • 26
  • There are too many variations on what can be going on. If you want to discuss further, please add `SHOW CREATE TABLE` and indicate which columns are of which types. Also, are there a lot more As than Bs (or vice versa)? Datatypes are important; actual column names would be useful, but we might still be able to help even without understanding the app space. – Rick James Sep 29 '20 at 17:54
  • You talk about "saving space". How big? Big columns? A billion rows? Tiny RAM? That is, is the space problem "real"? – Rick James Sep 29 '20 at 17:56
  • You have only A and B. JSON shines when there are dozens or hundreds of possible columns (think 'product' attributes) and most are usually missing (a 'dress' does not have a 'voltage'). – Rick James Sep 29 '20 at 17:58
  • Unfortunately, I cannot share information about the db scheme. However, at the moment the A fields are dozens and the B fields too, those fields will increase overtime since more and more information about these 2 kind of users will be added. They are both TEXT and INT and the users inside this table are bilions. So let's consider the worst case where these set of fields are really consuming lot of space – Joseph Sep 29 '20 at 20:01
  • Instead of adding columns, consider using two--A and B--JSON columns. – Rick James Sep 29 '20 at 20:42

0 Answers0