0

I'm currently designing a relational database table in MySQL for handling multiple categories, representing them later in a tree structure on the client side and filtering on them. Here is a picture of how the structure looks like: enter image description here

So we have a root element which is set by default. We can after that add children to it (Level one). So far a table structure in the simplest case could be defined so:

| id |     name     | parent_id |
--------------------------------
  1   All Categories    NULL
  2      History         1

However, I have a requirement that I need to include another tree structure type (Products) in the table (a corresponding API is available). The records from the other table have their own id types (UUID). Basically I need to ingest them in my table. A possible structure will look like so:

| id |   UUID     |        name       |        parent_id   |
----------------------------------------------------------
  1      NULL        All Categories                NULL
  2      NULL          History                      1
  3      NULL          Products                     1
  4   CN1001231232  Catalog electricity             3
  5   CN1001231242  Catalog basic components        4
  6      NULL          Shipping                     1 

I am new to relational databases, but all of these possible NULL values for the UUID indicate (at least for me) to be bad design of database table. Is there a way of avoiding this, or even better way for this "ingestion"?

CodiClone
  • 141
  • 1
  • 2
  • 8

3 Answers3

1

If you had a table for users, with columns first_name, middle_name, last_name but then a user signed up and said they have no middle name, you could just store NULL for that user's middle_name column. What's bad design about that?

NULL is used when an attribute is unknown or inapplicable on a given row. It seems appropriate for the case you describe, i.e. when records that did not come from the external source have no UUID, and need no UUID.

That said, some computer science theorists insist that NULL is never appropriate. There's a decades-old controversy about whether SQL should even have a NULL.

The alternative would be to create a second table, in which you store only the UUID and the reference to the entity in your first table. Then just don't store rows for the other id's.

| id |   UUID     |
-------------------
  4   CN1001231232
  5   CN1001231242

And don't store the UUID column in your first table. This eliminates the NULLs, but it means you need to do a JOIN of the two tables whenever you want to query the entities with their UUID's.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • I understand. But whenever there is a change with the categories in the other table (let's say a category is removed), I need to reset my table and ingest again? – CodiClone Feb 17 '21 at 13:58
  • 1
    I'd just use a nullable attribute. I'm not in the camp that eschews NULLs. – Bill Karwin Feb 17 '21 at 14:14
  • That was why asked this question. In the university they explain that we must always find a way to reduce the NULL values in our table and this led me to spent some think on whether my suggested way is optimal or not ... – CodiClone Feb 17 '21 at 14:47
  • 1
    It would be appropriate to ask them why they think one must always find a way to reduce NULLs, and what they propose to do instead. It's true that NULL makes certain logical expressions more complex. But compare that to the use of zero in arithmetic. You can't divide by zero — so don't do that. We don't say, "always reduce any occurrence of zeroes." – Bill Karwin Feb 17 '21 at 16:09
1

First make sure you actually have to combine these in the same table. Are the products categories? If they are categories and are used like categories then it makes sense to have them in the same table, but if they have categories then they should be kept separate and given a category/parent id.

If you're sure it's appropriate to store them in the same table then the way you have it is good with one adjustment. For the UUID you can use a separate naming scheme that makes it interchangeable with id for those entries and avoids collisions with the other uuids. For example:

| id |   UUID     |        name       |        parent_id   |
----------------------------------------------------------
  1   CAT000000001   All Categories                NULL
  2   CAT000000002     History                      1
  3   CAT000000003     Products                     1
  4   CN1001231232  Catalog electricity             3
  5   CN1001231242  Catalog basic components        4
  6   CAT000000006     Shipping                     1 
  • The products are actually categories. Basically I need to mix the already defined categories with the other which can be defined in my client side (products part of the table won't be changed in my side). – CodiClone Feb 17 '21 at 14:00
1

Your requirements combine the two things relational database are not great with out of the box: modelling hierarchies, and inheritance (in the object-oriented sense).

Your design users the "single table inheritance" model (one of 3 competing options). It's the simplest option in terms of design.

In practical terms, you may want to add a column to explicitly state which type of record you're dealing with ("regular category" and "product category") so your queries are more obvious to others.

Neville Kuyt
  • 29,247
  • 1
  • 37
  • 52
  • You're right, but me was given the requirement that all needs to be done with relation databases. Thanks for the feedback, this column will be very useful. – CodiClone Feb 17 '21 at 14:04