1

I am trying to normalise my MySQL 5.7 data shema and strugle with replacing the SQL queries:

At the moment there is one table containing all attributes of each article:

article_id | title | ref_id | dial_c_id

The task is to retrieve all articles which match two given attributes (ref_id and dial_c_id) and also retrieve all their other attributes.

With just one table, this is straightforward:

SELECT *
    FROM test.articles_test
WHERE
    ref_id = '127712' 
    AND dial_c_id = 51 

Now in my effort to normalise, I have created a second table, which stores the attributes of each article and removed the ones in table articles:

table 1:

article_id | title 

table 2:

article_id | attr_group | attribute
1            ref_id       51
1            dial_c_id    33
1            another      5
2 ..

I would like to retrieve all article details including ALL attributes which match ref_id and dial_c_id with this two table shema.

Somehow like this:

SELECT 
     a.article_id,
     a.title, 
     attr.*
FROM test.articles_test a
INNER JOIN attributes attr ON a.article_id = attr.article_id
     AND ref_id = '127712' 
     AND dial_c_id = 51 

How can this be done?

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
merlin
  • 2,717
  • 3
  • 29
  • 59
  • What do you mean exactly? Given a ref_id and a dial_c_id, you want to retrieve all the rows of table_2 matching the relative article_id? – Ass3mbler Jan 01 '19 at 17:16
  • I want all article ids, titles, attributes that match those two attributes. One article might have 20 different attributes and I need all those. Without normalization it is just select * – merlin Jan 01 '19 at 17:19
  • So... Is your query running ok and producing the correct results ? – GMB Jan 01 '19 at 17:20

1 Answers1

3

You have used an Entity-Attribute-Value table to record your attributes.

This is the opposite of normalization.

Name the rule of normalization that guided you to put different attributes into the same column. You can't, because this is not a normalization practice.

To accomplish your query with your current EAV design, you need to pivot the result so you get something as if you had your original table.

SELECT * FROM (
    SELECT 
         a.article_id,
         a.title, 
         MAX(CASE attr_group WHEN 'ref_id' THEN attribute END) AS ref_id,
         MAX(CASE attr_group WHEN 'dial_c_id' THEN attribute END) AS dial_c_id
         -- ...others...
    FROM test.articles_test a
    INNER JOIN attributes attr ON a.article_id = attr.article_id
    GROUP BY a.article_id, a.title) AS pivot
WHERE pivot.ref_id = '127712' 
  AND pivot.dial_c_id = 51 

While the above query can produce the result you want, the performance will be terrible. It has to create a temp table for the subquery, containing all data from both tables, then apply the WHERE clause against the temp table.

You're really better off with each attribute in its own column in your original table.


I understand that you are trying to allow for many attributes in the future. This is a common problem.

See my answer to How to design a product table for many kinds of product where each product has many parameters

But you shouldn't call it "normalised," because it isn't. It's not even denormalised. It's derelational.

You can't just use words to describe anything you want — especially not the opposite of what the word means. I can't let the air out of my bicycle tire and say "I'm inflating it."

You commented that you're trying to make your database "scalable." You also misunderstand what the word "scalable" means. By using EAV, you're creating a structure where the queries needed are difficult to write and inefficient to execute, and the data takes 10x space. It's the opposite of scalable.

What you mean is that you're trying to create a system that is extensible. This is complex to implement in SQL, but I describe several solutions in the other Stack Overflow answer to which I linked. You might also like my presentation Extensible Data Modeling with MySQL.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • My reasoning doing this is to create a system that scales. There might be 100 more different attribute groups in the future and with one table I would need to create a new column each time I introduce a new one. – merlin Jan 01 '19 at 17:31
  • 1
    Thank you your extensive updated answer. I think you nailed it by calling it "extensible". Regarding my example, I must say that this two are only the simplified version, as the shema contains now 4 tables. One for naming the groups, one for naming the attributes. So the t in my example contains ids in real. Thank you for the links, I will have a closer look and try to find a better way of designing my app. – merlin Jan 01 '19 at 19:16