MySQL: can denormalization improve the performance of my query?

Asked Mar 07 '17 at 09:23

Active Mar 07 '17 at 10:28

Viewed 218 times

I have a legacy DB which uses a table similar to this structure:

orders
    id, buyer_id, product

The table saves the product as text instead of referencing the products table (which exists). The person who set up the table explained his reasoning:

The products are quite unique, i.e. there are only 2 or 3 buyers for each product. The products table is therefore almost as big as the orders table. He wants to avoid a join to speed up queries on the orders table.

Does this actually improve performance enough to make denormalized data worthwhile for the following use case:

buyers table with 10,000 entries
products table with 40,000,000 entries
orders (buyer_product) table with 40,000,000 entries

1. Normalized:
SELECT * FROM buyer_product JOIN products ON products.id = buyer_product.product_id LIMIT 1000;

2. Denormalized (product saved as text instead of product_id):
SELECT * FROM buyer_product LIMIT 1000;
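
For reference, the two layouts being compared could look roughly like the sketch below. The column types, the index on product_id, the foreign key and the table name buyer_product_denorm are assumptions made for illustration; they are not taken from the actual legacy database.

```sql
-- Hypothetical schema sketch; types, indexes and the *_denorm name are assumptions.

-- Normalized variant: each order row references products by id.
CREATE TABLE products (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE buyer_product (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    buyer_id   INT UNSIGNED NOT NULL,
    product_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    KEY idx_product (product_id),
    CONSTRAINT fk_bp_product FOREIGN KEY (product_id) REFERENCES products (id)
) ENGINE=InnoDB;

-- Denormalized variant: the product text is stored directly on each order row.
CREATE TABLE buyer_product_denorm (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
    buyer_id INT UNSIGNED NOT NULL,
    product  VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;
```

In the normalized variant the join resolves product_id through the primary key of products, so each joined row costs one primary-key lookup.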

EDIT: For some weird reason, query 1 (the normalized one, with the join) even seems to be faster:

SELECT SQL_NO_CACHE bp.buyer_id, products.name
FROM buyer_product bp
JOIN products ON bp.product_id = products.id
ORDER BY bp.id
LIMIT 1000000;

query time 0.250 sec

SELECT SQL_NO_CACHE bp.buyer_id, bp.product 
FROM buyer_product bp 
ORDER BY bp.id 
LIMIT 1000000;

query time 0.268 sec
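
The two timings differ by only 0.018 sec on a single run, which may well be within measurement noise. One way to dig further would be to compare the execution plans with EXPLAIN, which reports the chosen plan without executing the query. A sketch, using only the table and column names already shown above:

```sql
-- Sketch only: compare the plans MySQL picks for both variants.
-- EXPLAIN reports the plan; it does not run the query or return its rows.
EXPLAIN
SELECT bp.buyer_id, products.name
FROM buyer_product bp
JOIN products ON bp.product_id = products.id
ORDER BY bp.id
LIMIT 1000000;

EXPLAIN
SELECT bp.buyer_id, bp.product
FROM buyer_product bp
ORDER BY bp.id
LIMIT 1000000;
```

In the output, the type, key and rows columns indicate how each table is accessed and roughly how many rows MySQL expects to examine.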

edited Mar 07 '17 at 10:28

asked Mar 07 '17 at 09:23

Chris


This might help you: http://stackoverflow.com/questions/1102590/what-exactly-does-database-normalization-do – Kinchit Dalwani Mar 07 '17 at 09:32
Of course #2 will be faster, it's just way harder to maintain integrity. Of course you could have a Foreign Key to the product name instead of the id. – dnoeth Mar 07 '17 at 09:40
Why don't you test it and see for yourself? – Shadow Mar 07 '17 at 09:41
Both queries should also use an ORDER BY id ASC to make sure the results stay the same – Raymond Nijland Mar 07 '17 at 09:46
@RaymondNijland done – Chris Mar 07 '17 at 10:30
@Shadow added results to question – Chris Mar 07 '17 at 10:30
@dnoeth for some weird reason #1 is slightly faster – Chris Mar 07 '17 at 10:31
Could be less I/O is required when normalized. Anyway, you have your answer. – John Wu Mar 07 '17 at 10:49
Your select lists are different in the queries. – Shadow Mar 07 '17 at 11:16
@Shadow that is on purpose. #1 selects from the joined table. #2 uses the additional (redundant) column product (varchar) instead of the reference – Chris Mar 07 '17 at 11:59
