2

So I am creating sections for a store. The store can have multiple scopes, if there isn't a section_identifier set for a given store_id it should fallback to the global store which is 0.

The SQL command that I want should return a list of section_options for any related given store.

Example of my table:

SELECT * FROM my_table:

+----+--------------------+----------------------+----------+
| id | section_identifier | option_identifier    | store_id |
+----+--------------------+----------------------+----------+
| 17 | header             | header_option_one    |        1 |
| 18 | footer             | footer_option_one    |        0 |
| 19 | homepage_feature   | homepage_feature_one |        0 |
| 23 | header             | header_option_three  |        0 |
| 25 | homepage_feature   | homepage_feature_one |        1 |
+----+--------------------+----------------------+----------+

So section_identifier is unique, the IDs I need back for store 1 would be 17, 18 and 25.

When I run:

SELECT * FROM my_table GROUP BY section_identifier it returns:

+----+--------------------+----------------------+----------+
| id | section_identifier | option_identifier    | store_id |
+----+--------------------+----------------------+----------+
| 18 | footer             | footer_option_one    |        0 |
| 23 | header             | header_option_three  |        0 |
| 19 | homepage_feature   | homepage_feature_one |        0 |
+----+--------------------+----------------------+----------+

This means if I run SELECT * FROM my_table GROUP BY section_identifier DESC:

I get the response (this is my desired output):

+----+--------------------+----------------------+----------+
| id | section_identifier | option_identifier    | store_id |
+----+--------------------+----------------------+----------+
| 25 | homepage_feature   | homepage_feature_one |        1 |
| 17 | header             | header_option_one    |        1 |
| 18 | footer             | footer_option_one    |        0 |
+----+--------------------+----------------------+----------+

Although this works, I have no understanding of as to why.

Its my understanding the initial GROUP BY should get the first instance in the database, IE the response I expect should be:

+----+--------------------+----------------------+----------+
| id | section_identifier | option_identifier    | store_id |
+----+--------------------+----------------------+----------+
| 18 | footer             | footer_option_one    |        0 |
| 17 | header             | header_option_three  |        1 |
| 19 | homepage_feature   | homepage_feature_one |        0 |
+----+--------------------+----------------------+----------+

However, it seems to be referencing my store_id somehow? I have tried a few different combinations and Im weirdly getting my expected result each time but I have no understanding as to why.

Can anybody explain this to me please?

PS

I have tried updating the option_identifier of id = 7 to see if MySql references the latest saved on disk and it didn't change the result.

Also: I'm not planning on using this feature or asking for an alternative, I'm asking what's going on with it?

Brideo
  • 163
  • 5
  • Something is pretty weird here... Are these the actual tables, data and queries? You also mention that the `section_identifier` is unique, I suppose you mean unique per `store_id`? – xpy Mar 02 '16 at 13:08
  • Yes the indexes of the table are `store_id` and `section_id`. IE these two values must be unique. – Brideo Mar 02 '16 at 13:10

2 Answers2

2
SELECT * FROM my_table GROUP BY section_identifier

is an invalid SQL query.

How GROUP BY works?

Let's get the query above and see how GROUP BY works. First the database engine selects all the rows that match the WHERE clause. There is no WHERE clause in this query; this means all the rows of the table are used to generate the result set.

It then groups the rows using the expressions specified in the GROUP BY clause:

+----+--------------------+----------------------+----------+
| id | section_identifier | option_identifier    | store_id |
+----+--------------------+----------------------+----------+
| 17 | header             | header_option_one    |        1 |
| 23 | header             | header_option_three  |        0 |
+----+--------------------+----------------------+----------+
| 18 | footer             | footer_option_one    |        0 |
+----+--------------------+----------------------+----------+
| 19 | homepage_feature   | homepage_feature_one |        0 |
| 25 | homepage_feature   | homepage_feature_one |        1 |
+----+--------------------+----------------------+----------+

I marked the groups in the listing above to make everything clear.

On the next step, from each group the database engine produces a single row. But how?

The SELECT clause of your query is SELECT *. * stands for the full list of table columns; in this case, SELECT * is a short way to write:

SELECT id, section_identifier, option_identifier, store_id

Let's analyze the values of column id for the first group. What value should the database engine choose for id? 17 or 23? Why 17 and why 23?

It does not have any criteria to favor 17 over 23. It just picks one of them (probably 17 but this depends on a lot of internal factors) and goes one.

There is no problem to determine the value for section_identifier. It is the column used to GROUP BY, all its values in a group are equal.

The choosing dilemma occurs again on columns option_identifier and store_id.


According to the standard SQL your query is not valid and it cannot be executed. However, some database engines run it as described above. The values for expressions that are not (at least one of the below):

  • used in the GROUP BY clause;
  • used with GROUP BY aggregate functions in the SELECT clause;
  • functionally dependent of columns used in the GROUP BY clause;

are indeterminate.

Since version 5.7.5, MySQL implements functional dependency detection and, by default, it rejects an invalid GROUP BY query like yours.

How to make it work

It's not clear for me how you want to get the result set. Anyway, if you want to get some rows from the table then GROUP BY is not the correct way to do it. GROUP BY does not select rows from a table, it generates new values using the values from the table. A row generated by GROUP BY, most of the times, do not match any row from the source table.

You can find a possible solution to your problem in this answer. You'll have to write the query yourself after you read and understand the idea (and is very clear to you how the "winner" rows should be selected).

Community
  • 1
  • 1
axiac
  • 68,258
  • 9
  • 99
  • 134
0

GROUP BY sorts records in ascending order by default. Your store_id is not being referenced at all instead the records returned are sorted in ascending order of the section_identifier

Seda
  • 243
  • 1
  • 9