0

So i understand and found posts that indicates that it is not recommended to omit the order by clause in a SQL query when you are retrieving data from the DBMS.

Resources & Post consulted (will be updated):

SQL Server UNION - What is the default ORDER BY Behaviour

When no 'Order by' is specified, what order does a query choose for your record set?

https://dba.stackexchange.com/questions/6051/what-is-the-default-order-of-records-for-a-select-statement-in-mysql

Questions :

See logic of the question below if you want to know more.

My question is : under mysql with innoDB engine, does anyone know how the DBMS effectively gives us the results ?

I read that it is implementation dependent, ok, but is there a way to know it for my current implementation ?

Where is this defined exactly ?

Is it from MySQL, InnoDB , OS-Dependent ?

Isn't there some kind of list out there ?

Most importantly, if i omit the order by clause and get my result, i can't be sure that this code will still work with newer database versions and that the DBMS will never give me the same result, can i ?

Use case & Logic :

I'm currently writing a CRUD API, and i have table in my DB that doesn't contain an "id" field (there is a PK though), and so when i'm showing the results of that table without any research criteria, i don't really have a clue on what i should use to order the results. I mean, i could use the PK or any field that is never null, but it wouldn't make it relevant. So i was wondering, as my CRUD is supposed to work for any table and i don't want to solve this problem by adding an exception for this specific table, i could also simply omit the order by clause.

Final Note :

As i'm reading other posts, examples and code samples, i'm feeling like i want to go too far. I understand that it is common knowledge that it's just a bad practice to omit the Order By clause in a request and that there is no reliable default order clause, not to say that there is no order at all unless you specify it.

I'd just love to know where this is defined, and would love to learn how this works internally or at least where it's defined (DBMS / Storage Engine / OS-Dependant / Other / Multiple criteria). I think it would also benefit other people to know it, and to understand the inners mechanisms in place here.

Thanks for taking the time to read anyway ! Have a nice day.

NoxWorld
  • 13
  • 1
  • 6
  • 2
    there is not a default order by .. .. if you need an order by you must declare explicitally .. if you don't need .. you can omit .. and if you omit order by you can surely use in all the sql version you prefer .. is a basic concept for SQL – ScaisEdge Apr 13 '18 at 11:19
  • The ANSI SQL standard doesn't say anything about default order without an `ORDER BY` clause, so even if MySQL were to say something, you still should not rely on it. Always use `ORDER BY` if you need a certain ordering in your result set. – Tim Biegeleisen Apr 13 '18 at 11:20
  • There is not a default order in any MySql table. The order you get the info is the order the systems finds the info, and it depends on how the system can organize the files, how many insert you make, how many deletes you make and so on. – nacho Apr 13 '18 at 11:50
  • I know there is no default order. And i get that the results are just given by the DBMS without any kind of sort. I would just want to know if someone knows some kind of logic the system uses on a regular basis. EG : I always do SELECT * FROM table1 LIMIT 50, and just change an Offset, considering that the table didn't change (no insert, update or delete used), is it ok to omit order clause, and is there a way to know the logic of the system ? (For example FIFO, LIFO , or anything, or is it impossible to know ?) – NoxWorld Apr 13 '18 at 12:27
  • No, in that case you mentioned it is NOT ok to omit order by, because there is no guaranty of the order the system it's going to find the info. That is what we try to explain. It depends on what blocks of disk the system has in memory, what you have done before, other apps running in the system. There is not something like FIFO or LIFO so as you say, it is impossible to know, even do a lot of times you will get the same order, but not always – nacho Apr 13 '18 at 12:47
  • @Nacho, okay, thanks for the confirmation. That was my guess but i wanted to be 100% sure. You're actually the first one mentionning current system memory and disk blocks, so thanks for this clarification. – NoxWorld Apr 13 '18 at 12:52

2 Answers2

5

Without a clear ORDER BY, current versions of InnoDB return rows in the order of the index it reads from. Which index varies, but it always reads from some index. Even reading from the "table" is really an index—it's the primary key index.

As in the comments above, there's no guarantee this will remain the same in the next version of InnoDB. You should treat it as a coincidental behavior, it is not documented and the makers of MySQL don't promise not to change it.

Even if their implementation doesn't change, reading in index order can cause some strange effects that you might not expect, and which won't give you query result sets that makes sense to you.

For example, the default index is the clustered index, PRIMARY. It means index order is the same as the order of values in the primary key (not the order in which you insert them).

mysql> create table mytable ( id int primary key, name varchar(20));

mysql> insert into mytable values (3, 'Hermione'), (2, 'Ron'), (1, 'Harry');

mysql> select * from mytable;
+----+----------+
| id | name     |
+----+----------+
|  1 | Harry    |
|  2 | Ron      |
|  3 | Hermione |
+----+----------+

But if your query uses another index to read the table, like if you only access column(s) of a secondary index, you'll get rows in that order:

mysql> alter table mytable add key (name);

mysql> select name from mytable;
+----------+
| name     |
+----------+
| Harry    |
| Hermione |
| Ron      |
+----------+

This shows it's reading the table by using an index-scan of that secondary index on name:

mysql> explain select name from mytable;
+----+-------------+---------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table   | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+---------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | mytable | index | NULL          | name | 83      | NULL |    3 | Using index |
+----+-------------+---------+-------+---------------+------+---------+------+------+-------------+

In a more complex query, it can become very tricky to predict which index InnoDB will use for a given query. The choice can even change from day to day, as your data changes.

All this goes to show: You should just use ORDER BY if you care about the order of your query result set!

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Thanks for this complete answer ! This is exactly the kind of information i was looking for. Say i have 100 rows in my table, and i want to select them 20 at a time, i don't really care the order criteria, but this is going to be 5 différent requests with a limit clause and i obviously want to avoid showing the same records. What should i use to do that without knowing the columns ? Is there a way to order by primary key or something like that ? I mean, without having to query the information schema every time ? – NoxWorld Apr 17 '18 at 08:52
  • Don't use `LIMIT` without `ORDER BY`. If you order by the order it was already returning rows (in index-access order), then the optimizer knows it can do the ordering as a no-op. But if the index access changes, it'll have to sort the result set. But I don't think using different LIMIT offsets will change the index choice. – Bill Karwin Apr 17 '18 at 14:27
2

Bill's answer is good. But not complete.

  • If the query is a UNION, it will (I think) deliver first the results of the first SELECT (according to the rules), then the results of the second. Also, if the table is PARTITIONed, it is likely to do a similar thing.

  • GROUP BY may sort by the grouping expressions, thereby leading to a predictable order, or it may use a hashing technique, which scrambles the rows. I don't know how to predict which.

  • A derived table used to be an ordered list that propagates into the parent query's ordering. But recently, the ORDER BY is being thrown away in that subquery! (Unless there is a LIMIT.)

Bottom Line: If you care about the order, add an ORDER BY, even if it seems unnecessary based on this Q & A.

MyISAM, in contrast, starts with this premise: The default order is the order in the .MYD file. But DELETEs leave gaps, UPDATEs mess with the gaps, and INSERTs prefer to fill in gaps over appending to the file. So, the row order is rather unpredictable. ALTER TABLE x ORDER BY y temporarily sets the .MYD order; this 'feature' does not work for InnoDB.

Rick James
  • 135,179
  • 13
  • 127
  • 222