2

Suppose, if following rows are inserted in chronological order into a table:

row1, row2, row3, row4, ..., row1000, row1001.

After a while, we delete/remove the latest row1001.

As in this post: How to get Top 5 records in SqLite?
If the below command is run:

SELECT * FROM <table> LIMIT 1;
  • Will it assuredly provide the "row1000"?
  • If no, then is there any efficient way to get the latest row(s) without traversing through all the rows? -- i.e. without using combination of ORDER BY and DESC.

[Note: For now I am using "SQLite", but it will be interesting for me to know about SQL in general as well.]

Community
  • 1
  • 1
iammilind
  • 68,093
  • 33
  • 169
  • 336
  • Are you asking about SQL in general, or SQLite specifically? – jarlh Jan 11 '17 at 11:03
  • 4
    Rows in a relational database are **NOT** "sorted" - there is no such thing as the "chronological order" of rows in a relational database. The only way to define such a thing is to use `ORDER BY` –  Jan 11 '17 at 11:04
  • If you want general SQL, it's up to *you* to ensure that you can write a query that *can* have an appropriate `ORDER BY` when retrieving the data - so if you're not storing enough data in one or more of your columns to be able to write that `ORDER BY`, you're going to be out of look looking for anything guaranteed. – Damien_The_Unbeliever Jan 11 '17 at 11:05

3 Answers3

4

No. Table records have no inherent order. So it is undefined which row(s) to get with a LIMIT clause without an ORDER BY.

SQLite in its current implemantation may return the latest inserted row, but even if that were the case you must not rely on it.

Give a table a datetime column or some sortkey, if record order is important for you.

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • Even if I provide a datetime order, even in that case the query will have to go through all the records to check .. isn't it? Actually such column is already there, but I just want to avoid using it. BTW, does SQLite guarantees the chornological order as you stated in your answer? – iammilind Jan 11 '17 at 11:10
  • I don't know what SQLite does internally when you omit `ORDER BY`. It *may* get you the last inserted record. I don't know. Anyway, even if you would notice that SQLite always returns the last inserted row, you should never rely on this. You need `ORDER BY` in order to work with `LIMIT`. That doesn't mean by the way that the whole table must be read. With an index on the column in question, the DBMS would just have to read few index entries to get to the desired row (provided the DBMS optimizer does a good job here). – Thorsten Kettner Jan 11 '17 at 11:14
  • No, SQLite doesn't guarantee anything without an `ORDER BY`. And with a `LIMIT 1` it is more likely (and did in a ten-second test) to return the _first_ row it finds, not the last (i.e. it will stop processing as soon as it finds the required number of row(s)). – TripeHound Jan 11 '17 at 11:21
  • @TripeHound, Thorsten it seems that the assumption made in this answer might be true for SQLite. I just now found [this link](https://www.tutorialspoint.com/sqlite/sqlite_limit_clause.htm). In that example, it seems that they are giving the definite output instead of possible output with the "LIMIT" clause. Does that mean that SQLite guarantees to pick up the latest chronological ordered row? – iammilind Jan 11 '17 at 12:15
  • No. Again: No DBMS guarantees you a certain record when you apply `LIMIT` without `ORDER BY`. Never use `LIMIT` without `ORDER BY`. The example given in the link lacks `ORDER BY` and is hence wrong. – Thorsten Kettner Jan 11 '17 at 12:23
  • 2
    @iammilind NO IT IS NOT TRUE! First, that example shows the FIRST six (of seven) records, not the last six, so (as I noted) a `LIMIT 1` will OFTEN return the first record, not the last. Second "first" and "last" have NO MEANING without `ORDER BY` and that's a poor tutorial for implying they do. Finally, while SQLite MIGHT repeatedly return the "first" entry, it makes NO guarantee of this. – TripeHound Jan 11 '17 at 14:09
4

You're misunderstanding how SQL works. You're thinking row-by-row which is wrong. SQL does not "traverse rows" as per your concern; it operates on data as "sets".

Others have pointed out that relational database cannot be assumed to have any particular ordering, so you must use ORDER BY to explicitly specify ordering.

However (not mentioned yet is that), in order to ensure it performs efficiently, you need to create an appropriate index.

Whether you have an index or not, the correct query is:

SELECT  <cols>
FROM    <table>
ORDER BY <sort-cols> [DESC] LIMIT <no-rows>

Note that if you don't have an index the database will load all data and probably sort in memory to find the TOP n.

If you do have the appropriate index, the database will use the best index available to retrieve the TOP n rows as efficiently as possible.


Note that the sqllite documentation is very clear on the matter. The section on ORDER BY explains that ordering is undefined. And nothing in the section on LIMIT contradicts this (it simply constrains the number of rows returned).

If a SELECT statement that returns more than one row does not have an ORDER BY clause, the order in which the rows are returned is undefined.

This behaviour is also consistent with the ANSI standard and all major SQL implementations. Note that any database vendor that guaranteed any kind of ordering would have to sacrifice performance to the detriment of queries trying to retrieve data but not caring about order. (Not good for business.)


As a side note, flawed assumptions about ordering is an easy mistake to make (similar to flawed assumptions about uninitialised local variables).

RDBMS implementations are very likely to make ordering appear consistent. They follow a certain algorithm for adding data, a certain algorithm for retrieving data. And as a result, their operations are highly repeatable (it's what we love (and hate) about computers). So things repeatably look the same.

Theoretical examples:

  • Inserting a row results in the row being added to the next available free space. So data appears sequential. But an update would have to move the row to a new location if it no longer fits.
  • The DB engine might retrieve data sequentially from clustered index pages and seem to use clustered index as the 'natural ordering' ... until one day a page-split puts one of the pages in a different location. * Or a new version of the DMBS might cache certain data for performance, and suddenly order changes.

Real-world example:

  • The MS SQL Server 6.5 implementation of GROUP BY had the side-effect of also sorting by the group-by columns. When MS (in version 7 or 2000) implemented some performance improvements, GROUP BY would by default, return data in a hashed order. Many people blamed MS for breaking their queries when in fact they had made false assumptions and failed to ORDER BY their results as needed.

This is why the only guarantee of a specific ordering is to use the ORDER BY clause.

Disillusioned
  • 14,635
  • 3
  • 43
  • 77
  • It seems that in the case of SQLite, we may Not require the "ORDER BY". Would you like to update your answer in the conjunction of [this comment](http://stackoverflow.com/questions/41589096/in-sql-does-the-limit-returns-the-row-which-is-inserted-the-last-in-chronologic/41589275#comment70384665_41589191). – iammilind Jan 11 '17 at 12:16
  • 3
    @iammilind That tutorial you reference makes no explicit claim about ordering without `ORDER BY`. It's unfortunate that it has created a false impression by the way in which it presents its examples. Note that even if it ***had*** claimed some form of natural ordering, it wouldn't be the first tutorial to make a mistake. For a cross-check, please consider the [sqllite documentation](https://www.sqlite.org/lang_select.html): "If a SELECT statement that returns more than one row does not have an ORDER BY clause, the **order in which the rows are returned is undefined**." (emphasis mine) – Disillusioned Jan 11 '17 at 12:43
  • 2
    As a side note, one thing often happens with RDBMS implementations is that certain things make ordering ***appear*** consistent. They follow a certain algorithm for adding data, a certain algorithm for retrieving data. As a result, their operations repeatably ***look*** the same. E.g. It might retrieve sequentially from clustered index pages and seem to use clustered index as the 'natural ordering' ... until one day a page-split puts one of the pages in a different location. Or a new version of the DMBS might cache certain data for performance, and suddenly order changes. – Disillusioned Jan 11 '17 at 12:50
  • 2
    This is why the ***only guarantee*** of a specific ordering is to use the `ORDER BY` clause. (_'Real world' example: The MS SQL Server 6 implementation of `GROUP BY` had the side-effect of also sorting by the group-by-cols. When MS (in version 7 iirc) implemented some performance improvements, `GROUP BY` would by default, return data in a hashed order. Many people blamed MS for breaking their queries when in fact they had made **false assumptions** and failed to `ORDER BY` their results as needed._) – Disillusioned Jan 11 '17 at 12:58
2

In SQL, data is stored in tables unordered. What comes out first one day might not be the same the next.

ORDER BY, or some other specific selection criteria is required to guarantee the correct value.

JohnHC
  • 10,935
  • 1
  • 24
  • 40