1

UPDATE Here's a sqlfiddle http://sqlfiddle.com/#!2/e0822/1/0

I have a MySQL database of apps (itunes_id), each app id has a comments field. To preserve a history, every time a comment is changed, a new row of data is added. In the query below, I just want a list of the latest entry (highest id) of every app (itunes_id).

Here are the headers of my db:

id  (key and auto increment)
itunes_id
comments 
date

This query is getting the latest entry for a given itunes_id. How can I make this query more efficient?

SELECT * FROM (
    SELECT * FROM (
        SELECT * FROM Apps
        ORDER BY id DESC
    ) AS apps1
    GROUP BY itunes_id
) AS apps2
LIMIT 0 , 25
Dev01
  • 13,292
  • 19
  • 70
  • 124
  • 1
    can you create a sql fiddle for us? – Woot4Moo Feb 25 '13 at 16:48
  • Stop selecting `*`. That's like going to the grocery store and filling your shopping cart with every product in food group you want. – Kermit Feb 25 '13 at 16:48
  • @AarolamaBluenk awful advice, sometimes `*` is necessary – Woot4Moo Feb 25 '13 at 16:49
  • Actually WootWoot, the advise based on your example might be good advise to not use so many `Select *` you don't need that many Selects personally – MethodMan Feb 25 '13 at 16:51
  • @Woot4Moo Give me a sec, I'll build a sql fiddle. Thanks. – Dev01 Feb 25 '13 at 16:51
  • 2
    @Woot4Moo The question is about efficiency. Selecing all is [not efficient](http://stackoverflow.com/questions/65512/which-is-faster-best-select-or-select-column1-colum2-column3-etc). – Kermit Feb 25 '13 at 16:51
  • @DJKRAZE That's what I was thinking. Way too many SELECTS but how do I reduce them. – Dev01 Feb 25 '13 at 16:52
  • @AarolamaBluenk myopic view in my experienced opinion. – Woot4Moo Feb 25 '13 at 16:52
  • In my experience, `select *` isn't lazy, it is sloppy. @Woot4Moo, would you please post an example where `select *` is necessary? I'd love to be proven wrong. – BryanH Feb 25 '13 at 16:55
  • @Woot4Moo Not sure how to respond to that... ***Any*** experienced DBA will tell you it's not efficient. – Kermit Feb 25 '13 at 16:56
  • @BryanH sure. I create a materialized view that requires all of the columns in my table, but also does a very lengthy computation. I then proceed to create a regular view on top of this materialized view to facilitate user queries so they don't know the structure of my table. Let me see if I can make a fiddle to illustrate what I mean (assume a placeholder function that takes an hour to execute). – Woot4Moo Feb 25 '13 at 16:57
  • @Woot4Moo Once you do that, I will create an equal view that uses a column list. Then, we can both add 50 columns to the parent tables and compare how efficient they are. – Kermit Feb 25 '13 at 16:59
  • 1
    @Woot4Moo . . . The order by in the innermost query is not correct coding practice. The results are indeterminate. You can read the MySQL documentation for more information about this . . .http://dev.mysql.com/doc/refman/5.0/en/group-by-extensions.html. – Gordon Linoff Feb 25 '13 at 16:59
  • @AarolamaBluenk and yet my explain plan is THE EXACT SAME in every query I have ever written in Oracle – Woot4Moo Feb 25 '13 at 16:59
  • @Woot4Moo You're moving more data... so how is moving more (unnecessary) data more efficient? – Kermit Feb 25 '13 at 17:04

4 Answers4

1

This query uses a subquery which separately gets the maximum ID for every itunes_ID. The result of the subquery is then join back on the original table provided that it matches on two columns: itunes_ID and ID.

SELECT  a.*
FROM    Apps a
        INNER JOIN
        (
            SELECT  itunes_id, MAX(ID) max_id
            FROM    Apps
            GROUP   BY itunes_id
        ) b ON a.itunes_id = b.itunes_id AND
                a.ID = b.max_ID
LIMIT   0, 25

For faster performance, create a compound column INDEX on columns itunes_ID and ID. EG,

ALTER TABLE Apps ADD INDEX (itunes_ID, ID)
John Woo
  • 258,903
  • 69
  • 498
  • 492
  • Nice. I want to make this more efficient because I'm getting a lot of traffic. I'm willing to make changes to the db as needed to make this as efficient as possible (even willing to change db types). I added the index column, how do I use it? http://sqlfiddle.com/#!2/0ff75/1/0 – Dev01 Feb 25 '13 at 17:36
  • You have nothing to do with it..it's the server that will use it :D see the executiom plan, youll see tht it uses the index – John Woo Feb 25 '13 at 17:42
  • If you are able to make changes to the DB, I would recommend changing your DB structure to keep the history in a separate table. (See my answer below.) – Matthias Wuttke Feb 25 '13 at 18:01
1

For a similar approach, I use a "recent" boolean field to mark records containing the latest version. This requires an UPDATE query on every insert (deactivate the previous recent record), but allows for a quick select query. Alternatively, you could maintain two tables, one with the recent records, the other one with the history for each app.

EDIT: Maybe you can try a table similar to this:

  • id int not null auto_increment primary key
  • version int not null
  • main_id int null
  • recent boolean not null
  • app varchar(32) not null
  • comment varchar(200) null

You can use the column "main_id" to point to the record with version 1.

Matthias Wuttke
  • 1,982
  • 2
  • 21
  • 38
0

I just want the latest entry (highest id) for a given app (itunes_id)

This will do it

SELECT MAX(id), comments FROM Apps WHERE itunes_id = "iid";

or

SELECT id, comments FROM Apps WHERE itunes_id = "iid" ORDER BY id DESC LIMIT 1;

Where iid is the itunes id for which you want the latest comment.

Make sure id and itunes_id are indexed in a composite index for maximum efficiency.

BryanH
  • 5,826
  • 3
  • 34
  • 47
  • Sorry, I meant to write "I just want a list of the latest entry (highest id) of every app (itunes_id)." I updated the text int he question above. – Dev01 Feb 25 '13 at 17:04
0
SELECT * FROM (
    SELECT * FROM (
        SELECT * FROM Apps
        ORDER BY id DESC
    ) AS apps1
    GROUP BY itunes_id
) AS apps2
LIMIT 0 , 25

will not select the oldest record (you cannot assume the generated key will always be the "oldest"). What you want is something like this:

SELECT * FROM (
    SELECT * FROM (
        SELECT * FROM Apps
        where some_date = (select max(some_date) from Apps limit 1)
        ORDER BY id DESC
    ) AS apps1
    GROUP BY itunes_id
) AS apps2
LIMIT 0 , 25
Woot4Moo
  • 23,987
  • 16
  • 94
  • 151
  • Good point. But you also added another select statement. Doesn't that also make the query even slower? – Dev01 Feb 25 '13 at 17:05