1

I have a table defined like the following...

CREATE table actions (
  id INTEGER PRIMARY KEY AUTO_INCREMENT,
  end BOOLEAN,
  type VARCHAR(15) NOT NULL,
  subtype_a VARCHAR(15),
  subtype_b VARCHAR(15),
);

I'm trying to query for the last end action of some type to happen on each unique (subtype_a, subtype_b) pair, similar to a group by (except SQLite doesn't say what row is guaranteed to be returned by a group by).

On an SQLite database of about 1MB, the query I have now can take upwards of two seconds, but I need to speed it up to take under a second (since this will be called frequently).

example query:

SELECT * FROM actions a_out 
WHERE id = 
  (SELECT MAX(a_in.id) FROM actions a_in 
   WHERE a_out.subtype_a = a_in.subtype_a 
     AND a_out.subtype_b = a_in.subtype_b 
     AND a_in.status IS NOT NULL 
     AND a_in.type = "some_type");

If it helps, I know all the unique possibilities for a (subtype_a,subtype_b)

eg:

(a,1)
(a,2)
(b,3)
(b,4)
(b,5)
(b,6)
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Blaskovicz
  • 6,122
  • 7
  • 41
  • 50
  • I added the `greatest-n-per-group` tag, and since there's a limit of five tags I had to replace one of the tags you listed. I replaced `sqlite`, because your table definition is MySQL specific. – Bill Karwin Dec 12 '12 at 21:20

3 Answers3

1

Beginning with version 3.7.11, SQLite guarantees which record is returned in a group:

Queries of the form: "SELECT max(x), y FROM table" returns the value of y on the same row that contains the maximum x value.

So can be implemented in a much simpler way:

SELECT *, max(id)
FROM actions
WHERE type = 'some_type'
GROUP BY subtype_a, subtype_b
CL.
  • 173,858
  • 17
  • 217
  • 259
0

Is this any faster?

select * from actions where id in (select  max(id) from actions where type="some_type" group by subtype_a, subtype_b);
palako
  • 3,342
  • 2
  • 23
  • 33
  • No, see what I posted in my question about SQLite and `GROUP BY`. – Blaskovicz Dec 09 '12 at 23:28
  • I know, but this query should work in sqlite, since you are expliciting which one you want from the group. Anyway, I did a quick profile and is not fast, Bill Karwin's answer is great. – palako Dec 10 '12 at 01:26
0

This is the greatest-in-per-group problem that comes up frequently on StackOverflow.

Here's how I solve it:

SELECT a_out.* FROM actions a_out
LEFT OUTER JOIN actions a_in ON a_out.subtype_a = a_in.subtype_a 
    AND a_out.subtype_b = a_in.subtype_b
    AND a_out.id < a_in.id
WHERE a_out.type = "some type" AND a_in.id IS NULL

If you have an index on (type, subtype_a, subtype_b, id) this should run very fast.


See also my answers to similar SQL questions:

Or this brilliant article by Jan Kneschke: Groupwise Max.

Community
  • 1
  • 1
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • This won't return the latest added (highest autoincrement id) for each group of subtype_a,subtype_b – palako Dec 09 '12 at 20:03
  • @palako, my mistake, I got the inequality backwards. I have edited to correct it. It should try to match a greater id value in a_in, and if none is found, then a_out must have the largest id for the respective group. – Bill Karwin Dec 09 '12 at 20:11
  • No problem. This is actually very fast. – palako Dec 09 '12 at 20:27
  • @BillKarwin, I'll give this a try soon, thanks. Can you please explain the `a_in.id IS NULL` part? – Blaskovicz Dec 09 '12 at 23:28
  • For some reason this query returns 0 rows for me. – Blaskovicz Dec 10 '12 at 15:01
  • Well, I just tested it on MySQL 5.5, and my solution works. I guess you don't have any rows where type = "some type". – Bill Karwin Dec 11 '12 at 05:40
  • Bill, I have updated my query; I forgot the "IS NOT NULL" from my original query. Can you please check if your query still applies? Sorry for any confusion. – Blaskovicz Dec 12 '12 at 20:54
  • @Blaskovicz, please click the `greatest-n-per-group` tag and read answers given to other questions. This type of SQL problem has come up dozens of times on Stack Overflow. – Bill Karwin Dec 12 '12 at 21:21