4

I have a table in SQLite

/* Create a table called NAMES */
CREATE TABLE EVENTS(Id integer , Eventtype integer,value integer,Timestamp DATETIME);

/* Create few records in this table */
INSERT INTO EVENTS VALUES(1,2,1,'2009-01-01 10:00:00');  --ROW1
INSERT INTO EVENTS VALUES(1,2,2,'2007-01-01 10:00:00');  --ROW2
INSERT INTO EVENTS VALUES(2,2,3,'2008-01-01 10:00:00’);  --ROW3

What is needed from the query is ROW1 and ROW3. The query should take most latest row based timestamp for duplicate ID & Eventtype combination. ROW1 and ROW2 are having same eventtype and id but ROW1 is latest so it should be picked.

MPelletier
  • 16,256
  • 15
  • 86
  • 137
Puneet
  • 753
  • 3
  • 9
  • 24

3 Answers3

8

In SQLite 3.7.11 or later, you can use GROUP BY with MAX() to select which row in a group to return:

SELECT *, MAX(timestamp)
FROM events
GROUP BY id, eventtype

In earlier versions, you have to look up some unique ID of the largest row in a group with a subquery (as in you answer).

CL.
  • 173,858
  • 17
  • 217
  • 259
1

I'm a bit late to this question, but I wasn't satisfied with the current answers as they mostly use correlated subqueries which can seriously ruin performance.

In many situations, single-column analytical functions can simulated using a standard join:

SELECT e.*
FROM events e
JOIN
(
    -- Our simulated window with analytical result
    SELECT id, eventtype, MAX(timestamp) AS timestamp
    FROM events
    GROUP BY id, eventtype
) win
USING (id, eventtype, timestamp)

In general, the pattern is:

SELECT main.*
FROM main
JOIN
(
    SELECT
        partition_columns,
        FUNCTION(analyzed_column) AS analyzed_column
    FROM main
    GROUP BY partition_columns
) win
USING (partition_columns, analyzed_column)

These simulated windows aren't perfect:

  1. If your data has ties for your analyzed column result then you may need to remove duplicates from your result set. Otherwise you'll select every row from your partition that matches your analyzed column result.
  2. If you analytical function requires ordering by more than one column, you will need to use correlated subqueries instead. The other answers can be modified to achieve the desired result.
Mr. Llama
  • 20,202
  • 2
  • 62
  • 115
0

I got help from the following link: sqlite equivalent of row_number() over ( partition by ...?

Here is what i came up with:

select * from events E1 where timestamp in
(select timestamp from events E2 where E2.id = E1.id and E2.eventtype=E1.eventtype
                         order by E2.timestamp desc
                         LIMIT 1  );

Also with SQL SERVER, I am thinking of this solution (as I have no way to test)

select id,eventtype,value,ROW_NUMBER() over 
(PARTITION BY id,eventtype,order by timestamp desc) AS RN  from events where RN<=1 ;
Community
  • 1
  • 1
Puneet
  • 753
  • 3
  • 9
  • 24