0

I have a table in my database that I am using a SQL query to retrieve data from. In my query, I am replacing some text and using integers. The query returns the data below:

user_id | event_code | total_bookmarks | total_folders | folder_depth | ts
0         8            34                6               1              128926
0         8            35                6               1              129001
4         8            18                2               1              123870
6         8            30                2               1              130099
6         8            30                2               1              132000
6         8            30                2               1              147778

The query I am using is:

SELECT
user_id,
event_code,
CAST(REPLACE(data1, 'total bookmarks', '') AS INTEGER) as total_bookmarks,
CAST(REPLACE(data2, 'folders', '') AS INTEGER) as total_folders,
CAST(REPLACE(data3, 'folder depth ', '') AS INTEGER) as folder_depth,
timestamp AS ts
FROM events
WHERE event_code = 8

What do I need to add to my query in order to only select the rows for each unique user_id with the max ts (timestamp) for each id? I tried MAX(timestamp), but I get two rows returned for the same ID if the total_bookmark is different (example: user_id 0 having 34 in one row, and 35 in another) I want the table to look like this:

user_id | event_code | total_bookmarks | total_folders | folder_depth | ts
0         8            34                6               1              129001
4         8            18                2               1              123870
6         8            30                2               1              147778
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
cjh193
  • 309
  • 2
  • 3
  • 9
  • Why is this tagged with `mysql` and `postgresql`? – shmosel Dec 13 '16 at 01:51
  • Because I'm using postgresql, but I assume this can be done the same or in a similar way with mysql – cjh193 Dec 13 '16 at 02:03
  • 1
    Please don't do that. You're wasting your own time and everyone else's. – shmosel Dec 13 '16 at 02:04
  • Voted to close question as duplicate, because it's yet another instance of the very common [tag:greatest-n-per-group] problem. Linked to a question that has a well-written answer for PostgreSQL. – Bill Karwin Dec 13 '16 at 02:11

1 Answers1

0
Declare @table table (user_id int, event_code int, total_bookmarks int, total_folders int, folder_depth int, ts decimal(18,0))
Insert into @table (user_id , event_code , total_bookmarks , total_folders , folder_depth , ts)
Values (0,8,34,6,1,128926),
        (0,8,34,6,1,129001),
        (4,  8,  18 ,  2,   1,   123870),
        (6,  8,  30,   2,   1,   130099),
        (6,  8,  30,   2,   1,   132000),
        (6,  8,  30,   2,   1,   147778)

Select * from @table

Select      user_id,event_code,total_bookmarks,total_folders,folder_depth,ts
From        (
            Select      RANK() over (Partition by user_id
                                    Order by ts desc
                                    ) as Rank,
                        user_id,event_code,total_bookmarks,total_folders,folder_depth,ts

            From        @table
            ) D1
Where       D1.Rank = 1
Thao Phan
  • 11
  • 2