2

I have a table (which has columns form other tables) with timestamp, value and two idenitification columns with names.

My goal is to get the last entry for every group.

When I do a query with this:

SELECT MAX(time) AS time, 
       id,
       name,
       value
FROM table
WHERE time >= (now() - '00:05:00)
GROUP BY id, name, value

It still shows me double values for name from previous timestamps. When I leave value out all is fine and it shows me each id and name. How can I get to show me the value without the double entry?

My goal would be something like this (short version):

time,value,"id","name"
2022-08-31 11:58:00+02,0,"A","D11"
2022-08-31 11:58:00+02,0,"A","D12"
2022-08-31 11:58:00+02,0,"A","D13"

The table looks like this:

time,value,"id","name"
2022-08-31 11:58:00+02,0,"A","D11"
2022-08-31 11:57:00+02,1,"A","D11"
2022-08-31 11:56:00+02,2,"A","D11"
2022-08-31 11:59:00+02,3,"A","D11"
2022-08-31 11:58:00+02,1,"A","D12"
2022-08-31 11:56:00+02,3,"A","D12"
2022-08-31 11:59:00+02,4,"A","D12"
2022-08-31 11:59:00+02,0,"A","D20"
2022-08-31 11:56:00+02,1,"A","D21"
2022-08-31 11:58:00+02,7,"A","D21"
2022-08-31 11:59:00+02,8,"A","D21"
2022-08-31 11:57:00+02,10,"A","D21"
2022-08-31 11:56:00+02,1,"A","D22"
2022-08-31 11:59:00+02,0,"A","D34"
2022-08-31 11:57:00+02,2,"A","D41"
2022-08-31 11:59:00+02,3,"A","D41"
2022-08-31 11:56:00+02,4,"A","D41"
2022-08-31 11:58:00+02,5,"A","D41"
2022-08-31 11:59:00+02,0,"A","D42"
2022-08-31 11:56:00+02,1,"A","D42"
2022-08-31 11:57:00+02,5,"A","D42"
2022-08-31 11:59:00+02,0,"A","D43"
2022-08-31 11:58:00+02,3,"A","D43"
2022-08-31 11:56:00+02,4,"A","D43"
2022-08-31 11:57:00+02,7,"A","D43"
2022-08-31 11:59:00+02,0,"A","D53"
2022-08-31 11:56:00+02,1,"A","D53"
2022-08-31 11:59:00+02,0,"A","D57"
2022-08-31 11:57:00+02,2,"B","D11"
2022-08-31 11:59:00+02,3,"B","D11"
2022-08-31 11:59:00+02,1,"B","D12"
2022-08-31 11:58:00+02,3,"B","D12"
2022-08-31 11:58:00+02,1,"B","D13"
2022-08-31 11:59:00+02,2,"B","D13"
2022-08-31 11:57:00+02,4,"B","D13"
2022-08-31 11:58:00+02,1,"B","D31"
2022-08-31 11:57:00+02,7,"B","D31"
2022-08-31 11:59:00+02,11,"B","D31"
2022-08-31 11:57:00+02,1,"B","D32"
2022-08-31 11:58:00+02,3,"B","D32"
2022-08-31 11:59:00+02,4,"B","D41"
2022-08-31 11:58:00+02,9,"B","D41"
2022-08-31 11:57:00+02,10,"B","D41"
2022-08-31 11:56:00+02,2,"B","D42"
2022-08-31 11:59:00+02,4,"B","D42"
2022-08-31 11:58:00+02,7,"B","D42"
2022-08-31 11:57:00+02,9,"B","D42"
2022-08-31 11:58:00+02,1,"B","D43"
2022-08-31 11:57:00+02,2,"B","D43"
2022-08-31 11:59:00+02,3,"B","D43"
2022-08-31 11:56:00+02,0,"C","D11"
2022-08-31 11:59:00+02,2,"C","D11"
2022-08-31 11:57:00+02,0,"C","D12"
2022-08-31 11:56:00+02,1,"C","D12"
2022-08-31 11:59:00+02,2,"C","D12"
2022-08-31 11:58:00+02,3,"C","D12"
2022-08-31 11:59:00+02,0,"C","D13"
2022-08-31 11:58:00+02,1,"C","D13"
2022-08-31 11:56:00+02,1,"C","D21"
2022-08-31 11:59:00+02,2,"C","D21"
2022-08-31 11:58:00+02,4,"C","D21"
2022-08-31 11:59:00+02,3,"C","D22"
2022-08-31 11:58:00+02,5,"C","D22"
2022-08-31 11:56:00+02,8,"C","D22"
2022-08-31 11:59:00+02,1,"C","D23"
Gobrel
  • 179
  • 9
  • 1
    not sure. But I think you can also try distinct. see demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e83925d7a89a4b88ee98eeeb3bd27c2b – jian Aug 31 '22 at 13:12
  • @jian This loosk good. Any idea how I can only include the latest time stamp in the result? – Gobrel Sep 01 '22 at 08:16
  • 1
    You can add where clause. demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=3cdec3ddfd00835d4f63571482945aa4 @Gobrel – jian Sep 01 '22 at 08:48
  • @jian What my goal with latest time stamp is to see if there is some data missing. The DB get populated every minute with data hence minute interval. And the distinct on method you provided is the right direction. I guess I will do the rest in pandas and select max(time) there since I will analyze the data in pandas. – Gobrel Sep 01 '22 at 08:51
  • 1
    If I understand you correctly: compare all rows, for rows that id and name value is the same then capture only row order by timestamp (desc or asc). That is exactly what distinct do. you can also check this https://stackoverflow.com/questions/3800551/select-first-row-in-each-group-by-group/7630564#7630564 great answer. – jian Sep 01 '22 at 09:00

1 Answers1

2

You can use the rank window function to well, rank, rows per some grouping, and then take the first one:

SELECT time, id, name, value
FROM   (SELECT time, id, name, value, 
               RANK() OVER (PARTITION BY id, name, value ORDER BY time DESC) AS rk
        FROM   mytable
        WHERE  time >= (now() - '00:05:00') t
WHERE  rk = 1
Mureinik
  • 297,002
  • 52
  • 306
  • 350
  • I still get previous values with your solution. There is the last value and also values from previous timestamps and therefore I still have 3 'D11' values. Maybe I need to adjust the 5 minute from now, but I want to make sure I get values which are maybe 3 minutes old and got not updated immediately. – Gobrel Aug 31 '22 at 11:54
  • I did a small adjustment to your code. I included MAX(time) after PARTITION BY and also GROUP BY after WHERE in the subquery. Now it works like intended. – Gobrel Aug 31 '22 at 12:07