0

I'm inserting a new row per order update so I can keep track of the state updates. Like this.


+----------+-----------+--------+-----------------------+-----------------------+----------------------------------+
| event_id |   state   | amount |        address        |         notes         |            timestamp             |
+----------+-----------+--------+-----------------------+-----------------------+----------------------------------+
| order123 | fulfilled | NULL   | NULL                  | NULL                  | 2022-07-01T17:08:12.032316+00:00 |
| order123 | NULL      | NULL   | NULL                  | Delivered to customer | 2022-07-01T17:07:12.032316+00:00 |
| order123 | NULL      | NULL   | 300 Post St, CA 94108 | NULL                  | 2022-07-01T17:06:12.032316+00:00 |
| order123 | accepted  | NULL   | NULL                  | NULL                  | 2022-07-01T17:05:12.032316+00:00 |
| order123 | pending   | 100    | NULL                  | NULL                  | 2022-07-01T17:04:12.032316+00:00 |
+----------+-----------+--------+-----------------------+-----------------------+----------------------------------+


How would I write a query that pulls out the latest value from each of those columns, ignoring the NULLs. This is what I'm looking for.

+----------+-----------+--------+-----------------------+-----------------------+----------------------------------+
| event_id |   state   | amount |        address        |         notes         |            timestamp             |
+----------+-----------+--------+-----------------------+-----------------------+----------------------------------+
| order123 | fulfilled | 100    | 300 Post St, CA 94108 | Delivered to customer | 2022-07-01T17:08:12.032316+00:00 |
+----------+-----------+--------+-----------------------+-----------------------+----------------------------------+

This is as far as I've gotten, which gives me the latest record:

SELECT DISTINCT ON (event_id)
    event_id, state, amount, address, notes, timestamp
    FROM event_order
ORDER BY event_id, timestamp DESC;

DB Fiddle link: https://www.db-fiddle.com/f/pZFUG96oYh6ymAz99vvqWh/0

I came across this post which mentions LAST_VALUE but the solution seems like it only works with integers.

Any help will be appreciated.

realph
  • 4,481
  • 13
  • 49
  • 104

2 Answers2

1

demo


SELECT DISTINCT
    event_id,
    first_value(state) OVER (PARTITION BY event_id ORDER BY CASE WHEN state IS NOT NULL THEN
            0
        ELSE
            1
        END ASC, timestamp DESC),
    first_value(amount) OVER (PARTITION BY event_id ORDER BY CASE WHEN amount IS NOT NULL THEN
            0
        ELSE
            1
        END ASC, timestamp DESC),
    first_value(address) OVER (PARTITION BY event_id ORDER BY CASE WHEN address IS NOT NULL THEN
            0
        ELSE
            1
        END ASC, timestamp DESC),
    first_value(notes) OVER (PARTITION BY event_id ORDER BY CASE WHEN notes IS NOT NULL THEN
            0
        ELSE
            1
        END ASC, timestamp DESC),
    first_value(timestamp) OVER (PARTITION BY event_id ORDER BY CASE WHEN timestamp IS NOT NULL THEN
            0
        ELSE
            1
        END ASC, timestamp DESC)
FROM
    event_order;

credit: PostgreSQL: order by column, with specific NON-NULL value LAST How to ignore nulls in PostgreSQL window functions? or return the next non-null value in a column

jian
  • 4,119
  • 1
  • 17
  • 32
0

Try aggregating with a GROUP BY clause. This will work on the state because the minimum between "accepted" and "pending" is the former. At the same time, the MAX aggregation function will skip NULL values completely.

SELECT event_id, 
       MIN(state), 
       MAX(amount),
       MAX(timestamp)
FROM event_order
GROUP BY event_id

Check the demo here.

lemon
  • 14,875
  • 6
  • 18
  • 38
  • That WHERE clause gets me back the 2nd row (pending). – realph Jul 08 '22 at 14:20
  • Check the edited answer. – lemon Jul 08 '22 at 14:32
  • If I add another entry to that demo with `fulfilled`, it still only returns the `accepted` record. https://www.db-fiddle.com/f/evCCPZ16GefHJ6PGBLP3z2/0 – realph Jul 10 '22 at 12:06
  • You should update your post with fully-standalone borderline sample data with **at least** ten rows on different "*event_id*" values and the corresponding expected output, otherwise we can have back and forth of answers and wrong outcomes, wasting others', my and your very own time. – lemon Jul 10 '22 at 14:04
  • Updated original post. Thanks! – realph Jul 11 '22 at 10:58