3

I have table with following data :

Order_ID status_update status_date
A Received 01/01/2020
A Pending 01/05/2020
A Processing 01/07/2020
A Delivered 01/15/2020
B Received 02/01/2020
C Received 02/15/2020
C Delivered 02/20/2020
D Received 05/01/2020
D Delivered 05/10/2020

Now, I would like to get full order history between 01/15/2020 to 04/25/2020 with last status update is Delivered.

Result would look like :

Order_ID status_update status_date
A Received 01/01/2020
A Pending 01/05/2020
A Processing 01/07/2020
A Delivered 01/15/2020
C Received 02/15/2020
C Delivered 02/20/2020

How can i write an SQL query to get this result?

kn9
  • 93
  • 1
  • 9
  • `with last status update is Delivered` leaves room for interpretation. You later commented `from 2020-01-15 and still get full history of the order` Indicates you want the full order history of orders that were *delivered* in a given time frame. Please clarify the question. – Erwin Brandstetter Aug 10 '21 at 22:10

4 Answers4

3

Assuming you want the history for all orders that were delivered between 2020-01-01 and 2020-04-25:

SELECT *
FROM   orders o
WHERE  EXISTS (
   SELECT FROM orders o1
   WHERE  o1.order_id = o.order_id
   AND    o1.status_update = 'Delivered'
   AND    o1.status_date BETWEEN '2020-01-15' AND '2020-04-25'
   )
ORDER  BY order_id, status_date

That includes rows before 2020-01-15, as long as the delivery date is in the time frame. Add another predicate to the outer SELECT to cut off rows before 2020-01-01 (or any date):

AND    o.status_date >= '2020-01-15'

db<>fiddle here

Aside: always use unambiguous ISO format for dates.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 2
    "...always use unambiguous ISO format for dates..." +1 just for that. – The Impaler Aug 10 '21 at 02:07
  • Both answers are great. I just don't know how to make 2 answers as accepted :) – kn9 Aug 10 '21 at 21:30
  • @kn9: Choosing is your privilege. I'll say this, though: *db<>fiddle [here](https://dbfiddle.uk/?rdbms=postgres_13&fiddle=2049ef8630c6bd623b2cb1615d49ca8a)* About performance: test with EXPLAIN ANALYZE, or see: https://stackoverflow.com/a/9065976/939860 – Erwin Brandstetter Aug 11 '21 at 01:37
3

You can use something like this:

select * from test a
inner join (
  select distinct order_id from test
  where status_update = 'Delivered'
) b on a.order_id = b.order_id
where status_date between '2020-01-01' and '2020-04-25'

Example: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=fddcc53f80e94c2c1c48f4e557904505

Result:

order_id | status_update | status_date | order_id
:------- | :------------ | :---------- | :-------
A        | Received      | 2020-01-01  | A       
A        | Pending       | 2020-01-15  | A       
A        | Processing    | 2020-01-07  | A       
A        | Delivered     | 2020-01-15  | A       
C        | Received      | 2020-02-15  | C       
C        | Delivered     | 2020-02-20  | C       

The same query will work for:

Edit

If you want to get all records for order ID that got delivered between 15th Jan and 25th April, you can do this:

select * from test a
inner join (
  select distinct order_id from test
  where status_update = 'Delivered'
  and status_date between '2020-01-15' and '2020-04-25'
) b on a.order_id = b.order_id

That'll give you the result, I believe, you desire.

Example: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=5e0d226f455068fafa54228837a8313f

zedfoxus
  • 35,121
  • 5
  • 64
  • 63
  • I would argue that the `DISTINCT` is not really necessary since PostgreSQL would add it by itself automatically. Anyway +1 for all three solutions. – The Impaler Aug 10 '21 at 02:06
  • @zedfoxus, Thank you - Is it possible to change this query someway if i query for delivered order data from 2020-01-15 and still get full history of the order? https://dbfiddle.uk/?rdbms=postgres_13&fiddle=1edfe83a0f19522338add1442d52d1fe – kn9 Aug 10 '21 at 02:26
  • @kn9 Yes, I think I understand what you are asking. Let me add an edit along with an example to show you how you can tweak the query to get the results I think you desire. – zedfoxus Aug 10 '21 at 04:07
1

You don't mention the specific database you are using, but the query below should work on most databases:

select *
from t
where status_date between date '2020-01-01' and date '2020-04-25'
  and order_id in (
  select order_id
  from t
  where status_update = 'Delivered'
);

Result:

 order_id  status_update  status_date 
 --------- -------------- ----------- 
 A         Received       2020-01-01  
 A         Pending        2020-01-15  
 A         Processing     2020-01-07  
 A         Delivered      2020-01-15  
 C         Received       2020-02-15  
 C         Delivered      2020-02-20  
The Impaler
  • 45,731
  • 9
  • 39
  • 76
1

Try the following query:

SELECT Order_ID, status_update, status_date
FROM (
  SELECT
    *,
    SUM(CASE status_update WHEN 'Delivered' THEN 1 ELSE 0 END) OVER(
      PARTITION BY Order_ID
    ) AS with_delivery
  FROM test
  WHERE status_date BETWEEN '2020-01-01' and '2020-04-25'
) AS t 
WHERE with_delivery > 0;

The result is as you expect:

+----------+---------------+-------------+
| order_id | status_update | status_date |
+----------+---------------+-------------+
| A        | Received      | 2020-01-01  |
| A        | Pending       | 2020-01-15  |
| A        | Processing    | 2020-01-07  |
| A        | Delivered     | 2020-01-15  |
| C        | Received      | 2020-02-15  |
| C        | Delivered     | 2020-02-20  |
+----------+---------------+-------------+

db<>fiddle

Andrei Odegov
  • 2,925
  • 2
  • 15
  • 21