3

I have a table of shipments defined like so (the table is stored in a HANA database, if relevant):

CREATE COLUMN TABLE SHIPMENTS (
    ShipmentID INT PRIMARY KEY,
    Received INT,
    Facility NVARCHAR(10),
    Item NVARCHAR(20)
);

Here, the 'Received' column denotes the point in time at which each shipment is received, Facility is where the shipment is received and Item is the content of the shipment.

I have filled it with data like so:

INSERT INTO SHIPMENTS VALUES (1, 0, 'Factory', 'Production machine');
INSERT INTO SHIPMENTS VALUES (2, 0, 'Office', 'Printer');
INSERT INTO SHIPMENTS VALUES (3, 0, 'Factory', 'Coffee maker');
INSERT INTO SHIPMENTS VALUES (4, 1, 'Office', 'Coffee maker');
INSERT INTO SHIPMENTS VALUES (5, 1, 'Factory', 'Fax Machine');
INSERT INTO SHIPMENTS VALUES (6, 2, 'Office', 'Computers');
INSERT INTO SHIPMENTS VALUES (7, 2, 'Factory', 'Fridge');
INSERT INTO SHIPMENTS VALUES (8, 2, 'Factory', 'Freezer');
INSERT INTO SHIPMENTS VALUES (9, 2, 'Office', 'Fax Machine');

I would like to query the database to find, at each point in time, which items have been received up until that point. Based on an answer from another thread, I start by doing this:

SELECT Facility, Received, STRING_AGG (Item, ';') as Items
    FROM (
        SELECT * FROM SHIPMENTS 
        ORDER BY Facility, Received
    )
GROUP BY Facility, Received
ORDER BY Facility, Received;

which results in

  | FACILITY | RECEIVED | ITEMS
---------------------------------------------------------
1 | Factory  | 0        | Production Machine;Coffee maker
2 | Factory  | 1        | Fax Machine
3 | Factory  | 2        | Fridge;Freezer
4 | Office   | 0        | Printer
5 | Office   | 1        | Coffee maker
6 | Office   | 2        | Computers;Fax Machine

However, I would like this

  | FACILITY | RECEIVED | ITEMS
---------------------------------------------------------
1 | Factory  | 0        | Production Machine;Coffee maker
2 | Factory  | 1        | Production Machine;Coffee maker;Fax Machine
3 | Factory  | 2        | Production Machine;Coffee maker;Fax Machine;Fridge;Freezer
4 | Office   | 0        | Printer
5 | Office   | 1        | Printer;Coffee maker
6 | Office   | 2        | Printer;Coffee maker;Computers;Fax Machine

I.e, each row displays what is received at that point, and everything that has already been received. Is there a nice way to do this in SQL?

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
Kazim
  • 133
  • 1
  • 5
  • What have you tried and what error did you get? – Nathaniel Flick Dec 22 '17 at 07:59
  • @NathanielFlick The OP actually did show what they tried, and there is no error, other than the output is not what is desired. – Tim Biegeleisen Dec 22 '17 at 08:00
  • @TimBiegeleisen "Is there a nice way to do this in SQL?" seems to me this question is open ended and not tested? – Nathaniel Flick Dec 22 '17 at 08:01
  • @NathanielFlick It's not open ended at all, the OP gave us the output he wants. I tried a query below, completely untested as I don't know HANA. I'll wait for feedback. – Tim Biegeleisen Dec 22 '17 at 08:05
  • @TimBiegeleisen Fair enough. – Nathaniel Flick Dec 22 '17 at 08:05
  • Why did you use the `SQL Server` tag if you want to ask about *HANA* ? The product matters, a lot. For example SQL Server has windowing fuctions and can calculate a running total, eg `SUM(TicketCount) OVER (ORDER BY [Date] RANGE UNBOUNDED PRECEDING)`. You can probably use it with SQL Server 2017's STRING_AGG too. Does HANA have the OVER clause? – Panagiotis Kanavos Dec 22 '17 at 08:21
  • @PanagiotisKanavos sorry about the mis-tagging, I am pretty new to both sql and HANA and get the flavors of sql mixed up. I finally got the query right (based on the (now deleted (?)) answer by Tim Biegeleisen and will add an answer with the working query. – Kazim Dec 22 '17 at 08:32
  • @Kazim You can edit my answer or just post your own. If the latter, I'll kill my answer again. – Tim Biegeleisen Dec 22 '17 at 08:32
  • @Kazim if you want people that know HANA to even see the question you have to use the `hana` tag. There are thousands of new questions per minute. The home page displays questions based on the reader's favorite tags – Panagiotis Kanavos Dec 22 '17 at 08:36
  • @PanagiotisKanavos He gave the Hana tag...I assumed it was a Hana question and not SQL Server, but yes it was strange seeing those two tags together. – Tim Biegeleisen Dec 22 '17 at 08:37
  • @TimBiegeleisen I added the HANA tag and removed `sql-server`. And as I said, the home page filters questions. I show this question only because it have `sql-server` and `sql` in my favorite tags. Someone who has `hana` wouldn't see it. – Panagiotis Kanavos Dec 22 '17 at 08:42
  • @PanagiotisKanavos Fair enough...yeah I usually see you dwelling on the Microsoft sides of things :-) – Tim Biegeleisen Dec 22 '17 at 08:43

2 Answers2

1

You can try using a correlated query in the select clause to generate the csv data you want:

SELECT
    Facility,
    Received,
    (SELECT STRING_AGG (s2.Item, ';') FROM SHIPMENTS s2
     WHERE s2.Facility = s1.Facility AND s2.Received <= s1.Received
     GROUP BY s2.Facility) AS ITEMS
FROM SHIPMENTS s1
GROUP BY
    Facility,
    Received
ORDER BY
    Facility;
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Doing this I get error message: `correlated subquery cannot have TOP or ORDER BY`. Just to check, I remove `ORDER BY s2.Received`, which gives me the following error: `single-row query returns more than one row`. Any ideas? – Kazim Dec 22 '17 at 08:09
  • @Kazim Try using `GROUP BY s1.Received` in the subquery, and remove the `ORDER BY` clause there (which it sounds like you have already done). – Tim Biegeleisen Dec 22 '17 at 08:13
  • this actually produced exactly the same as the attempt I did in the OP, except not ordered by Facility – Kazim Dec 22 '17 at 08:17
  • @Kazim My last try: `GROUP BY s2.Facility` ... this might make sense because we want a single CSV string for each facility, across receipts. I have never used HANA by the way, so I cannot test this answer. – Tim Biegeleisen Dec 22 '17 at 08:21
  • Your logic seems sound to me, but unfortunately the result is exactly the same as the one in the OP. – Kazim Dec 22 '17 at 08:25
  • I think the problem is the way `STRING_AGG` works...it's not behaving as we would expect. Deleting this answer... – Tim Biegeleisen Dec 22 '17 at 08:26
  • I managed to get it working as desired. I just changed `s1.Items` (inside `STRING_AGG`) into `s2.items`, and with `GROUP BY s2.Facility` inside the subquery. To get my exact desired result I also added `ORDER BY Facility` to the very end. Thanks for the help. – Kazim Dec 22 '17 at 08:36
1

Maybe it could be a good idea to use ORDER BY clause with String_Agg function to make sure that the concatenation will be in desired order

select 
distinct Facility, Received,
(
    select string_agg(s.Item, ';' order by Received, ShipmentID) 
    from Shipments s 
    where 
        s.Facility = t.Facility and
        s.Received <= t.Received
    ) as Items
from Shipments t
Eralper
  • 6,461
  • 2
  • 21
  • 27