1

Let's suppose that any given client can have multiple boxes. Each box can contain multiple items as well as multiple boxes (sub-boxes).

BoxA -> Item1, Item2, BoxB, BoxC

Unfortunately, due to the business rules, it's possible to create a circular cycle.

BoxA -> Item1, BoxB, BoxC  
BoxB -> BoxA, BoxD

As you can see, BoxA contains BoxB, and BoxB contains BoxA.

The problem I'm attempting to solve is to get all the sub-boxes for a given list of boxes in a client.
So if I was looking for the sub-boxes for BoxA (from the previous example), I would get the following: BoxB, BoxC, BoxA, BoxD.

This is what I have so far:

WITH box_info AS (
    -- This is typically a bit more complicated, that's why I have it in a seperate WITH clause
    SELECT sub_box_id
    FROM client_box
    WHERE box_id = 1
),
all_sub_boxes(sub_box_id) AS (
    SELECT sub_box_id
    FROM box_info
    WHERE sub_box_id IS NOT NULL

    UNION ALL

    SELECT cb.sub_box_id
    FROM client_box cb, all_sub_boxes asb
    WHERE cb.box_id = asb.sub_box_id AND cb.sub_box_id IS NOT NULL
    -- AND cb.sub_box_id NOT IN (SELECT sub_box_id FROM all_sub_boxes)
)
SELECT sub_box_id FROM all_sub_boxes;

However, since it's possible to get stuck in a recursive loop, the "all_sub_boxes" WITH clause will fail. The commented out line is what I would intuitively put in since it prevents already visited sub-boxes from getting added to the recursive list, but it seems that we can't reference "all_sub_boxes" from within.

So essentially, I need a way to not include already included sub-boxes in the recursive query.
Perhaps I could create a temp table? But I don't even know if it's possible to insert into a table during a recursive query. And additionally, what is the cost each time this query is run if we do create a temp table every time?

I'm trying to write this query so that it could be used across different commercial DBs, so if I can avoid non-standard sql, that would be great. But I understand that if it's not possible, then it is what it is.

Edit

For the sake of clarity, here's how the client_box table might look:

+--------+---------+------------+
| BOX_ID | ITEM_ID | SUB_BOX_ID |
+--------+---------+------------+
| BoxA   | Item1   | (null)     |
| BoxA   | (null)  | BoxB       |
| BoxA   | (null)  | BoxC       |
| BoxB   | (null)  | BoxA       |
| BoxB   | (null)  | BoxD       |
+--------+---------+------------+
halfer
  • 19,824
  • 17
  • 99
  • 186
gjvatsalya
  • 1,129
  • 13
  • 29
  • "I was looking for the sub-boxes for BoxA" infers that BoxA is above (parent of) any "sub-boxes" in a hierarchy tree. So you should return B,C,D sub-boxes (children) I would think. Anyway, it sounds like you want a hierarchical tree query (possibly with nocycle). See [here](https://docs.oracle.com/cd/B28359_01/server.111/b28286/queries003.htm#SQLRF52315) for more – tbone Nov 20 '17 at 17:32
  • @tbone, yes, that's accurate. CONNECT BY is an oracle specific keyword though, right? Is there a way to write this query without something specific to Oracle? – gjvatsalya Nov 20 '17 at 17:35
  • No offense, but I never understood the idea that you must strive for generic vanilla flavored SQL at all costs. I assume your company is paying for Oracle licenses (based on your questions oracle tag), so use the tools you are already paying for. There may be hacks involving multiple layers of joins, or complicated (potentially buggy) procedural code, but if your database offers an out of the box solution, I would use that first. – tbone Nov 20 '17 at 17:51
  • That said, I believe other major db either support or will support the concept of hierarchical queries ( [MySQL 8.0](http://mysqlserverteam.com/mysql-8-0-labs-recursive-common-table-expressions-in-mysql-ctes-part-three-hierarchies/) for example) – tbone Nov 20 '17 at 17:52
  • @tbone Only reason I'm asking is because we're probably switching away from Oracle. That's why I'm trying to see if it's possible since I don't want to go through the headache of translating it. But as I said, if it's not possible, then that's okay, I'll just use the CONNECT BY clause. – gjvatsalya Nov 20 '17 at 17:54
  • Please explain the structure of your data. This means: table name - column names and data types - relationships between those columns (parent box in column A, items or child boxes in Column B, for example). You explained the setup in words, but it is not entirely clear how to map words like "client can **have** multiple boxes" - how is **have** reflected in database concepts? Same with your notation `->` - what does that mean in the database? –  Nov 20 '17 at 18:03
  • Aside from that, everything you can do with CONNECT BY (hierarchical Oracle) queries you can also do with recursive CTE (recursive WITH clause), which has been in the SQL Standard for some time. Oracle implemented the recursive WITH clause in version 11.2. Other db vendors have also implemented it, while still others have not. So... what db product may you move to, when you will switch away from Oracle? –  Nov 20 '17 at 18:05
  • @mathguy Sorry about that, I've edited my question and added the table markdown. Let me know if you need any other information. You can just ignore the client info that I mentioned in text, I probably should not have included that. And I believe my query draft is a recursive CTE, right? I'm not 100% sure. Also - we're probably going to switch to postgres, but nothing official yet. – gjvatsalya Nov 20 '17 at 18:12
  • Yes, it is! It looks very much like the solution I am about to post; the only help you seem to need is on how to deal with cycles. See what I do in the solution; then read about the CYCLE clause to recursive CTE. –  Nov 20 '17 at 18:25

1 Answers1

2

You were on the right track, and you seem to just need a little help dealing with cycles. See the CYCLE clause right at the end of the recursive CTE definition (even though the CYCLE clause comes AFTER the closing parenthesis for the recursive CTE, it is still part of it):

with
-- Begin simulated data.
  client_box ( box_id, item_id, sub_box_id ) as (
    select 'BoxA', 'Item1', null   from dual union all
    select 'BoxA', null   , 'BoxB' from dual union all
    select 'BoxA', null   , 'BoxC' from dual union all
    select 'BoxB', null   , 'BoxA' from dual union all
    select 'BoxB', null   , 'BoxD' from dual
  ),
-- End of simulated data (for testing only, not part of the solution).
-- SQL query consists of the keyword WITH (above) and the lines below.
-- Use your actual table and column names.
-- Use whatever mechanism works for you in the ANCHOR branch of r (below).
  r ( box_id ) as (
    select  'BoxA' from dual   --  Modify this for inputs
    union all
    select  c.sub_box_id
      from  client_box c join r on c.box_id = r.box_id
      where c.sub_box_id is not null
  )
  cycle box_id set cycle to 1 default 0
select box_id
from   r
where  cycle = 0
;

BOX_ID
------
BoxA
BoxB
BoxC
BoxD
  • Thank you so much for posting this. However, I'm trying to find more information about the CYCLE keyword, and there's nothing definite about it. I'm probably just looking in the wrong places. Could you give me more information or point me to the proper documentation? Also, is it an Oracle specific keyword? – gjvatsalya Nov 20 '17 at 19:15
  • @gjvatsalya - See if this helps. Also, find out if PostgreSQL now has anything like the cycle clause, it didn’t eight years ago, it seems. https://stackoverflow.com/questions/1731889/cycle-detection-with-recursive-subquery-factoring –  Nov 20 '17 at 20:12