2

Lets say I have a table for a bill of material that looks like the following:

+----------+-------------+
| ParentNo | ComponentNo |
+----------+-------------+
| AAA      |         101 |
| AAA      |         102 |
| BBB      |         201 |
| BBB      |         202 |
| CCC      |         101 |
| CCC      |         201 |
| CCC      |         301 |
| DDD      |         101 |
| DDD      |         102 |
+----------+-------------+

I would like to write a query that find any parent numbers that share identical sets of components. I am flexible on the output, but my initial idea is that it would be similar to:

+----------+---------+
| ParentNo | Matched |
+----------+---------+
| AAA      | DDD     |
| DDD      | AAA     |
+----------+---------+

There are multiple ways to find individual records that share the same value for one or multiple fields as discussed in Finding duplicate values in a SQL table. But I only want to return a result when all the components for two different parents match.

The closest solution I have seen to this is SQL selecting rows where one column's value is common across another criteria column. This does not satisfy my need because a set of specific ComponentNo's must be provided. I am trying to avoid manually entering in the ComponentNo's for each individual ParentNo (the datasets that I am working with contain tens of millions of records).

I feel like some form of a recursive query might be appropriate in this situation, but my attempts have not yielded any queries that seem like they are on the right path.

Community
  • 1
  • 1
cornking
  • 25
  • 8

4 Answers4

0

You can use LISTAGG to concatenate all the components for a parent. Then use a self join to check the parents having the same components.

with grouped_comps as (
select parentno, LISTAGG(cast(componentno as varchar(10000)), ',') WITHIN GROUP(ORDER BY componentno) as comp_all
from t
group by parentno)
select g1.parentno,g2.parentno
from grouped_comps g1
join grouped_comps g2 on g1.parentno<>g2.parentno and g1.comp_all=g2.comp_all
Vamsi Prabhala
  • 48,685
  • 4
  • 36
  • 58
  • I think you should put `WITHIN GROUP (ORDER BY)` with `LISTAGG` – Gurwinder Singh Dec 21 '16 at 17:50
  • It looks like LISTAGG is not recognized by the database that I am using. Is there an alternative to this function that would be appropriate for this query? – cornking Dec 21 '16 at 23:13
  • what's your dbms then? is it not db2? because the [documentation](http://www.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r0058709.html) says this is a valid function. – Vamsi Prabhala Dec 21 '16 at 23:19
  • It looks like it's DB2 z/OS 10.1.5. I am getting the following error: – cornking Dec 21 '16 at 23:32
  • I guess I'm not allowed to edit my comments, but the error code is `1) [Error Code: -199, SQL State: 42601] DB2 SQL Error: SQLCODE=-199, SQLSTATE=42601, SQLERRMC=GROUP;;, FROM INTO, DRIVER=4.18.60. 2) [Error Code: -514, SQL State: 26501] DB2 SQL Error: SQLCODE=-514, SQLSTATE=26501, SQLERRMC=SQL_CURLH200C1, DRIVER=4.18.60` – cornking Dec 21 '16 at 23:34
  • have you replaced in the query the exact tablenames and column names? – Vamsi Prabhala Dec 21 '16 at 23:40
  • @vkp - Yes, I have substituted the exact table names and column names into the query. I have also tried using only `select parentno, LISTAGG(cast(componentno as varchar(10000)), ',') WITHIN GROUP(ORDER BY componentno) as comp_all from myTable` in order to see if I could get the `LISTAGG` function to work, but I am still getting the same error showing up. It is also worth mentioning that I simplified the sample table in my question. The table that I am querying has more than two fields - I'm not sure if this makes a difference. – cornking Dec 21 '16 at 23:58
  • did you omit `group by`? – Vamsi Prabhala Dec 22 '16 at 00:00
  • @vkp Sorry about that, I did not omit `GROUP BY`. I have tried omitting `WITHIN GROUP(ORDER BY ComponentNo) as comp_all`. I feel like there is something about the `varchar` declaration that it isn't liking. Just for reference, ComponentNo is a `CHAR(10)` datatype in the table. – cornking Dec 22 '16 at 00:14
  • sure..can you remove the `cast` and try it? like `LISTAGG(componentno, ',') WITHIN GROUP(ORDER BY componentno)` – Vamsi Prabhala Dec 22 '16 at 00:16
  • @vkp Without the `cast` I still get the same error. I tried using `varchar(LISTAGG (ComponentNo, ',') WITHIN GROUP (ORDER BY ComponentNo), 10000) as comp_all` and I get a different error: `1) [Error Code: -104, SQL State: 42601] DB2 SQL Error: SQLCODE=-104, SQLSTATE=42601, SQLERRMC=WITHIN;;OVER CONCAT || / - + OPTION AT * ORDER , AS YEAR YEARS MONTH, DRIVER=4.18.60. 2) [Error Code: -514, SQL State: 26501] DB2 SQL Error: SQLCODE=-514, SQLSTATE=26501, SQLERRMC=SQL_CURLH200C1, DRIVER=4.18.60` – cornking Dec 22 '16 at 00:20
  • see my comment above..what you just did is invalid – Vamsi Prabhala Dec 22 '16 at 00:21
  • @vkp I first tried `LISTAGG(componentno, ',') WITHIN GROUP(ORDER BY componentno)` as you suggested above and that returned the same error. After that, I tried the modified code in my comment which returned a different error. I have used `varchar(table.column, 123) as foo` in a previous unrelated recursive query and it was valid. I'm guessing that format is not compatible with the `LISTAGG` function? – cornking Dec 22 '16 at 00:36
0

try Something like this

--Create temprary table with rownumber
with tmp0 as (
select f0.*, rownumber() over(partition by parentno order by componentno) rang, (select count(*) from jgrun.tmp1222 f1 where f1.parentno=f0.parentno) as nb   
from yourlib.yourtable f0
),

--Create temporary table recurse with componentno concatenante
tmp1 (parentno, componentno, listcomponentno, rang, nb, rgcal) as (
select parentno, componentno,  cast(tmp0.componentno  as varchar(500)), rang, nb, 1 from tmp0
union all
select tmp0.parentno,  tmp0.componentno,  cast(tmp0.componentno || ', ' || tmp1.listcomponentno as varchar(500)), tmp0.rang, tmp0.nb, tmp1.rgcal+1 
from tmp0 inner join tmp1 on tmp0.parentno=tmp1.parentno  and tmp1.rang-1=tmp0.rang
),
--Select last concatenation
tmp2 as (
select * from tmp1
where nb=rgcal
)
--Cross parent different and same list component no
select f1.parentno, f2.parentno, f1.listcomponentno
 from tmp2 f1 inner join tmp2 f2 on f1.parentno<>f2.parentno and f1.listcomponentno=f2.listcomponentno
Esperento57
  • 16,521
  • 3
  • 39
  • 45
0

Another possibility:

with data (bk,komp) as (values
('AAA', 101),('AAA', 102),('BBB', 201),
('BBB', 202),('CCC', 101),('CCC', 201),
('CCC', 301),('DDD', 101),('DDD', 102)
)
select d1.bk,d2.bk
from   data d1
inner  join data d2
on     d1.bk <> d2.bk
group  by d1.bk,d2.bk
having count(distinct d1.komp) = 
       count(case when d1.komp = d2.komp then 1 end)
user2398621
  • 86
  • 1
  • 3
-1
 SELECT Parent.ParentNo,Child.ParentNo FROM 
 TABLE AS Parent
 LEFT JOIN TABLE as Child
 ON Child.ComponentNo = Parent.ComponentNo
 Where Parent.ParentNo != Child.ParentNo
LONG
  • 4,490
  • 2
  • 17
  • 35