1

I have the following product data (for an online shop):

ProductId  ProductOptionGroupId  ProductOptionId
26         1                     13
26         1                     12
44         1                     22
44         1                     23
44         2                     20
44         2                     21
44         3                     25
44         3                     24

Where a ProductOptionGroup would be (say) "Size" or "Colour", and the ProductOption would be (say) "Large", "Extra Large" and "Red", "Black" etc.

Basically, I want to find all possible product option combinations for each product. For example, for product 44, I'd want:

22, 20, 25  (Large, Black, Cotton)
22, 20, 24  (Large, Black, Nylon)
22, 21, 25  (Large, Red, Cotton)
22, 21, 24  (Large, Red, Nylon)
23, 20, 25  (Extra Large, Black, Cotton)
23, 20, 24  etc...
23, 21, 25
23, 21, 24

Only one product option from each product option group for each row. I.e. Large and Extra large are mutually exclusive.

Ideally, I'd like these values concatenated into a single VARCHAR for each product ("22,21,25" etc).

How can this be achieved in SQL Server 2005?

Thanks

Leigh Bowers
  • 707
  • 10
  • 22
  • [Cross join](http://msdn.microsoft.com/en-us/library/ms190690.aspx) maybe? – Magnus Jun 24 '11 at 12:20
  • 2
    Duplicate of your previous question with three answers and you accepted one of them. [How to generate all possible data combinations in SQL?](http://stackoverflow.com/questions/4757739/how-to-generate-all-possible-data-combinations-in-sql) – gbn Jun 24 '11 at 12:22
  • Ah. I'd somehow missed that when searching for possible solutions (I'd completely forgotten I'd asked this previously). Suffice to say, that I never got it working. The requirement is slightly different too. – Leigh Bowers Jun 24 '11 at 12:28

3 Answers3

5
WITH
  data (ProductId, ProductOptionGroupId, ProductOptionId) AS (
    /* defining sample data */
    SELECT 26, 1, 13 UNION ALL
    SELECT 26, 1, 12 UNION ALL
    SELECT 44, 1, 22 UNION ALL
    SELECT 44, 1, 23 UNION ALL
    SELECT 44, 2, 20 UNION ALL
    SELECT 44, 2, 21 UNION ALL
    SELECT 44, 3, 25 UNION ALL
    SELECT 44, 3, 24
  ),
  ranked AS (
    /* ranking the group IDs */
    SELECT
      ProductId,
      ProductOptionGroupId,
      ProductOptionId,
      GroupRank = DENSE_RANK() OVER (PARTITION BY ProductId
                                         ORDER BY ProductOptionGroupId)
    FROM data
  ),
  crossjoined AS (
    /* obtaining all possible combinations */
    SELECT
      ProductId,
      GroupRank,
      ProductVariant = CAST(ProductOptionId AS varchar(250))
    FROM ranked
    WHERE GroupRank = 1
    UNION ALL
    SELECT
      r.ProductId,
      r.GroupRank,
      ProductVariant = CAST(c.ProductVariant + ','
        + CAST(r.ProductOptionId AS varchar(10)) AS varchar(250))
    FROM ranked r
      INNER JOIN crossjoined c ON r.ProductId = c.ProductId
                              AND r.GroupRank = c.GroupRank + 1
  ),
  maxranks AS (
    /* getting the maximum group rank value for every product */
    SELECT
      ProductId,
      MaxRank = MAX(GroupRank)
    FROM ranked
    GROUP BY ProductId
  )
/* getting the max ranked combinations for every product */
SELECT c.ProductId, c.ProductVariant
FROM crossjoined c
  INNER JOIN maxranks m ON c.ProductId = m.ProductId
                       AND c.GroupRank = m.MaxRank

Output:

ProductId   ProductVariant
----------- --------------
26          12
26          13
44          22,20,24
44          22,20,25
44          22,21,24
44          22,21,25
44          23,20,24
44          23,20,25
44          23,21,24
44          23,21,25

Useful reading:

Andriy M
  • 76,112
  • 17
  • 94
  • 154
  • Brilliant! Thank you. I'll now spend some time looking at this to try to understand how you solved my problem... – Leigh Bowers Jun 27 '11 at 10:01
  • 1
    @Milky Joe: Updated the answer with a couple of links. Sorry for having not done that at first. – Andriy M Jun 27 '11 at 11:51
  • You can make this query faster if you calculate `MaxRank` in the `ranked` CTE. – STiLeTT Oct 15 '16 at 07:29
  • Yes, it would probably be faster that way. However, you can't calculate `MaxRank` in the `ranked` CTE. At least I can't see any other way to do so than using `COUNT(DISTINCT …)` as a window function, and SQL Server doesn't support that (not the 2005 version anyway). I'd be happy to learn about other ways, though. – Andriy M Oct 15 '16 at 10:03
  • You can wrap the query in `ranked` CTE in another one and use MAX(GroupRank) OVER (PARTITION BY ProductId). That way you won't need to join with ranked in the main query. – STiLeTT Oct 15 '16 at 17:05
  • 1
    @STiLeTT: Ah yes, I can see that now, thank you. It could be a derived table but could also be another CTE (which would then replace `ranked` as the base for `crossjoined`). The `MaxRank` would simply be carried over in `crossjoined` till it could be filtered on in the main SELECT without any join, just as you said. And I've just remembered that I've recently seen [this DBA.SE answer](http://dba.stackexchange.com/a/151853/6965) suggesting a neat method to calculate `COUNT(DISTINCT)` without derived tables, albeit for non-null data only (might still be applicable in this case). – Andriy M Oct 15 '16 at 18:53
0

Sample:

declare @t table(id int, type1 int, type2 int)

insert @t values(1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 2, 1)

select distinct t1.id, t1.type1, t2.type2
from
(
    select id, type1
    from @t
)t1
full join
(
    select id, type2
    from @t
)t2 on t2.id = t1.id

Output:

id          type1       type2
----------- ----------- -----------
1           1           1
1           1           2
1           2           1
1           2           2
2           2           1
Kirill Polishchuk
  • 54,804
  • 11
  • 122
  • 125
0

The SQL would depend upon your table structure. If the columns are stored in separate tables, then a simple cartesian product (join with no criteria) should yield the desired results.

Kevin
  • 55
  • 1
  • 4