1

I need a query which can be used in (or as) a function and retrieves N combinations of M values from a table.

Example: Input: table with values in one column in multiple rows

Case 1

N=2
M=4 (Record1 to Record4)

Table

Record1
Record2
Record3
Record4

Output

Record1
Record2
Record3
Record4
Record1,Record2
Record1,Record3
Record1,Record4
Record2,Record3
Record2,Record4
Record3,Record4

Case 2

N=3
M=4 (Record1 to Record4)

Table

Record1
Record2
Record3
Record4

Output

Record1
Record2
Record3
Record4
Record1,Record2
Record1,Record3
Record1,Record4
Record2,Record3
Record2,Record4
Record3,Record4
Record1,Record2,Record3
Record1,Record2,Record4
Record1,Record3,Record4
Record2,Record3,Record4

I am using this question as base code for execution

iminiki
  • 2,549
  • 12
  • 35
  • 45
  • Rather telling us you *"need"*, please tell us what the question you have is. What have *you* tried to do to solve this problem yourself? Why didn't it work? – Thom A Oct 04 '19 at 08:02
  • 1
    Interesting question. It will be a recursive CTE plus dynamic pivot I think. – TomC Oct 04 '19 at 08:23

3 Answers3

4

If only a fixed amount of N values per combination is needed, then it can easily be done in a normal SQL.

Simply by using N-1 self-joins.

For example if N = 3 then 2 self-joins :

SELECT 
CONCAT(t1.name, ',', t2.name, ',', t3.name) AS names
FROM yourtable t1
JOIN yourtable t2 ON t2.name > t1.name
JOIN yourtable t3 ON t3.name > t2.name;

Because of the use of > in the joins, that wouldn't return duplicates of the same combinations in a different order.
(Since A,B,C = A,C,B = B,A,C = B,C,A = C,A,B = C,B,A)

If N is variable, then such method could be used in a Dynamic Sql that adds N-1 joins to the query string.

However, to get the expected output of the question, to return also N=1 & N=2 & N=3 then we could use that trick in combination with a Recursive CTE.

For example this T-SQL snippet:

declare @yourtable table ([name] varchar(30));
insert into @yourtable ([name]) values 
('Record1'), 
('Record2'), 
('Record3'), 
('Record4');

WITH RCTE AS 
(
    SELECT 1 as N, t.name as prev_name, cast(t.name as varchar(max)) AS names
    FROM @yourtable t

    UNION ALL

    SELECT N + 1, t.name, CONCAT(r.names,','+ t.name)
    FROM @yourtable t
    JOIN RCTE r ON t.name > r.prev_name AND r.N < 3
)
SELECT names
FROM RCTE
ORDER BY N, names;

Returns:

names
------------------------
Record1
Record2
Record3
Record4
Record1,Record2
Record1,Record3
Record1,Record4
Record2,Record3
Record2,Record4
Record3,Record4
Record1,Record2,Record3
Record1,Record2,Record4
Record1,Record3,Record4
Record2,Record3,Record4
LukStorms
  • 28,916
  • 5
  • 31
  • 45
2

Here I implemented a traversal algorithm through the combination of names in an alphabetically sorted order up to N names. The algorithm does not traverse through all 2^N-1 combinations as in the base code mentioned.

The algorithm is explained as follows. Denote the i-th name is represented by a digit i (starting from 1), then the combination of names can be represented by a sequence of strictly increasing digits. The strictly increasing constraint eliminates different permutations from the same combination.

Now we can traverse through these representations in an alphabetically increasing order as shown below (like in a real dictionary). One can clearly see, there are 2 keys to the traversal: (1) when to do a round-up and (2) doing it subject to the strictly-increasing constraint. The rest is to implement them correctly in the code.

Disclaimer: Low efficiency, maintainability and scalability! It barely runs on small dataset (38 sec for 1350 combinations from M=20 and N=3). Consider doing such task in a general purpose programming language especially when the dataset is large. e.g. A pythonic solution with only 4 lines

A concept of what the algorithms does

Combination of names
---------------------
1      -- 1 means 'name1'
2
...
n      -- Round up to 2-digit combinations after this number
1,2    -- The lowest digit begins at 2 because (1,1) violates the constraint
1,3
...
1,n
2,3    -- begins at 3 because 2,1 and 2,2 violates the constraint
2,4
....
2,n
.....
(n-1),n  -- Can't round up to (n,n) so here ends the combinations of 2 digits.
1,2,3    -- In general, round-up gives a consecutive increasing sequence 
         -- starting from the rounding digit
1,2,4
....     -- repeat until all wanted combinations are exhausted. 

And here is the T-SQL code

Test Dataset and Parameters

use [testdb];
if OBJECT_ID('testdb..test') is not null
    drop table testdb..test;

create table test (
    [name] VARCHAR(10)
);
GO

-- name list, need not be sorted
insert into test([name]) 
values ('name1'),('name2'),('name3'),('name4'),('name5');

-- Number of names used in the combination
declare @N int = 3;  

Program

/***** temp variables *****/

-- get number of names
declare @M int;
select @M = count(distinct [name]) from test;
--print @M;

-- array (table variable) of temporary status
declare @arr table (
    digit int primary key,
    val int,
    [name] varchar(max)
)
-- the value of each digits begins from 0
insert @arr(digit, val, [name])
select ROW_NUMBER() over(order by [name]), 0, [name] from test;

-- begin traversal from the first name
update @arr
    set val = 1
    where digit = 1;
-- check:
-- select * from @arr;

-- concatenation of names
DECLARE @cat VARCHAR(max) = ''

-- current value at the lowest digit
declare @val_now int = 1;  -- 1 to N

-- digit pointer and temporary value used for rounding
declare @digit_round int;
declare @val_round int;

-- answer storage
if OBJECT_ID('tempdb..#ans') is not null
    drop table #ans;
create table #ans (
    names varchar(max)
)

/***** Main Loop *****/

declare @flag_terminate int = 0;

while (@flag_terminate = 0) begin
    while @val_now <= @M BEGIN

        /** output current combination **/

        -- select * from @arr;

        set @cat = '';
        -- use cursor to preserve name order
        declare @name_now varchar(100);    
        declare cur cursor local
        for select (select B.[name] from @arr as B where B.digit = A.val) as [name]
            from @arr as A
            where A.val > 0
            order by [name];
        open cur;

        while 1=1 begin
            fetch next from cur into @name_now;
            if @@FETCH_STATUS <> 0 break;
            set @cat = case @cat
                           when '' then ''
                           else @cat + ','
                       end + @name_now;
        end
        close cur;
        deallocate cur;

        -- Can also do this if you don't care about name order
        -- Note that 'order by' won't work on subqueries
        -- SELECT @cat = case @cat
        --                   when '' then ''
        --                   else @cat + ','
        --               end + (select B.[name] from @arr as B where B.digit = A.val)
        -- FROM @arr as A
        -- where A.val > 0;

        insert into #ans(names) values (@cat);

        /** Proceed to next combination **/

        -- 1. no round-up case
        if @val_now < @M begin
            set @val_now += 1;
            update @arr
                set val = @val_now
                where digit = 1;
        end
        -- 2. round-up case
        else begin

            -- Find which digit at where the rounding process should end
            set @digit_round = 1            
            while @digit_round <= @M begin
                -- N.B. Ascending alphabatical sorting of name combanitaions implies
                --      max(digit 2) = M - 1 
                --      max(digit 3) = M - 2
                --      ....
                --      max(digit N) = M - N + 1
                set @val_round = (select val from @arr where digit = @digit_round);
                if @val_round < @M - @digit_round + 1
                    break;
                set @digit_round += 1;
            end
            -- print '    digit_round = ' + str(@digit_round)
            -- print '    val_round = ' + str(@val_round)

            -- 2.1 regular round-up case
            --     rounding process ends within N digits
            if @digit_round <= @N begin

                while @digit_round >= 1 begin
                    -- rounded value
                    set @val_round += 1;
                    -- print 'start rounding:'
                    -- print '    digit_round = ' + str(@digit_round)
                    -- print '    val_round = ' + str(@val_round)

                    -- update temp table
                    update @arr
                        set val = @val_round
                        where digit = @digit_round;
                    -- next digit
                    set @digit_round -= 1;
                end
                -- reset loop vars
                set @val_now = @val_round;
            end
            -- 2.2 termination case
            --     rounding process does not end in N digits
            else begin
                -- print 'Termination reached!'
                set @flag_terminate = 1;
                set @val_now = @M + 1;
            end
        end
    end
end

-- output
select * from #ans;

Result

|    | names             |
|----|-------------------|
| 1  | name1             |
| 2  | name2             |
| 3  | name3             |
| 4  | name4             |
| 5  | name5             |
| 6  | name1,name2       |
| 7  | name1,name3       |
| 8  | name1,name4       |
| 9  | name1,name5       |
| 10 | name2,name3       |
| 11 | name2,name4       |
| 12 | name2,name5       |
| 13 | name3,name4       |
| 14 | name3,name5       |
| 15 | name4,name5       |
| 16 | name1,name2,name3 |
| 17 | name1,name2,name4 |
| 18 | name1,name2,name5 |
| 19 | name1,name3,name4 |
| 20 | name1,name3,name5 |
| 21 | name1,name4,name5 |
| 22 | name2,name3,name4 |
| 23 | name2,name3,name5 |
| 24 | name2,name4,name5 |
| 25 | name3,name4,name5 |

Tested on SQL Server 2017 latest (linux docker image).

Bill Huang
  • 4,491
  • 2
  • 13
  • 31
1

What can be maximum value of N & M ?

I am using RBAR and dynamic Sql and CROSS JOIN.

DECLARE @N INT = 3
DECLARE @M INT = 4 --M=4 (Record1 to Record4)

-- Purpose of #temp
-- Insert your resultset in #temp using RowNumber or Top (where RowNumber=@M or Top(@M))
-- This is easy part 

CREATE TABLE #temp (
    id INT identity(1, 1)
    ,col VARCHAR(50)
    )

INSERT INTO #temp
VALUES ('Record1')
    ,('Record2')
    ,('Record3')
    ,('Record4')





DECLARE @i INT = 1
DECLARE @Sql NVARCHAR(500) = ''
DECLARE @Col NVARCHAR(500) = ''
DECLARE @From NVARCHAR(500)
DECLARE @Where NVARCHAR(500) = ''
DECLARE @Alias NVARCHAR(5)

WHILE (@i <= @N)
BEGIN
    SET @Alias = 't' + cast(@i AS VARCHAR)

    IF (@i = 1)
    BEGIN
        SET @Col = CONCAT (
                @Alias
                ,'.' + 'col'
                )
        SET @From = '#temp ' + @Alias
        SET @Sql = 'select ' + @Col + ' from  ' + @From + ''
    END
    ELSE IF (@i = 2)
    BEGIN
        SET @Col = @Col + '+'',''' + '+' + CONCAT (
                @Alias
                ,'.' + 'col'
                ,' '
                )
        SET @From = @From + ',' + '#temp ' + @Alias
        SET @Where = @Where + 't' + cast(@i - 1 AS VARCHAR) + '.id < t' + cast(@i AS VARCHAR) + '.id'
        SET @Sql = @Sql + ' union all select ' + @Col + '' + 'from ' + @From + ' where ' + @Where
    END
    ELSE IF (@i > 2)
    BEGIN
        SET @Col = @Col + '+'',''' + '+' + CONCAT (
                @Alias
                ,'.' + 'col'
                ,' '
                )
        SET @From = @From + ',' + '#temp ' + @Alias
        SET @Where = @Where + ' and ' + 't' + cast(@i - 1 AS VARCHAR) + '.id < t' + cast(@i AS VARCHAR) + '.id'
        SET @Sql = @Sql + ' union all select ' + @Col + '' + 'from ' + @From + ' where ' + @Where
    END

    SET @i = @i + 1
END

PRINT @Col
PRINT @Sql

EXECUTE sp_executesql @Sql

DROP TABLE #temp

It is working correctly for N=1,2,3,4.

KumarHarsh
  • 5,046
  • 1
  • 18
  • 22