Here I implemented a traversal algorithm through the combination of names in an alphabetically sorted order up to N names. The algorithm does not traverse through all 2^N-1
combinations as in the base code mentioned.
The algorithm is explained as follows. Denote the i-th
name is represented by a digit i
(starting from 1), then the combination of names can be represented by a sequence of strictly increasing digits. The strictly increasing constraint eliminates different permutations from the same combination.
Now we can traverse through these representations in an alphabetically increasing order as shown below (like in a real dictionary). One can clearly see, there are 2 keys to the traversal: (1) when to do a round-up and (2) doing it subject to the strictly-increasing constraint. The rest is to implement them correctly in the code.
Disclaimer: Low efficiency, maintainability and scalability! It barely runs on small dataset (38 sec for 1350 combinations from M=20
and N=3
). Consider doing such task in a general purpose programming language especially when the dataset is large. e.g. A pythonic solution with only 4 lines
A concept of what the algorithms does
Combination of names
---------------------
1 -- 1 means 'name1'
2
...
n -- Round up to 2-digit combinations after this number
1,2 -- The lowest digit begins at 2 because (1,1) violates the constraint
1,3
...
1,n
2,3 -- begins at 3 because 2,1 and 2,2 violates the constraint
2,4
....
2,n
.....
(n-1),n -- Can't round up to (n,n) so here ends the combinations of 2 digits.
1,2,3 -- In general, round-up gives a consecutive increasing sequence
-- starting from the rounding digit
1,2,4
.... -- repeat until all wanted combinations are exhausted.
And here is the T-SQL code
Test Dataset and Parameters
use [testdb];
if OBJECT_ID('testdb..test') is not null
drop table testdb..test;
create table test (
[name] VARCHAR(10)
);
GO
-- name list, need not be sorted
insert into test([name])
values ('name1'),('name2'),('name3'),('name4'),('name5');
-- Number of names used in the combination
declare @N int = 3;
Program
/***** temp variables *****/
-- get number of names
declare @M int;
select @M = count(distinct [name]) from test;
--print @M;
-- array (table variable) of temporary status
declare @arr table (
digit int primary key,
val int,
[name] varchar(max)
)
-- the value of each digits begins from 0
insert @arr(digit, val, [name])
select ROW_NUMBER() over(order by [name]), 0, [name] from test;
-- begin traversal from the first name
update @arr
set val = 1
where digit = 1;
-- check:
-- select * from @arr;
-- concatenation of names
DECLARE @cat VARCHAR(max) = ''
-- current value at the lowest digit
declare @val_now int = 1; -- 1 to N
-- digit pointer and temporary value used for rounding
declare @digit_round int;
declare @val_round int;
-- answer storage
if OBJECT_ID('tempdb..#ans') is not null
drop table #ans;
create table #ans (
names varchar(max)
)
/***** Main Loop *****/
declare @flag_terminate int = 0;
while (@flag_terminate = 0) begin
while @val_now <= @M BEGIN
/** output current combination **/
-- select * from @arr;
set @cat = '';
-- use cursor to preserve name order
declare @name_now varchar(100);
declare cur cursor local
for select (select B.[name] from @arr as B where B.digit = A.val) as [name]
from @arr as A
where A.val > 0
order by [name];
open cur;
while 1=1 begin
fetch next from cur into @name_now;
if @@FETCH_STATUS <> 0 break;
set @cat = case @cat
when '' then ''
else @cat + ','
end + @name_now;
end
close cur;
deallocate cur;
-- Can also do this if you don't care about name order
-- Note that 'order by' won't work on subqueries
-- SELECT @cat = case @cat
-- when '' then ''
-- else @cat + ','
-- end + (select B.[name] from @arr as B where B.digit = A.val)
-- FROM @arr as A
-- where A.val > 0;
insert into #ans(names) values (@cat);
/** Proceed to next combination **/
-- 1. no round-up case
if @val_now < @M begin
set @val_now += 1;
update @arr
set val = @val_now
where digit = 1;
end
-- 2. round-up case
else begin
-- Find which digit at where the rounding process should end
set @digit_round = 1
while @digit_round <= @M begin
-- N.B. Ascending alphabatical sorting of name combanitaions implies
-- max(digit 2) = M - 1
-- max(digit 3) = M - 2
-- ....
-- max(digit N) = M - N + 1
set @val_round = (select val from @arr where digit = @digit_round);
if @val_round < @M - @digit_round + 1
break;
set @digit_round += 1;
end
-- print ' digit_round = ' + str(@digit_round)
-- print ' val_round = ' + str(@val_round)
-- 2.1 regular round-up case
-- rounding process ends within N digits
if @digit_round <= @N begin
while @digit_round >= 1 begin
-- rounded value
set @val_round += 1;
-- print 'start rounding:'
-- print ' digit_round = ' + str(@digit_round)
-- print ' val_round = ' + str(@val_round)
-- update temp table
update @arr
set val = @val_round
where digit = @digit_round;
-- next digit
set @digit_round -= 1;
end
-- reset loop vars
set @val_now = @val_round;
end
-- 2.2 termination case
-- rounding process does not end in N digits
else begin
-- print 'Termination reached!'
set @flag_terminate = 1;
set @val_now = @M + 1;
end
end
end
end
-- output
select * from #ans;
Result
| | names |
|----|-------------------|
| 1 | name1 |
| 2 | name2 |
| 3 | name3 |
| 4 | name4 |
| 5 | name5 |
| 6 | name1,name2 |
| 7 | name1,name3 |
| 8 | name1,name4 |
| 9 | name1,name5 |
| 10 | name2,name3 |
| 11 | name2,name4 |
| 12 | name2,name5 |
| 13 | name3,name4 |
| 14 | name3,name5 |
| 15 | name4,name5 |
| 16 | name1,name2,name3 |
| 17 | name1,name2,name4 |
| 18 | name1,name2,name5 |
| 19 | name1,name3,name4 |
| 20 | name1,name3,name5 |
| 21 | name1,name4,name5 |
| 22 | name2,name3,name4 |
| 23 | name2,name3,name5 |
| 24 | name2,name4,name5 |
| 25 | name3,name4,name5 |
Tested on SQL Server 2017 latest (linux docker image).