This seems to be a question that comes up over and over and a few bits of code are
kicking about that address the problem. A very nice algorithm in some code has been
written but it wasn't strictly clean C and not portable across UNIX or Linux or any
POSIX system, therefore I cleaned it up and added warning messages, usage and the
ability to provide a set size and sub_set size on the command line. Also comb[] has
been transitioned to a more general pointer to an array of integers and calloc used
to zero out the memory needed for whatever set size one may want.
The following is ISO IEC 9899:1999 C clean :
/*********************************************************************
* The Open Group Base Specifications Issue 6
* IEEE Std 1003.1, 2004 Edition
*
* An XSI-conforming application should ensure that the feature
* test macro _XOPEN_SOURCE is defined with the value 600 before
* inclusion of any header. This is needed to enable the
* functionality described in The _POSIX_C_SOURCE Feature Test
* Macro and in addition to enable the XSI extension.
*
* Compile with c99 or with gcc and CFLAGS to include options
* -std=iso9899:199409 -pedantic-errors in order to ensure compliance
* with ISO IEC 9899:1999 C spec.
*
* Code cleanup and transition to comb as a pointer to type ( int * )
* array by Dennis Clarke dclarke@blastwave.org 28 Dec 2012
*
*********************************************************************/
#define _XOPEN_SOURCE 600
#include <stdio.h>
#include <stdlib.h>
/* Prints out a combination like {1, 2} */
void printc( int *comb, int k) {
int j;
printf("{ ");
for ( j = 0; j < k; ++j )
printf("%d , ", *( comb + j ) + 1 );
printf( "\b\b}\n" );
} /* printc */
/**********************************************************************
next_comb(int comb[], int k, int n)
Generates the next combination of n elements as k after comb
comb => the previous combination ( use (0, 1, 2, ..., k) for first)
k => the size of the subsets to generate
n => the size of the original set
Returns: 1 if a valid combination was found
0, otherwise
**********************************************************************/
int next_comb( int *comb, int k, int n) {
int i = k - 1;
++*( comb + i );
while ( ( i >= 0 ) && ( *( comb + i ) >= n - k + 1 + i ) ) {
--i;
++*( comb + i );
}
if ( *comb > n - k) /* Combination (n-k, n-k+1, ..., n) reached */
return 0; /* No more combinations can be generated */
/* comb now looks like (..., x, n, n, n, ..., n).
* Turn it into (..., x, x + 1, x + 2, ...) */
for (i = i + 1; i < k; ++i)
*( comb + i ) = *( comb + ( i - 1 ) ) + 1;
return 1;
} /* next_comb */
int main(int argc, char *argv[]) {
int *comb, i, n, k;
n = 9; /* The size of the set; for {1, 2, 3, 4} it's 4 */
k = 6; /* The size of the subsets; for {1, 2}, {1, 3}, .. it's 2 */
if ( argc < 3 ) {
printf ( "\nUSAGE : %s n k\n", argv[0] );
printf ( " : Where n is the set size and k the sub set size.\n" );
printf ( " : Note that k <= n\n" );
return ( EXIT_FAILURE );
}
n = atoi ( argv[1] );
k = atoi ( argv[2] );
if ( k > n ) {
printf ( "\nWARN : k > n is not allowed.\n" );
printf ( "USAGE : %s n k\n", argv[0] );
printf ( " : Where n is the set size and k the sub set size.\n" );
printf ( " : Note that k <= n\n" );
return ( EXIT_FAILURE );
}
comb = ( int * ) calloc( (size_t) k, sizeof(int) );
for ( i = 0; i < k; ++i)
*( comb + i ) = i;
/* Print the first combination */
printc( comb, k );
/* Generate and print all the other combinations */
while ( next_comb( comb, k, n ) )
printc( comb, k );
free ( comb );
return ( EXIT_SUCCESS );
}
One may compile the above on an Opteron based machine thus :
$ echo $CFLAGS
-m64 -g -malign-double -std=iso9899:199409 -pedantic-errors -mno-mmx
-mno-sse -fexceptions -fpic -fvisibility=default -mtune=opteron
-march=opteron -m128bit-long-double -mpc80 -Wl,-q
$ gcc $CFLAGS -o combinations combinations.c
A quick trivial test with a set size of 10 and a sub-set of 6 will be thus :
$ ./combinations 10 6 | wc -l
210
The math is correct :
( 10 ! ) / ( ( 10 - 6 )! * ( 6! ) ) = 210 unique combinations.
Now that the integer array comb is based on a pointer system we are only restricted
by available memory and time. Therefore we have the following :
$ /usr/bin/time -p ./combinations 20 6 | wc -l
real 0.11
user 0.10
sys 0.00
38760
This looks correct :
( 20 ! ) / ( ( 20 - 6 )! * ( 6! ) ) = 38,760 unique combinations
We may now push the limits a bit thus :
$ ./combinations 30 24 | wc -l
593775
Again the math agrees with the result :
( 30 ! ) / ( ( 30 - 24 )! * ( 24! ) ) = 593 775 unique combinations
Feel free to push the limits of your system :
$ /usr/bin/time -p ./combinations 30 22 | wc -l
real 18.62
user 17.76
sys 0.83
5852925
I have yet to try anything larger but the math looks correct as well as the output
thus far. Feel free to let me know if some correction is needed.
Dennis Clarke
dclarke@blastwave.org
28 Dec 2012