4

I have a string 27AAGCB5913L2ZF. If any of A or J or K appear in the string then it I need to change them to all possible combinations of the three letters. If I pass the above string input to the program then the output should be like this

27AAGCB5913L2ZF
27AJGCB5913L2ZF
27AKGCB5913L2ZF
27JAGCB5913L2ZF
27KAGCB5913L2ZF
27KJGCB5913L2ZF
27JKGCB5913L2ZF
27JJGCB5913L2ZF
27KKGCB5913L2ZF

The letters may be present anywhere in string. If just a single letter is present then it must be replaced by A, J and K in turn. For example the output for the string 27ABGCB5913L2ZF should be as follows

27ABGCB5913L2ZF
27JBGCB5913L2ZF
27kBGCB5913L2ZF

I can search for a given character with code like this

while ( $string =~ /(B)/g ) {
    say $1, ' at ', pos $string;
}

How can I produce all possible strings if there could be any number of A, J, or K in any positions?

Borodin
  • 126,100
  • 9
  • 70
  • 144
Amol
  • 143
  • 9
  • If single letter K present then it must have to replace A,J & K on their own position. For example 27KBGCB5913L2ZF then Output should be as .. 27ABGCB5913L2ZF 27JBGCB5913L2ZF 27kBGCB5913L2ZF – Amol Sep 18 '18 at 08:44
  • Identify the lists of possibilities for each position, then create a Cartesian product of those lists. Anyway, questions on Stack Overflow should not tag multiple languages unless it is *necessary* to write code in *both* languages to answer the question. Either ask about one language at a time, or ask in a language-agnostic way about an algorithm. – Karl Knechtel Mar 01 '23 at 22:29
  • I found a more precise duplicate right after reopening: https://stackoverflow.com/questions/14841652 – Karl Knechtel Mar 01 '23 at 23:02

4 Answers4

4

The collection of combinations that you want of your key letters A, J, and K is called the Cartesian product. In Python, you can use itertools.product to generate them.

Firstly, we need to find the positions of all of the key letters in the input string. The easy way to do that uses the built-in enumerate function. Once I know those positions, and how many key letters the string contains, we can generate each item of the Cartesian product, replace the key letters, and print the new string.

In Python, strings are immutable (they can't be changed), so I convert the string to a list of characters, replace the characters in the key positions, and then build a new string from the list using the str.join method.

The following code will work with both versions 2 and 3 of Python

Python

from itertools import product

def make_patterns(s):

    keyletters = 'AJK'

    # Convert input string into a list so we can easily substitute letters
    seq = list(s)

    # Find indices of key letters in seq
    indices = [ i for i, c in enumerate(seq) if c in keyletters ]

    # Generate key letter combinations & place them into the list
    for t in product(keyletters, repeat=len(indices)):
        for i, c in zip(indices, t):
            seq[i] = c
        print(''.join(seq))

# Test

data = (
    '1ABC2',
    '27AAGCB5913L2ZF',
    '3A4J',
    '5K67KA',
)

for s in data:
    print('\nInput:', s)
    make_patterns(s)

output

Input: 1ABC2
1ABC2
1JBC2
1KBC2

Input: 27AAGCB5913L2ZF
27AAGCB5913L2ZF
27AJGCB5913L2ZF
27AKGCB5913L2ZF
27JAGCB5913L2ZF
27JJGCB5913L2ZF
27JKGCB5913L2ZF
27KAGCB5913L2ZF
27KJGCB5913L2ZF
27KKGCB5913L2ZF

Input: 3A4J
3A4A
3A4J
3A4K
3J4A
3J4J
3J4K
3K4A
3K4J
3K4K

Input: 5K67KA
5A67AA
5A67AJ
5A67AK
5A67JA
5A67JJ
5A67JK
5A67KA
5A67KJ
5A67KK
5J67AA
5J67AJ
5J67AK
5J67JA
5J67JJ
5J67JK
5J67KA
5J67KJ
5J67KK
5K67AA
5K67AJ
5K67AK
5K67JA
5K67JJ
5K67JK
5K67KA
5K67KJ
5K67KK

With a minor change, we can turn our function into a generator. That lets you loop over the output strings easily, or turn them into a list if you want.

Python

from itertools import product

def make_patterns(s):

    keyletters = 'AJK'

    # Convert input string into a list so we can easily substitute letters
    seq = list(s)

    # Find indices of key letters in seq
    indices = [i for i, c in enumerate(seq) if c in keyletters]

    # Generate key letter combinations & place them into the list
    for t in product(keyletters, repeat=len(indices)):
        for i, c in zip(indices, t):
            seq[i] = c
        yield ''.join(seq)

# Test

print(list(make_patterns('A12K')))

for s in make_patterns('3KJ4'):
    print(s)

output

['A12A', 'A12J', 'A12K', 'J12A', 'J12J', 'J12K', 'K12A', 'K12J', 'K12K']
3AA4
3AJ4
3AK4
3JA4
3JJ4
3JK4
3KA4
3KJ4
3KK4
Borodin
  • 126,100
  • 9
  • 70
  • 144
PM 2Ring
  • 54,345
  • 6
  • 82
  • 182
  • Forgive my prejudice, but ***"With a minor change, we can turn our function into a generator"*** (as well as the education about Cartesian products) smacks of someone fresh from college and desperate for points. There is no call for an iterator here: it will always be called to exhaustion and may as well be delivered as a simple list. Let's not turn acrobatic tricks to solve ordinary problems. – Borodin Sep 18 '18 at 18:15
  • 3
    @Borodin In Python 3, many functions that previously returned lists in early versions now return iterators. And for this task, I think an iterator is appropriate, since it doesn't take many key letters in the input string for the output to become quite large. FWIW, I'm not desperate for points, and I left school decades ago. I was just trying to help the OP. – PM 2Ring Sep 18 '18 at 18:58
  • "Special cases aren't special enough" - rather than figure out which indices need to be have multiple candidates tried, simply generate the candidates for every position (including if there is only one such candidate for the position). This may or may not be slower, but it is much simpler. – Karl Knechtel Mar 01 '23 at 22:31
3

This can be done in Perl by using the glob operator. glob is intended for finding matching files, but if no generic wildcard characters (*, ? or [...]) are included in the pattern then it will simply return all possible matches, whether they exist as files or not

This Perl code uses a substitution to form a glob pattern by replacing all occurrences of A, J, or K with the multiple pattern {A,J,K}. Submitting the result to glob gives us the required output

use strict;
use warnings 'all';
use feature 'say';

for my $s ( qw/ 27AAGCB5913L2ZF 27KBGCB5913L2ZF / ) {

    (my $patt = $s) =~ s/[AJK]/{A,J,K}/g;

    say for glob $patt;
    say "";
}

output

27AAGCB5913L2ZF
27AJGCB5913L2ZF
27AKGCB5913L2ZF
27JAGCB5913L2ZF
27JJGCB5913L2ZF
27JKGCB5913L2ZF
27KAGCB5913L2ZF
27KJGCB5913L2ZF
27KKGCB5913L2ZF

27ABGCB5913L2ZF
27JBGCB5913L2ZF
27KBGCB5913L2ZF
Borodin
  • 126,100
  • 9
  • 70
  • 144
0

Here is a recursive solution in Perl:

my $str = '27AAGCB5913L2ZF';
my @replace = qw (A J K);
print_string( $str);

sub print_string {
    my ( $str, $replace, $start) = @_;

    if (defined $replace) {
        substr( $str, $start, 1) = $replace;
    }
    else {
        $start = -1;
    }
    pos( $str) = $start +1;
    if ($str =~ /\G.*?(A|J|K)/g) {
        my $cur_start = $-[-1];
        print_string( $str, $_, $cur_start) for @replace;
    }
    else {
        say $str;
    }
}

Output:

27AAGCB5913L2ZF
27AJGCB5913L2ZF
27AKGCB5913L2ZF
27JAGCB5913L2ZF
27JJGCB5913L2ZF
27JKGCB5913L2ZF
27KAGCB5913L2ZF
27KJGCB5913L2ZF
27KKGCB5913L2ZF
Håkon Hægland
  • 39,012
  • 21
  • 81
  • 174
0

A short solution:

Start by splitting the string into tuples of characters that may appear at each position

>>> s = "27AAGCB5913L2ZF"
>>> p2 = [("A","J","K") if c in "AJK" else (c,) for c in s]
>>> p2
[('2',), ('7',), ('A', 'J', 'K'), ('A', 'J', 'K'), ('G',), ('C',), ('B',), ('5',), ('9',), ('1',), ('3',), ('L',), ('2',), ('Z',), ('F',)]

Then this function will assemble the list of tuples back into a string:

def assemble(t, s=''):
    if t:
        for c in t[0]:
            assemble(t[1:], s+c)
    else:
        print(s)

>>> assemble(p2)
27AAGCB5913L2ZF
27AJGCB5913L2ZF
27AKGCB5913L2ZF
27JAGCB5913L2ZF
27JJGCB5913L2ZF
27JKGCB5913L2ZF
27KAGCB5913L2ZF
27KJGCB5913L2ZF
27KKGCB5913L2ZF
BoarGules
  • 16,440
  • 2
  • 27
  • 44