1

I want to generate N letter permutations of a given string using bash scripting

I have successfully written a code to find the permutation of a given word. However, I can't seem to get the permutation up to N letters. Note: I have to avoid the use of sed and awk commands. Here's what I have tried:

#!/bin/bash
x=$1
counter=0
ARRAY=()
function permutate {
    if [ "${#1}" = 1 ]; then
       echo "${2}${1}"
    ARRAY+=("${2}${1}")
    else
        for i in $(seq 0 $((${#1}-1)) ); do
            pre="${2}${1:$i:1}"
            seg1="${1:0:$i}"
            seg2="${1:$((i+1))}"
            seg="${seg1}${seg2}"
            permutate "$seg" "$pre"
        done
    fi
}
permutate $x

For example, if I have a word "JACK" and I want 3 letter permutations then it should give: JAC KJA JAK etc... But I can't seem to get to the bottom of it.

Muhammad Umer
  • 69
  • 1
  • 11
  • 1
    sed has absolutely nothing to do with a task like this but the whole thing should just be done in awk or you'll spend hours of your life waiting for the shell script to return for a reasonably long word - why are you trying to avoid awk? – Ed Morton May 05 '19 at 19:28
  • 1
    I wonder if it's one of your fellow students who asked https://stackoverflow.com/questions/55991544/how-to-write-to-array-in-bash/55993952 (about a different problem encountered while trying to attack the same problem) earlier today? – Charles Duffy May 05 '19 at 21:46

4 Answers4

3

Permutations of letters can be obtained by brace-expansions. Example:

$ echo {A,B}{A,B}
AA AB BA BB

So the idea is to exploit this brace expansion a bit. Let's say you have a string str, then you can obtain a brace expansion as:

$ str="JACK"
$ eval echo "{$(echo "$str" | fold -w1 | paste -sd,)}"
J A C K

You can see step by step what this does:

$ echo "$str" | fold -w1 | paste -sd,
J,A,C,K
  • fold -w1 places each character of $str$ on a single line
  • paste -sd, merges all lines on a single line with a comma in-between.

We need this combination as we cannot use sed. The command eval will, in the end, enforce the brace expansion.

The key is now to repeat the brace expansion n-times. For this, we make use of printf. If you have a string "foo" you can repeat it n times with printf in the following way:

$ printf "foo%.0s" {1..3}
foofoofoo

So, all permutations, with repetitions, can be found as:

$ str="JACK"
$ n=3
$ bracestring=$(printf "{$(echo "$str" | fold -w1 | paste -sd,)}%.0s" $(seq 1 $n))
$ eval echo $bracestring
JJJ JJA JJC JJK JAJ JAA JAC JAK JCJ JCA JCC JCK JKJ JKA JKC JKK AJJ AJA AJC AJK AAJ AAA AAC AAK ACJ ACA ACC ACK AKJ AKA AKC AKK CJJ CJA CJC CJK CAJ CAA CAC CAK CCJ CCA CCC CCK CKJ CKA CKC CKK KJJ KJA KJC KJK KAJ KAA KAC KAK KCJ KCA KCC KCK KKJ KKA KKC KKK
kvantour
  • 25,269
  • 4
  • 47
  • 72
2

Here's a start using an awk implementation of Heap's algorithm to generate permutations of maxLgth substrings from a list of strings:

$ cat npermutations.awk
function get_perm(A,            i, lgth, sep, str) {
    lgth = length(A)
    lgth = (lgth > maxLgth ? maxLgth : lgth)
    for (i=1; i<=lgth; i++) {
        str = str sep A[i]
        sep = " "
    }
    return str
}

function swap(A, x, y,  tmp) {
    tmp  = A[x]
    A[x] = A[y]
    A[y] = tmp
}

function generate(n, A, B,      i) {
    if (n == 1) {
        B[get_perm(A)]
    }
    else {
        for (i=1; i <= n; i++) {
            generate(n - 1, A, B)
            if ((n%2) == 0) {
                swap(A, 1, n)
            }
            else {
                swap(A, i, n)
            }
        }
    }
}

function get_perms(A,B, lgth) {
    lgth = length(A)
    maxLgth = (maxLgth ? maxLgth : lgth)
    generate(lgth, A, B)
}

###################

# Input should be a list of strings
{
    split($0,A)
    delete B
    get_perms(A,B)
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (perm in B) {
        print perm
    }
}

eg using sed to convert a word into a list of strings as the awk script expects:

$ echo jack | sed 's/./ &/g' | awk -v maxLgth=3 -f npermutations.awk
a c j
a c k
a j c
a j k
a k c
a k j
c a j
c a k
c j a
c j k
c k a
c k j
j a c
j a k
j c a
j c k
j k a
j k c
k a c
k a j
k c a
k c j
k j a
k j c

Convert to shell if that's really what you want to do but hopefully you can see the structure of how to approach the problem. The above uses GNU awk for sorted_in to sort the output but you don't need that.

Ed Morton
  • 188,023
  • 17
  • 78
  • 185
  • 1
    I am not allowed to use awk – Muhammad Umer May 05 '19 at 20:05
  • Right you said that but I'm showing you an algorithm to do it in awk (since that's one tool you REALLY could use if you had to do this in earnest) which you can then translate to shell for homework purposes. Create shell functions instead of awk functions, use `local var` to declare local variables (the ones after large spaces in the awk function params list that aren't actually provided in the calling code) and make whatever other obvious syntactic tweaks are necessary to convert the awk syntax to shell syntax. – Ed Morton May 05 '19 at 20:17
1

Print all the unique permutations by using your existing code.

for VAR in "${ARRAY[@]}"; do
    echo ${VAR:0:3}
done

JAC
JAK
JCA
JCK
JKA
JKC
AJC
AJK
ACJ
ACK
AKJ
AKC
CJA
CJK
CAJ
CAK
CKJ
CKA
KJA
KJC
KAJ
KAC
KCJ
KCA
Zack
  • 294
  • 1
  • 10
1

To limit each permutation to 3 characters we can use grep. The limitation to three characters might introduce duplicates. We use sort -u to remove these duplicates.

yourPermutationFunction JACK | grep -Eo '^.{3}' | sort -u

This approach is a bit inefficient as it generates all permutations just to throw some of them away. However, since you are using bash and a recursive function I think that you don't care much about efficiency.

By the way: The command crunch can generate permutations of a word; probably much faster than any pure bash function ever could:

crunch 0 0 -p JACK | grep -Eo '^.{3}' | uniq

Here we can use uniq instead of sort -u because crunch always generates the permutations in a sorted order.
To suppress crunch's info output add 2>&- directly before the first |.

Socowi
  • 25,550
  • 3
  • 32
  • 54