I'm trying to mask sensitive values (keys, tokens, etc.) in a command line before writing it to a log file.
For example:

MY_TOKEN='MaskThisPlease'
MY_CMD="my_command ${MY_TOKEN} SomeOtherString"
echo "${MY_CMD}" | gawk -v k="${MY_TOKEN}" -v m="XXXXXX" '{gsub(k,m);print}'

The result is:

my_command XXXXXX SomeOtherString

However, when MY_TOKEN contains certain characters, the masking does not work and/or produces an error, since gsub() treats its first argument as a regular expression. Some of the problematic characters are: $ ^ * ( ) + [ ] | \ ?
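To illustrate the failure mode on a smaller input, here is a minimal sketch. The token a.c and the helper name mask_regex are made up for the illustration; the point is only that a regex metacharacter in the token changes what gets matched:

```shell
# Hypothetical helper: mask token $2 in text $1 using gsub(), as in the question.
mask_regex() {
    printf '%s\n' "$1" | awk -v k="$2" -v m="XXXXXX" '{gsub(k,m);print}'
}

# The '.' in the token is treated as "any character", so "abc" is masked too.
mask_regex "my_command abc a.c" 'a.c'
# prints: my_command XXXXXX XXXXXX
```

Here the over-matching is harmless, but other metacharacters (an unbalanced "(" or "[", for example) make the regular expression invalid, or stop it from matching the literal token.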

The following works fine:

MY_TOKEN='MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..zzzzzzzzzzzzzzzzzzzzzzzzz'
MY_CMD="my_command ${MY_TOKEN} SomeOtherString"
echo "${MY_CMD}" | awk -v k="${MY_TOKEN}" -v m="XXXXXX" '{gsub(k,m);print}'

I tried sed, but it has a delimiter limitation: it fails when the delimiter character (here '+') appears in MY_TOKEN:

MY_TOKEN='MaskThisPlease_SomeABCs_Some123s_SomeOther+PlusSign+here'
MY_CMD="my_command ${MY_TOKEN} SomeOtherString"
echo "${MY_CMD}" | sed "s+${MY_TOKEN}+XXXXXX+g"

sed: -e expression #1, char 55: unknown option to `s'

So, my question is:

Is there another way to perform the masking that avoids the above problems and has no size limitation (MY_TOKEN may be 700 characters long)?

The following was added later in response to your comments and Answer 1:
I just joined Stack Overflow and this is my first post. I was unable to attach my test data as r.dat, so here it is inline. Each line in r.dat is one value of my token (see Answer 1 below for more details).

MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..zzzzzzzzzzzzzzzzzzzzzzzzz
AlphaNumericCharacters1234567890
''MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..zzzzzzzzzzzzzzzzzzzzzzzzz''
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^(zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?**zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*)zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*.zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?.*zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*(zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?\\zzzzzzzzzzzzzzzzzzzzzzzzz
MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?\zzzzzzzzzzzzzzzzzzzzzzzzz

Here's the 1st code snippet, which shows some limitations of both r.awk and the bash built-in replace function:

clear
# Read one by one my test cases
I=0
cat ./r.dat | while IFS= read -r MY_TOKEN
do
    I=$((I+1))

    echo
    echo "#########################################################"
    echo "${I}) MyToken is ${MY_TOKEN}"
    echo "#########################################################"

    # Build my command including the value of my token
    MY_CMD="my_command ${MY_TOKEN} SomeOtherString"

    printf "Mask with gawk function: "
      echo "${MY_CMD}" | gawk -F'\n' -v k="${MY_TOKEN}" -v m="XXXXXX" -f ./r.awk 2>/dev/null

    printf "Mask with bash built-in: "
      echo "${MY_CMD/${MY_TOKEN}/XXXXXX}"
done

Test case 19 shows that r.awk does better than the bash built-in replace function.
Test case 20 shows that both fail.
Test case 21 shows that both fail.

Here's the 2nd code snippet, which shows how to fix the issues in both r.awk and the bash built-in replace function:

clear
# Read one by one my test cases
I=0
cat ./r.dat | while IFS= read -r MY_TOKEN
do
    I=$((I+1))

    # Escape regex metacharacters (this set covers BRE; ERE-only ones such as ( ) + ? | are not escaped)
    MY_TOKEN_ESCAPED=$(printf '%s\n' "${MY_TOKEN}" | sed 's:[][\/.^$*]:\\&:g')

    echo
    echo "#########################################################"
    echo "${I}) MyToken is ${MY_TOKEN}"
    echo "${I}) Escaped is ${MY_TOKEN_ESCAPED}"
    echo "#########################################################"

    # Build my command including the value of my token
    MY_CMD="my_command ${MY_TOKEN} SomeOtherString"

    printf "Mask with gawk function: "
      echo "${MY_CMD}" | gawk -F'\n' -v k="${MY_TOKEN_ESCAPED}" -v m="XXXXXX" -f ./r.awk 2>/dev/null

    printf "Mask with bash built-in: "
      echo "${MY_CMD/${MY_TOKEN_ESCAPED}/XXXXXX}"
done

Maybe there's a better, simpler way to handle this that is 100% POSIX?

slitvinov
  • Forgot to mention, echo "${MY_CMD/${MY_TOKEN}/XXXXXX}" works fine, ShellCheck is ok with it but checkbashisms complains with (${parm/?/pat[/str]}). I'm just trying to be as portable as possible, hope you can help :) – Roberto G. Jul 30 '16 at 18:27
  • Here: http://unix.stackexchange.com/questions/129059/how-to-ensure-that-string-interpolated-into-sed-substitution-escapes-all-metac – Kusalananda Jul 30 '16 at 22:15
  • See also http://stackoverflow.com/q/29613304/1745001. Do you WANT a regexp match? If so use sed or a regexp operation in awk. If not and you actually want a string match then use awk with string functions. [edit] your question to include concise, testable sample input and expected output so we can help you. – Ed Morton Jul 31 '16 at 01:11
  • Thank you @EdMorton, I've wrapped a bunch of test cases. I used the r.awk from slitvinov and wrote two code snippets to demonstrate what's causing the issue and what can be done to fix it. See details in Answer 1. – Roberto G. Jul 31 '16 at 05:42
  • Don't just say "it fails", tell us how it fails so we don't have to try to work it out. Your `echo` itself might be interpreting characters (hint: use `printf` instead). To learn more about the problem you are having, see http://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice, http://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed/29626460#29626460, and read the books Effective Awk Programming, 4th Edition, by Arnold Robbins and Shell Scripting Recipes by Chris Johnson. – Ed Morton Jul 31 '16 at 16:08
  • Thanks @EdMorton, for test case 19 the 1st code snippet behaves differently when executed at the prompt than when the same code is in a shell script; there is no change for 20 & 21. The character combinations whose interpretation causes the issues are: *( \\ \z. I will add the new version of the code using the same test cases shortly (just saw the "Answer Your Question" button :)) – Roberto G. Jul 31 '16 at 19:49

3 Answers

You can use index() (its second argument is a string, not a regular expression) and read the key from a file. Here is an example. Create four files:

r.awk

BEGIN {
    getline k < kfile; close(kfile)
    n = length(k)
}

function process(rst,   i, pre, ans) {
    while (i=index(rst, k)) {
        pre = substr(rst, 1,  i-1)
        rst = substr(rst, i + n)
        ans = ans pre m
    }
    return ans rst
}

{
    print process($0)
}

r.sh

awk -v kfile=r.key -v m=XXX  -f r.awk r.dat

r.dat

test\t\tMaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..zzzzzzzzzzzzzzzzzzzzzzzzztest

r.key

\t\tMaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..zzzzzzzzzzzzzzzzzzzzzzzzz

Run it with sh r.sh

Expected output:

testXXXtest
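If writing the key to a file is undesirable, a variation on the same idea is to hand it to awk through the environment and read it from ENVIRON, which awk does not run through escape-sequence processing. This is only a sketch: the function name mask_env and the variable name MASK_KEY are hypothetical, and the loop is the same index()-based replacement as in r.awk.

```shell
# Hypothetical helper: mask every literal occurrence of token $2 in text $1.
mask_env() {
    printf '%s\n' "$1" | MASK_KEY="$2" awk -v m="XXXXXX" '
        BEGIN { k = ENVIRON["MASK_KEY"]; n = length(k) }
        {
            out = ""
            # n guards against an empty key; index() does a plain string search
            while (n && (i = index($0, k))) {
                out = out substr($0, 1, i - 1) m
                $0 = substr($0, i + n)
            }
            print out $0
        }'
}

mask_env 'my_command a.c$^\\z abc' 'a.c$^\\z'
# prints: my_command XXXXXX abc
```

Note that the environment of a running process can be visible to other processes of the same user, so whether this is acceptable depends on your threat model.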
slitvinov

It looks like this is what you're trying to do:

$ cat tst.sh
while IFS= read -r my_token
do
    old_my_cmd="my_command $my_token SomeOtherString"

    new_my_cmd=$(
        printf '%s\n' "$old_my_cmd" |
        awk -v m='XXXXXX' '
            BEGIN {
                my_token=ARGV[1]; ARGV[1]=""; ARGC--
                lgth = length(my_token)
            }
            {
                while ( start = index($0,my_token) ) {
                    printf "%s%s", substr($0,1,start-1), m
                    $0 = substr($0,start+lgth)
                }
                print
            }
        ' "$my_token"
    )

    printf 'old_my_cmd="%s"\n' "$old_my_cmd"
    printf 'new_my_cmd="%s"\n' "$new_my_cmd"
    printf "\n"

done < r.dat

$ ./tst.sh
old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command AlphaNumericCharacters1234567890 SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command ''MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..zzzzzzzzzzzzzzzzzzzzzzzzz'' SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^(zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?**zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*)zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*.zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?.*zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*(zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?\\zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"

old_my_cmd="my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?\zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString"
new_my_cmd="my_command XXXXXX SomeOtherString"
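The script above takes the token from ARGV rather than from -v because awk processes escape sequences in -v assignments, which alters any token containing a backslash before index() ever sees it. A small sketch of the difference, using a made-up token and two hypothetical helper names:

```shell
# Hypothetical helpers that just echo back what awk received.
show_v()    { awk -v k="$1" 'BEGIN { print k }'; }
show_argv() { awk 'BEGIN { print ARGV[1] }' "$1"; }

tok='a\\b'          # the token contains two literal backslashes

show_v "$tok"       # prints: a\b   (the -v assignment collapsed the escape)
show_argv "$tok"    # prints: a\\b  (ARGV is passed through untouched)
```

That is consistent with test cases 20 and 21, the ones containing backslashes, being where the earlier -v based attempts failed.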
Ed Morton

Here's the 3rd code snippet, which combines @EdMorton's approach with my original code:

# Read one by one my test cases
I=0
cat ./r.dat | while IFS= read -r MY_TOKEN
do
    I=$((I+1))

    echo
    echo "#########################################################"
    printf '%s) MyToken is %s\n' "${I}" "${MY_TOKEN}"
    echo "#########################################################"

    # Build my command including the value of my token
    MY_CMD="my_command ${MY_TOKEN} SomeOtherString"

    MY_CMD_MASKED=$(
        printf '%s\n' "${MY_CMD}" |
        awk -v m='XXXXXX' '
            BEGIN {
                my_token=ARGV[1]; ARGV[1]=""; ARGC--
                lgth = length(my_token)
            }
            {
                while ( start = index($0,my_token) ) {
                    printf "%s%s", substr($0,1,start-1), m
                    $0 = substr($0,start+lgth)
                }
                print
            }
        ' "${MY_TOKEN}"
    )

    printf 'Mask with gawk function: %s\n' "${MY_CMD_MASKED}"
    printf 'Mask with bash built-in: %s\n' "${MY_CMD/${MY_TOKEN}/XXXXXX}"
done

Here are the results of test cases 19, 20 and 21 we discussed before:

#########################################################
19) MyToken is MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?*(zzzzzzzzzzzzzzzzzzzzzzzzz
#########################################################
Mask with gawk function: my_command XXXXXX SomeOtherString
Mask with bash built-in: my_command XXXXXX SomeOtherString

#########################################################
20) MyToken is MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?\\zzzzzzzzzzzzzzzzzzzzzzzzz
#########################################################
Mask with gawk function: my_command XXXXXX SomeOtherString
Mask with bash built-in: my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?\\zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString

#########################################################
21) MyToken is MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?\zzzzzzzzzzzzzzzzzzzzzzzzz
#########################################################
Mask with gawk function: my_command XXXXXX SomeOtherString
Mask with bash built-in: my_command MaskThisPlease_SomeABCs_Some123s_SomeOther//!!@@##%%_--==``~~{{}}::;;""<<>>,,..$^()+[]|?\zzzzzzzzzzzzzzzzzzzzzzzzz SomeOtherString

After removing the line below from the above script, it passes both "checkbashisms -fx" and http://www.shellcheck.net/:

printf 'Mask with bash built-in: %s\n' "${MY_CMD/${MY_TOKEN}/XXXXXX}"

The above line was there only to demonstrate the differences between the two approaches.
Thank you all for the quick feedback and for helping me improve my communication in this post.
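For completeness, the bash built-in line can probably be approximated without bashisms by using only the POSIX prefix/suffix removal expansions with the token quoted, so that it is matched literally rather than as a pattern. The following is a sketch under that assumption, not something I have verified across every shell; the function name mask_posix is hypothetical, and since POSIX sh has no local variables, rest and out are globals.

```shell
# Hypothetical helper: replace every literal occurrence of token $2 in $1 with mask $3.
mask_posix() {
    rest=$1 out=
    # An empty token would never be consumed, so just pass the text through.
    [ -n "$2" ] || { printf '%s\n' "$1"; return; }
    while :; do
        case $rest in
            *"$2"*)
                out=$out${rest%%"$2"*}$3   # text before the first occurrence, then the mask
                rest=${rest#*"$2"}         # text after the first occurrence
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$out$rest"
}

mask_posix 'my_command $^()+[]|?*( SomeOtherString' '$^()+[]|?*(' 'XXXXXX'
# prints: my_command XXXXXX SomeOtherString
```

Because the token is quoted inside the expansions, backslashes should be treated literally as well, though that is worth checking in the shells you target before relying on it.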