It is possible to extract a payload embedded in a shell script file with the following technique (see this):

#!/bin/sh
tail -n +4 "$0" > package.tgz
exec tar zxvf package.tgz
# payload comes here...

This needs a file so tail can seek the file to the right place.
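
A minimal sketch of the whole round trip, assuming a POSIX sh and a tar with gzip support. The file names (hello.txt, payload.tgz, selfextract.sh) are made up for illustration. It writes a three-line header, appends the archive, and runs the result from a file, which is the case where the technique works:

```shell
#!/bin/sh
# Illustrative sketch: build and run a self-extracting script from a file.
work=$(mktemp -d)

# 1. Create a small payload archive (hypothetical file names).
echo "hello from payload" > "$work/hello.txt"
tar czf "$work/payload.tgz" -C "$work" hello.txt
rm "$work/hello.txt"

# 2. Three header lines, so the payload starts on line 4.
cat > "$work/selfextract.sh" <<'EOF'
#!/bin/sh
tail -n +4 "$0" > package.tgz
exec tar zxf package.tgz
EOF
cat "$work/payload.tgz" >> "$work/selfextract.sh"

# 3. Run it as a file, so "$0" names something tail can open.
(cd "$work" && sh selfextract.sh)
cat "$work/hello.txt"
```

The exec matters here: it replaces the shell with tar before the shell ever tries to parse the binary lines that follow the header.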

In my particular situation, to automate things further, I'm using the | sh - pattern, but it breaks payload extraction, because pipes are not seekable.

I also tried to embed binary payload into a heredoc so I could make something like:

cat >package.tgz <<END
# payload comes here
END
tar zxvf package.tgz

But it makes shells (both bash and NetBSD's /bin/sh) confused and it just doesn't work.

I could use uuencode or base64 within the heredoc, but I wanted to know if there is some shell wizardry that could be used to receive both the script and binary data from stdin and extract the binary data out of the data received from stdin.

Edit:

When I say the shell gets confused, I mean it can silently drop null bytes or have undefined behaviour, even within the heredoc. Try:

cat > /tmp/out <<EOF
$(echo 410041 | xxd -p -r)
EOF
xxd -p /tmp/out

Bash complains: line 2: warning: command substitution: ignored null byte in input.
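
The effect can also be seen without xxd. A small sketch, assuming bash (other shells may truncate at the null byte rather than drop it); the file names are illustrative:

```shell
#!/bin/bash
work=$(mktemp -d)

# Written directly by printf: the shell never holds the data in a string.
printf 'A\000A' > "$work/direct.bin"

# Routed through a command substitution: the data becomes a shell string
# first, so the null byte cannot survive (bash also prints a warning).
via_subst=$(printf 'A\000A')
printf '%s' "$via_subst" > "$work/subst.bin"

wc -c < "$work/direct.bin"   # 3 bytes
wc -c < "$work/subst.bin"    # 2 bytes in bash: the null byte is gone
```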

If I literally embed the hex bytes 410041 into the shell script and use a quoted heredoc, the result is different, but bash still drops the null byte:

echo '#!/bin/sh' > foo.sh
echo "cat > /tmp/out <<'EOF'" >> foo.sh
echo 410041 | xxd -p -r >> foo.sh
echo >> foo.sh
echo EOF >> foo.sh
echo 'xxd -p /tmp/out' >> foo.sh
bash foo.sh
41410a
Silas
  • `exec tar zxvf` why exec? `But it makes shells (both bash and NetBSD's /bin/sh) confused` What? What does it mean when the shell is "confused"? I would do `tar -xzvf - <<'EOF'` with `EOF` being a random uuid. `if there is som...data received from stdin.` I do not understand - isn't this an XY problem? What is it really that you are trying to solve? You did _not_ post an example where you read script and data from stdin - both code snippets ignore stdin completely and the "binary data" is contained within the file. – KamilCuk Jul 26 '20 at 15:34
  • I edited the question and added a clarification note at the end. – Silas Jul 26 '20 at 15:46
  • Yes, that's why you quote the here document delimiter. Do `<<'EOF'` or `<<"EOF"` if you do not want the here document contents to be expanded. `not four.` I assume you want it to output `$(echo 0a0a0a0a | xxd -p -r)` literally. I think your question is too broad: you ask about "some shell wizardry", which is too broad for a stackoverflow question. – KamilCuk Jul 26 '20 at 15:53
  • I tried it with heredoc delimited with quotes as well (I added four literal 0a -- in vim they show as ^@) and the result is the same: the output is only 0a. File `/tmp/out` size is only one byte. – Silas Jul 26 '20 at 15:58
  • [Can't reproduce](https://repl.it/@kamilcukrowski/BurlyEvenSemicolon#main.sh) – KamilCuk Jul 26 '20 at 16:00
  • I thought it could be an XY problem. Anyway, it is still an interesting question. My real problem is: I'm dynamically generating a shell script for remote configuration management, so I do `createsh | ssh host`. I'm embedding binary data into the shell script so it can be extracted on the remote host. I can `createsh > x.sh; scp x.sh host:/tmp/x.sh; ssh host 'sh /tmp/x.sh'` and it works, but I wanted to know if I can use the `createsh | ssh host` pattern in this situation. – Silas Jul 26 '20 at 16:03
  • Updated the example again. Indeed, it seems impossible to embed null bytes even in a quoted heredoc. – Silas Jul 26 '20 at 16:20
  • Easy to do if you use perl instead of sh, fwiw. – Shawn Jul 26 '20 at 21:05

2 Answers

bash (and other shells) tend to "think" in C-strings, which are null-terminated, and hence cannot contain nulls (that's what indicates the end of the string). To produce nulls, you pretty much have to run some program/command that takes some safely-encoded content and produces nulls, and have its output sent directly to a file or pipe without the shell looking at it in between.

The simplest way to do this is to encode the file with something like base64 and pipe the decoder's output straight into tar. Note that -D is the decode flag of the macOS base64; GNU coreutils spells it -d. Something like this:

base64 -D <<'EOF' | tar xzv
H4sIAOzIHV8AA+y9DVxVVbowvs/hgAc8sY+Jhvl1VCoJBVQsETVgOIgViin2pSkq
....
EOF
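
To check the idea end to end without a remote host, here is a sketch in which a hypothetical generator function, createsh (the name is borrowed from the question's comments), emits a script with a base64 heredoc and the result is piped into sh. It assumes a base64 that accepts -d, as GNU coreutils does (substitute -D where only that exists), and it uses made-up paths under a temporary directory:

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
printf 'A\000A' > "$work/src/data.bin"    # payload containing a null byte

# Hypothetical generator: prints a script whose payload is base64 text.
createsh() {
    echo "cd '$work/dest' || exit 1"
    echo "base64 -d <<'EOF' | tar xzf -"
    tar czf - -C "$work/src" data.bin | base64
    echo "EOF"
}

# The script arrives on stdin; nothing needs to be seekable.
createsh | sh -

cmp "$work/src/data.bin" "$work/dest/data.bin" && echo "payload intact"
```

Because the heredoc body is plain ASCII, the shell can read it from a pipe like any other script text, and the null bytes only reappear on the output side of base64.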

If you don't want to use base64, another option would be to use bash's printf builtin to print null-containing or otherwise weird output to a pipe. It might look something like this:

LC_ALL=C
printf '\037\213\010\000\354\310\035_\000\003\354\275\015\\UU....' | tar xzv

In the above example, I converted everything that wasn't printable ASCII to \octal codes. It should actually be OK to include almost everything as literal characters, except null, single-quote (cannot be included in a single-quoted string, probably simplest to octal-encode), backslash (just double it), and percent-sign (also double it). I don't think it'll be a problem, but it might be safest to set LC_ALL=C first, so the shell doesn't choke on byte sequences that aren't valid UTF-8 in input strings.
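
As an aside, the same encoding can be sketched in shell with od, at the cost of octal-escaping every byte rather than only the awkward ones. The function name encode_octal and the file names are illustrative, and the od output is assumed to look like that of GNU od (space-separated three-digit octal values):

```shell
#!/bin/sh
# Print FILE as a printf format string made only of \ooo escapes.
encode_octal() {
    od -An -v -to1 "$1" | tr -d '\n' | sed 's/  */\\/g'
}

work=$(mktemp -d)
printf 'A\000%%\\B' > "$work/in.bin"      # A, NUL, %, backslash, B

fmt=$(encode_octal "$work/in.bin")        # plain ASCII, safe in a variable
printf '%s\n' "$fmt"

# The escapes are expanded by printf itself and go straight to the file.
printf "$fmt" > "$work/out.bin"
cmp "$work/in.bin" "$work/out.bin" && echo "round trip ok"
```

Escaping every byte roughly quadruples the size, so for anything large base64 is likely the more economical choice.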

Here's a quick & dirty C program to do the encoding. Note that it sends output to stdout, and it may contain junk that'll mess up your Terminal; so be sure to direct output somewhere.

#include <stdio.h>
#include <stdlib.h>

int main( int argc, char *argv[] )  {
    int ch;
    FILE *fp;

    if ( argc != 2 ) {
        fprintf(stderr, "Usage: %s infile\n", argv[0]);
        return 1;
    }

    /* "rb": read the archive as binary so no byte is translated */
    fp = fopen(argv[1], "rb");
    if (fp == NULL) {
        fprintf(stderr, "Error opening %s\n", argv[1]);
        return 1;
    }

    printf("#!/bin/bash\nLC_ALL=C\nprintf '");

    while((ch = fgetc(fp)) != EOF) {
        switch(ch) {
            case '\000':
                printf("\\000");
                break;
            case '\047':
                printf("\\047");
                break;
            case '%':
            case '\\':
                printf("%c%c", ch, ch);
                break;
            default:
                printf("%c", ch);
        }
    }
    fclose(fp);

    printf("' | tar xzv\n");
    return 0;
}
Gordon Davisson

if there is some shell wizardry that could be used to receive both the script and binary data from stdin and extract the binary data out of the data received from stdin.

Given a script like this:

cat <<'EOF' >script.sh
#!/bin/sh
hostname
echo "What is you age?"
if ! IFS= read -r ans; then
     echo "Read failed!"
else 
     echo "You are $ans years old."
fi
xxd -p
EOF

You can pipe it to a remote shell over ssh by wrapping it in a process substitution fed by a here document, followed by any input data you want:

{
   echo 123
   echo "This is the input"
   echo 001122 | xxd -r -p
} | {
    u=$(uuidgen)
    # Remote shell is started with a process substitution
    # terminated with a unique mark
    echo "bash <(cat <<'$u'"
    cat script.sh
    # Note - script.sh may not read all of its input,
    # and the leftover would then be executed as commands.
    # Drain it here so that nothing leaks.
    echo 'cat >/dev/null'
    echo "$u"
    echo ")"
    # The process substitution is followed by the input.
    # Because the bash started above "eats" all input,
    # it will not be executed as commands.
    cat
} | ssh host

sample execution:

host
What is you age?
You are 123 years old.
546869732069732074686520696e7075740a001122
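
The pipeline can be tried locally by letting a plain bash stand in for ssh host. This is only a sketch under that substitution: the marker is built from the PID and the time instead of uuidgen, od replaces xxd, and the one-line remote_script is made up for illustration:

```shell
#!/bin/bash
# Illustrative stand-in for script.sh: read one line, hexdump the rest.
remote_script='IFS= read -r ans; echo "got: $ans"; od -An -v -tx1'
marker="END_$$_$(date +%s)"

out=$(
    {
        echo 123              # consumed by read
        printf 'A\000A'       # binary input, consumed by od
    } | {
        echo "bash <(cat <<'$marker'"
        printf '%s\n' "$remote_script"
        echo 'cat >/dev/null' # drain anything the script left unread
        echo "$marker"
        echo ")"
        cat                   # forward the input after the script
    } | bash                  # with a real host this would be: ssh host
)
printf '%s\n' "$out"
```

The outer bash has to read its commands from the pipe without reading ahead, which is what leaves the bytes after the closing parenthesis available to the inner script.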

Now, as to:

My real problem is: I'm dynamically generating a shell script for remote configuration management, so I do createsh | ssh host. I'm embedding binary data into the shell script so it can be extracted on the remote host.

While you could separate two streams with a separator:

u=$(uuidgen); { cat script.sh; echo; echo "$u"; cat binarydata.txt; } | ssh host bash -c 'sed "/$1/{d;q}" >script.sh; cat > binarydata.txt' _ "$u"

that is just reinventing the wheel - it already exists and is called tar:

tar -cf - script.sh binarydata.txt | ssh host bash -c 'cd /tmpdir; <unpack tar>; ./script.sh binarydata.txt; rm -r /tmpdir'
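
A runnable sketch of the tar variant, with a local sh -c standing in for ssh host so that it can be tried without a remote machine. The directory layout, the toy script.sh, and binarydata.bin are illustrative assumptions; over ssh, the remote command string would additionally need quoting that survives the remote shell:

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir "$work/local" "$work/remote"

# A toy script and a payload containing a null byte.
printf '#!/bin/sh\nwc -c < "$1"\n' > "$work/local/script.sh"
printf 'A\000A' > "$work/local/binarydata.bin"

# Both files travel in one tar stream; the receiver unpacks, then runs the script.
tar -cf - -C "$work/local" script.sh binarydata.bin |
    sh -c 'cd "$1" && tar -xf - && sh ./script.sh binarydata.bin' _ "$work/remote"

cmp "$work/local/binarydata.bin" "$work/remote/binarydata.bin" && echo "payload intact"
```
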
KamilCuk