3

I have seen some huge installation files for Unix-like systems (install.sh for Matlab or Mathematica, for example). They must embed quite a lot of binary data, such as icons, sounds, and graphics, in the script itself. I am wondering how that can be done, since it could be useful for simplifying a project's file structure.

I am particularly interested in doing this with Python and/or Bash.

Existing methods that I know of in Python:

  1. Use a bytes literal, x = b'\x23\xa3\xef' ...; this is terribly inefficient, taking about half a MB for a 100 KB wav file.
  2. base64: better than option 1, but it still enlarges the data by a factor of 4/3.
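To put rough numbers on the two options above, here is a small sketch. The helper names are hypothetical, and it assumes the worst case for option 1, where every byte is written out as a four-character `\xNN` escape.

```python
import base64


def escaped_literal_size(data: bytes) -> int:
    """Source characters needed if every byte is written as \\xNN (4 chars each)."""
    return 4 * len(data)


def base64_size(data: bytes) -> int:
    """Characters needed for the base64 text of the same data."""
    return len(base64.b64encode(data))


# Stand-in for a 100 KiB binary file (hypothetical sample data).
sample = bytes(range(256)) * 400

print(len(sample), escaped_literal_size(sample), base64_size(sample))
```

The escaped form costs four characters per byte, while base64 costs four characters per three bytes, which is where the 4/3 factor comes from.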

Are there other (better) ways to do this?

Jonathan Leffler
qed

2 Answers

2

You can use base64 + compression (using bz2 for instance) if that suits your data (e.g., if you're not embedding already compressed data).

For instance, to create your data (say your data consist of 100 null bytes followed by 200 bytes with value 0x01):

>>> import bz2
>>> bz2.compress(b'\x00' * 100 + b'\x01' * 200).encode('base64').replace('\n', '')
'QlpoOTFBWSZTWcl9Q1UAAABBBGAAQAAEACAAIZpoM00SrccXckU4UJDJfUNV'

And to use it (in your script) to write the data to a file:

import bz2
data = 'QlpoOTFBWSZTWcl9Q1UAAABBBGAAQAAEACAAIZpoM00SrccXckU4UJDJfUNV'
with open('/tmp/testfile', 'wb') as fdesc:  # binary mode, since the payload is raw bytes
    fdesc.write(bz2.decompress(data.decode('base64')))
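On Python 3 the 'base64' codec shortcuts used above are not available on str and bytes, so the same idea can be sketched with the base64 module instead. The function names encode_payload and decode_payload are made up for illustration; the sample data is the one from this answer.

```python
import base64
import bz2


def encode_payload(raw: bytes) -> str:
    """Compress with bz2, then base64-encode into an ASCII string for embedding."""
    return base64.b64encode(bz2.compress(raw)).decode("ascii")


def decode_payload(text: str) -> bytes:
    """Reverse of encode_payload: base64-decode, then decompress."""
    return bz2.decompress(base64.b64decode(text))


# Same sample as above: 100 null bytes followed by 200 bytes of 0x01.
raw = b"\x00" * 100 + b"\x01" * 200
embedded = encode_payload(raw)  # paste this string into the script
restored = decode_payload(embedded)
print(len(raw), len(embedded), restored == raw)
```

To write the restored bytes to disk, open the target file in 'wb' mode and write the result of decode_payload(embedded).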
Pierre
  • Nice, could you give a small example? – qed Dec 23 '14 at 00:39
  • 1
    This answer doesn't work in Python 3, where the `bytes.encode` and `str.decode` methods do not exist (their types would be backwards from how text encoding works). You could `import base64` and use `base64.b64decode(data)` instead of `data.decode('base64')`, and similarly use `base64.b64encode` when creating the string. – rspeer Apr 07 '20 at 22:32
1

Here's a quick and dirty way. Create the following script, called myInstaller:

#!/bin/bash

dd if="$0" of=payload bs=1 skip=54

exit

Then append your binary to the script, and make it executable:

cat myBinary >> myInstaller
chmod +x myInstaller

When you run the script, dd skips the script text and copies everything after it (the binary portion) into the file named by of=. The payload could be a tar file or anything else, so you can do additional processing (unarchiving, setting execute permissions, etc.) after the dd command. Just adjust the number in skip= to match the total length in bytes of the script before the binary data starts; for the script exactly as shown, including the newline after exit, that is 54.
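Counting the bytes by hand is error-prone, because the skip value is itself part of the script whose length it describes. As a rough sketch (in Python, with hypothetical helper names), the stub can be regenerated until the number it contains matches its own length, and the payload appended afterwards:

```python
# Template of the stub script above; {skip} is filled in with the stub's own length.
STUB_TEMPLATE = '#!/bin/bash\n\ndd if="$0" of=payload bs=1 skip={skip}\n\nexit\n'


def make_stub(template: str = STUB_TEMPLATE) -> bytes:
    """Render the stub repeatedly until skip equals the stub's byte length."""
    skip = 0
    for _ in range(10):  # converges after a few rounds for ordinary scripts
        stub = template.format(skip=skip).encode("ascii")
        if len(stub) == skip:
            return stub
        skip = len(stub)
    raise RuntimeError("skip value did not converge")


def build_installer(payload: bytes) -> bytes:
    """Concatenate the stub and the binary payload, like `cat myBinary >> myInstaller`."""
    return make_stub() + payload


def extract_payload(installer: bytes) -> bytes:
    """What the dd command does: drop the first len(stub) bytes."""
    return installer[len(make_stub()):]


installer = build_installer(b"\x00\x01\x02binary data")
print(len(make_stub()), extract_payload(installer))
```

For the template shown, the stub comes out at 54 bytes, matching the skip=54 in the script above. Writing the result of build_installer to a file and running chmod +x on it gives the same installer as the manual steps.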

Ivan X
  • A frequent use is a shell script that unpacks the tarball in the right place and performs some additional checks. Java packages for Linux were built like that. – mcoolive Dec 22 '14 at 13:59