The issue is now fairly clear. I think we need to examine this from a standpoint of information theory.
- The input is a string of visible characters, currently represented in 8 bits each.
- The "alphabet" for this string is alphanumeric (26+26+10 symbols), plus about 20 special and reserved characters, 80+ characters total.
- There is no apparent redundancy in the generated string.
There are three main avenues to shortening a representation, taking advantage of
- Frequency of characters (hamming): replace a frequent character with fewer than 8 bits; longer bit strings will then be needed for rare characters.
- Frequency of substrings (compression): replace a frequent substring with a single character.
- Convert to a different base: ideally, len(alphabet).
The first two methods can lengthen the resulting string, as they require starting with a translation table. Also, since your strings appear to be taken from a uniform random distribution, there will be no redundancy or commonality to leverage. When the Shannon entropy is at or near the maximum over the input tokens, there is nothing to be gained in those methods.
This leaves us base conversion. We're using 8 bits -- 256 combinations -- to represent an alphabet of only 82 characters. A simple base conversion will save about 20%; the ratio is log(82) / log(256). If you want a cheap conversion, simply map into a 7-bit representation, a saving of 12.5%
Very simply, define a symbol ordinality on your character set, such as
0123456789ABCDEFGH...YZabcd...yz:/?#[]()@!$%&'*+,;=% (81 chars)
Now, compute the numerical equivalent of a given string, just as if you were hand-coding a conversion from a decimal or hex string. The resulting large integer is the compressed value. Write it out in bytes, or chop it into 32-bit integers, or whatever fits your intermediate storage medium.