For a bioinformatics project I need to compress a large number of bitstrings (strings containing only 0's and 1's) in Ruby into smaller strings to reduce memory usage.
So ideally a string like "0001010010010010010001001" becomes something like "2a452c66". I first used MD5 hashes, until I read about possible collisions, which I would like to avoid.
I have tried a lot of different combinations of unpack, to_i, to_s, etc., but can't seem to find the right one.
The solution should:
- Keep any leading 0's.
- Be reversible.
- Compress (obviously).
- Avoid strange characters in the output. Preferably I would like to stay in the alphanumeric space (a-zA-Z0-9).
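
To make the requirements above concrete, here is a minimal sketch of the kind of round trip I have in mind. It is only one possible approach, not a definitive solution: it assumes that prefixing a sentinel "1" bit is an acceptable way to keep leading 0's, and that base 36 (0-9a-z, the largest radix Integer#to_s accepts) is compact enough. The method names compress_bits and decompress_bits are made up for illustration.

```ruby
# Hypothetical helpers, named for illustration only (not from any library).

# Pack a bitstring into a shorter base-36 string (characters 0-9 and a-z).
def compress_bits(bits)
  raise ArgumentError, "expected only 0s and 1s" unless bits.match?(/\A[01]*\z/)
  # Assumption: a sentinel "1" is prepended so that leading zeros
  # survive the conversion to an Integer.
  ("1" + bits).to_i(2).to_s(36)
end

# Reverse of compress_bits: recover the original bitstring.
def decompress_bits(packed)
  # Convert back to binary and drop the sentinel bit added above.
  packed.to_i(36).to_s(2)[1..-1]
end

bits   = "0001010010010010010001001"
packed = compress_bits(bits)
puts packed
puts decompress_bits(packed) == bits
```

Each base-36 character carries a little over 5 bits, so the packed string should be roughly a fifth of the original length. Unlike a hash, the mapping is one-to-one, so collisions cannot occur.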
Thanks!