I have a large text file (3 to 6 GB) containing only two ASCII characters, '0' and '1'. I would like to convert this string into true binary output, one bit per character, which can be written as a simple binary file.
Take the toy 'test.bin' file below, which is 568 bytes of ASCII. There are 70*8 = 560 characters; the remaining 8 bytes are the newline at the end of each line. Every '0' and '1' is a character encoded by 1 byte. I'd like the final output to be reduced to a 560-bit (70-byte) file.
0111000110000000101000100000100100011111010010101000001001010000111000
1001100011010100001101110000100010000010000000000001011000010011111100
0100001000010000010000010111011101011111000111111000111001100010100011
0011101000100001111111000001111110111111101101100000011000010101100001
0000000110110001000000000001000011110100000101101000001000010001010011
1101101111010101011110001110000010011001100101101101000111111101110101
1000001100101101010111110111110101100000000011001000100000000011001110
0101101001110010011110000100101001001111010011100100001001111111100110
...
I've found several solutions going the other way, converting a binary file into ASCII, but none for this direction. Others incorrectly expand the binary characters into their ASCII encoding, 1 --> 00110001, 0 --> 00110000. I found a C++ solution, but I'm looking for a simple bash or python script.
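To make the goal concrete, here is a small Python illustration of the difference between how each character is stored in the text file and what the packed output should contain. The variable names are only for illustration, and the bit order (first character becomes the most significant bit) is an assumption about the desired layout.

```python
# Each character in the text file occupies a full byte on disk.
for ch in "01":
    print(ch, "->", format(ord(ch), "08b"))

# Eight such characters should collapse into a single output byte.
# Assumption: the first character maps to the most significant bit.
bits = "01110001"  # first 8 characters of the sample above
packed = int(bits, 2).to_bytes(1, "big")
print(len(bits), "bytes of text ->", len(packed), "byte:", packed.hex())
```

So 8 bytes of text should become the single byte 0x71, not 8 copies of 0x30 or 0x31.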
=====================================================
Bash solution based on a small correction from here
tr -d '\n' < test.bin | perl -pe '$_ = pack("B*", $_)' > true_binary.txt
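Since the question also asks for Python, here is a minimal sketch of the same idea that reads the input in chunks, so a multi-gigabyte file does not have to fit in memory. The function names `pack_bits` and `convert_file` and the chunk size are hypothetical choices, not part of the original solution. It assumes the input contains only '0', '1' and whitespace, packs most significant bit first, and pads a trailing partial byte with zero bits on the right, which is meant to mirror what `pack "B*"` does.

```python
def pack_bits(src, dst, chunk_size=1 << 20):
    """Read ASCII '0'/'1' characters from src and write packed bytes to dst.

    src and dst are binary file objects. Bits are packed MSB first.
    """
    pending = b""  # bits left over that do not yet fill a whole byte
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        # Drop line endings and stray whitespace before packing.
        bits = pending + block.translate(None, b"\r\n \t")
        usable = len(bits) - len(bits) % 8
        if usable:
            # int(..., 2) keeps leading zeros implicit; to_bytes restores them.
            dst.write(int(bits[:usable], 2).to_bytes(usable // 8, "big"))
        pending = bits[usable:]
    if pending:
        # Assumption: pad the final partial byte with zero bits on the right.
        dst.write(int(pending.ljust(8, b"0"), 2).to_bytes(1, "big"))


def convert_file(in_path, out_path):
    """Convenience wrapper (hypothetical name) around pack_bits."""
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        pack_bits(src, dst)
```

Calling `convert_file("test.bin", "true_binary.txt")` on the toy file should produce a 70-byte output. The chunk size is a multiple of 8 only for convenience; leftover bits are carried between chunks, so lines of 70 characters are handled without padding in the middle of the stream.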