So I have several very large files which represent each position in the human genome. Both of these files are binary masks for a certain type of "score" for each position in the genome and I am interested in getting a new mask where both scores are "1" i.e. the intersection of the two masks.
For example:
File 1: 00100010101
File 2: 11111110001
Desired output: 00100010001
In python, it is really fast to read these big files (they contain between 50-250 million characters) into strings. However, I can't just &
the strings together. I CAN do something like
bin(int('0001',2) & int('1111', 2))
but is there a more direct way that doesn't require that I pad in the extra 0's and convert back to a string in the end?