I want to pack many large long integers into memory with no space between them. How can I do that with Python 2.7 on Linux?
The long integers all use the same number of bits, and there is about 4 GB of data in total. Padding each integer by a few bits so that it occupies a multiple of 8 bits in memory is fine. I want to do bitwise operations on them later.
So far I am using a Python list, but I am not sure whether that leaves gaps in memory between the integers. Can ctypes help?
Thank you.
The old code uses bitarray (https://pypi.python.org/pypi/bitarray/0.8.1):
import bitarray
data = bitarray.bitarray()
with open('data.bin', 'rb') as f:
    data.fromfile(f)
result = data[:750000] & data[750000:750000*2]
This works, and the bitarray has no gaps in memory. However, bitarray's bitwise AND is about 6 times slower on this computer than native Python bitwise AND on long integers. Slicing the bitarray in the old code and indexing the list in the newer code take roughly the same amount of time.
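For comparison, the newer approach relies on a single native `&` covering all bits of each value at once. A minimal sketch of turning raw bytes into one Python long for that purpose (`bytes_to_long` is a hypothetical helper name, not part of the original code):

```python
import binascii

def bytes_to_long(raw):
    # Interpret a block of raw bytes as one big integer; a single
    # native `&` then operates on all of its bits at once.
    return int(binascii.hexlify(raw), 16)

a = bytes_to_long(b'\xff\x0f')   # 0xff0f
b = bytes_to_long(b'\x0f\xff')   # 0x0fff
result = a & b                   # 0x0f0f
```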
Newer code:
import cPickle as pickle
with open('data.pickle', 'rb') as f:
    data = pickle.load(f)
# data is a list of Python (long) integers
result = data[0] & data[1]
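With plain Python longs, counting the set bits of a result is also straightforward via `bin()`. A small sketch of the idea (toy values, not the original data):

```python
# Bitwise AND and popcount on plain Python (long) integers.
x = int('1' * 16, 2)           # 16 set bits: 0xffff
y = int('10' * 8, 2)           # alternating bits: 0xaaaa
r = x & y                      # 0xaaaa
popcount = bin(r).count('1')   # number of set bits in the result
```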
Numpy: in the code above, result = data[0] & data[1] creates a new long integer. Numpy's numpy.bitwise_and has an out parameter, which avoids allocating a new array. However, numpy's bool arrays seem to use one byte per bool instead of one bit per bool. Converting the bool array into a numpy.uint8 array avoids that waste, but then counting the number of set bits is too slow.
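A minimal sketch of the out= pattern described above, assuming small uint8 arrays (the unpackbits-based count at the end is the slow step referred to):

```python
import numpy as np

a = np.frombuffer(b'\xff\x0f\xaa', dtype=np.uint8).copy()
b = np.frombuffer(b'\x0f\xff\x0f', dtype=np.uint8).copy()
out = np.empty_like(a)
np.bitwise_and(a, b, out=out)         # reuses `out`; no new array allocated
nset = int(np.unpackbits(out).sum())  # count set bits (slow at 4 GB scale)
```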
Python's native array module can't handle the large long integers:
import array
xstr = ''
for i in xrange(750000):
    xstr += '1'
x = int(xstr, 2)
ar = array.array('l', [x, x, x])
# OverflowError: Python int too large to convert to C long
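On the ctypes question: a ctypes array is laid out contiguously with no gaps, but each element is a fixed-width C type, so a 750000-bit value would have to be split across many words rather than stored as one element. A sketch with toy values:

```python
import ctypes

n = 4
Words = ctypes.c_uint64 * n   # contiguous array of four 64-bit words
buf = Words(0xffffffffffffffff, 0, 0xaaaaaaaaaaaaaaaa, 1)
# The array occupies exactly n * 8 bytes back to back, with no padding
# between elements, but no single element can hold a 750000-bit integer.
size = ctypes.sizeof(buf)
```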