We are counting photons and time-tagging with this FPGA counter.We got about 500MB of data per minutes. I am getting 32bits of data in hex string *32-bit signed integers stored using little-endian byte order. Currently I am doing like:
def getall(file):
data1 = np.memmap(file, dtype='<i4', mode='r')
d0=0
raw_counts=[]
for i in data1:
binary = bin(i)[2:].zfill(8)
decimal = int(binary[5:],2)
if binary[:1] == '1':
raw_counts.append(decimal)
counter=collections.Counter(raw_counts)
sorted_counts=sorted(counter.items(), key=lambda pair: pair[0], reverse=False)
return counter,counter.keys(),counter.values()
I think this part ( ( No it is not. I found out by profiling my program.) Is there any way to speed it up? So far I only need the binary bits from [5:]. I don't need all 32bits. So I think the parsing the 32bits to last 27bits is taking much of the time. Thanks,binary = bin(i)[2:].zfill(8);decimal = int(binary[5:],2)
) is slowing down the process.
*Update 1
J.F.Sebastian pointed me it is not in hex string.
*Update 2
Here is the final code if any one needs it. I ended up using np.unique instead of collection counter. At the end , I converted back to collection counter because I want to get accumulative counting.
#http://stackoverflow.com/questions/10741346/numpy-most-efficient-frequency-counts-for-unique-values-in-an-array
def myc(x):
unique, counts = np.unique(x, return_counts=True)
return np.asarray((unique, counts)).T
def getallfast(file):
data1 = np.memmap(file, dtype='<i4', mode='r')
data2=data1[np.nonzero((~data1 & (31 <<1)))] & 0x7ffffff #See J.F.Sebastian's comment.
counter=myc(data2)
raw_counts=dict(zip(counter[:,0],counter[:,1]))
counter=collections.Counter(raw_counts)
return counter,counter.keys(),counter.values()
However this one looks like the fastest version for me. data1[np.nonzero((~data1 & (31 <<1)))] & 0x7ffffff
is slowing down compared to counting first and convert the data later binary = bin(counter[i,0])[2:].zfill(8)
def myc(x):
unique, counts = np.unique(x, return_counts=True)
return np.asarray((unique, counts)).T
def getallfast(file):
data1 = np.memmap(file, dtype='<i4', mode='r')
counter=myc(data1)
xnew=[]
ynew=[]
raw_counts=dict()
for i in range(len(counter)):
binary = bin(counter[i,0])[2:].zfill(8)
decimal = int(binary[5:],2)
xnew.append(decimal)
ynew.append(counter[i,1])
raw_counts[decimal]=counter[i,1]
counter=collections.Counter(raw_counts)
return counter,xnew,ynew