Reading binary file bit by bit python

Question

I need help decoding the event stream of the N-MNIST dataset, which is stored as a binary .bin file.

Each event occupies 40 bits, laid out as follows:

bit 39 - 32: Xaddress (in pixels)

bit 31 - 24: Yaddress (in pixels)

bit 23: Polarity (0 for OFF, 1 for ON)

bit 22 - 0: Timestamp (in microseconds)
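
As an illustration of this layout, here is a minimal sketch that decodes a single 5-byte event with integer shifts and masks. The function name `decode_event` is hypothetical, and the sketch assumes the first byte in the file holds bits 39 - 32 (big-endian order), which is what the MATLAB reader shipped with the dataset implies.

```python
def decode_event(chunk):
    """Decode one 40-bit event from exactly 5 bytes (assumed big-endian)."""
    if len(chunk) != 5:
        raise ValueError("an event is exactly 5 bytes")
    word = int.from_bytes(chunk, byteorder="big")  # one 40-bit integer
    x = (word >> 32) & 0xFF        # bits 39 - 32
    y = (word >> 24) & 0xFF        # bits 31 - 24
    polarity = (word >> 23) & 0x1  # bit 23
    timestamp = word & 0x7FFFFF    # bits 22 - 0
    return x, y, polarity, timestamp


# Hand-built example event: x=10, y=20, ON polarity, timestamp 0x010203
print(decode_event(bytes([10, 20, 0x81, 0x02, 0x03])))  # (10, 20, 1, 66051)
```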

A Matlab function is given:

% TD = Read_Ndataset(filename)
% returns the Temporal Difference (TD) events from binary file for the
% N-MNIST and N-Caltech101 datasets. See garrickorchard.com\datasets for
% more info
function TD = Read_Ndataset(filename)
eventData = fopen(filename);
evtStream = fread(eventData);
fclose(eventData);

TD.x    = evtStream(1:5:end)+1; %pixel x address, with first pixel having index 1
TD.y    = evtStream(2:5:end)+1; %pixel y address, with first pixel having index 1
TD.p    = bitshift(evtStream(3:5:end), -7)+1; %polarity, 1 means off, 2 means on
TD.ts   = bitshift(bitand(evtStream(3:5:end), 127), 16); %time in microseconds
TD.ts   = TD.ts + bitshift(evtStream(4:5:end), 8);
TD.ts   = TD.ts + evtStream(5:5:end);
return
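
For comparison, a fairly direct Python translation of this MATLAB function could look like the sketch below. It is not part of the dataset's tooling: the names `decode_events` and `read_ndataset` are made up here, and unlike the MATLAB version it keeps x and y zero-based and polarity as 0/1 instead of adding 1.

```python
def decode_events(data):
    """Decode raw event bytes into a dict of lists, mirroring the MATLAB reader."""
    usable = len(data) - len(data) % 5  # ignore a trailing partial event, if any
    data = data[:usable]
    third = data[2::5]  # every third byte: polarity bit plus top 7 timestamp bits
    return {
        "x": list(data[0::5]),                 # evtStream(1:5:end)
        "y": list(data[1::5]),                 # evtStream(2:5:end)
        "p": [b >> 7 for b in third],          # bitshift(..., -7)
        "ts": [((hi & 127) << 16) | (mid << 8) | lo
               for hi, mid, lo in zip(third, data[3::5], data[4::5])],
    }


def read_ndataset(filename):
    """Read a whole .bin file and decode all of its events."""
    with open(filename, "rb") as f:
        return decode_events(f.read())
```

Slicing a `bytes` object with a step of 5 plays the role of MATLAB's `1:5:end` indexing, and iterating over `bytes` already yields integers, so no per-bit handling is needed.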

So I tried implementing it in Python. I have a function that reads the raw binary data byte by byte into a list:

 #---------------------------------READ BYTES----------------------------#

 def read_bytes(filepath):
     """Return the file contents as a list of integers in 0..255, one per byte."""
     with open(filepath, 'rb') as file:
         # Iterating over a bytes object yields each byte as an int.
         return list(file.read())

This list is used in my main function. Since each event consists of 40 bits, it should take 5 bytes (40/8 = 5). As there isn't an option to read a file bit by bit in Python, I read the bytes one by one, store them, and then split each byte into its bits:

 def read_bits(this_byte):
     this_bits = []
     for i in range(8):
         this_bits.append(bit_from_string(str(this_byte),i))
     return(this_bits)
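
Since `read_bytes` already returns integers, the bits of a byte can also be taken straight from its numeric value with a shift and a mask, with no string conversion involved. A minimal sketch (the name `byte_to_bits` is hypothetical, and the least-significant-bit-first order is an assumption chosen to match how `trans_bits` weights index `i` with `2**i`):

```python
def byte_to_bits(value):
    """Return the 8 bits of an integer in 0..255, least significant bit first."""
    if not 0 <= value <= 255:
        raise ValueError("expected a single byte value")
    return [(value >> i) & 1 for i in range(8)]


print(byte_to_bits(5))  # [1, 0, 1, 0, 0, 0, 0, 0]
```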

I then collect the bits of the first 5 bytes into one array so that I can access them in the documented order:

 #---------------------------------MAIN----------------------------#
 def read_matlab(filepath):
     bytes = read_bytes(filepath)
     bits = []
     for j in range(5):
         current_byte = bytes[j]
         current_bits = read_bits(current_byte)
         bits.append(current_bits)
     bits = np.reshape(bits,(1,40))
     print(bits)

     TD_x=[]
     TD_y=[]
     TD_pol=[]
     TD_time=[]

     for z in range(40):
         if z <= 22:
             TD_time.append(bits[0][z])
         elif z == 23:
             TD_pol.append(bits[0][z])
         elif z <= 31:
             TD_y.append(bits[0][z])
         elif z <= 40:
             TD_x.append(bits[0][z])
     x = trans_bits(TD_x)
     y = trans_bits(TD_y)
     pol = trans_bits(TD_pol)
     time = trans_bits(TD_time)
     print('X: ',TD_x)
     print('Y: ',TD_y)
     print('POL: ',TD_pol)
     print('TIME: ',TD_time)

     print('X: ',x)
     print('Y: ',y)
     print('POL: ',pol)
     print('TIME: ',time/10000)

For that I use another function to translate the zeros and ones into an int value:

 def trans_bits(TD):
     value = 0
     cur_val=0
     for i in range(len(TD)):
         if TD[i] == 1:
             cur_val = 2**i
             value += cur_val

     return(value)

I ran this on just the first 5 bytes and got the following output in the terminal:

 [[1 0 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 0 0 0 1 1 0 0]]
 X:  [1, 0, 0, 0, 1, 1, 0, 0]
 Y:  [1, 1, 0, 0, 1, 1, 0, 0]
 POL:  [0]
 TIME:  [1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0]
 X:  49
 Y:  51
 POL:  0
 TIME:  322.4369

But all the values are wrong: x and y should be at most 28, not 49/51. So there must be an error in how I read the bits, and I don't have a clue why.
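
One observation, offered as a guess because `bit_from_string` is not shown: 49 and 51 are the ASCII codes of the characters '1' and '3'. That would be consistent with the bits being taken from the decimal text produced by `str(this_byte)` rather than from the byte's numeric value. The short check below uses only the bit lists printed above; `bits_to_int` is a hypothetical helper equivalent to `trans_bits`.

```python
def bits_to_int(bits):
    """Combine bits given least significant first, like trans_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


x_bits = [1, 0, 0, 0, 1, 1, 0, 0]  # the printed X bits
y_bits = [1, 1, 0, 0, 1, 1, 0, 0]  # the printed Y bits

for name, bits in (("X", x_bits), ("Y", y_bits)):
    value = bits_to_int(bits)
    print(name, value, repr(chr(value)))
# X 49 '1'
# Y 51 '3'
```

If that guess is right, working on the integer byte values directly would avoid the problem.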

Umm... isn't that just 4 bytes (eg 32 bits each record)? Not 40 bits? — Jon Clements, Jul 04 '22 at 22:46
Hey thx for your reply. In the documentation (https://www.garrickorchard.com/datasets/n-mnist) it is described as: Each example is a separate binary file consisting of a list of events. Each event occupies 40 bits as described below: bit 39 - 32: Xaddress (in pixels) bit 31 - 24: Yaddress (in pixels) bit 23: Polarity (0 for OFF, 1 for ON) bit 22 - 0: Timestamp (in microseconds) — helpingHand34, Jul 04 '22 at 22:54
Anyway - it'd be helpful if you could [edit] to include some sample data for what the first 5 rows are? eg: `print(repr(your_file.read(5 * 32)))` and what your current output is — Jon Clements, Jul 04 '22 at 22:55
[**`struct.unpack`**](https://docs.python.org/3/library/struct.html#struct.unpack) is your friend. — Peter Wood, Jul 04 '22 at 22:55
@Peter yup... `struct.iter_unpack` with a bit of datetime conversion probably is it... just trying to make sure :p — Jon Clements, Jul 04 '22 at 22:56
https://tonic.readthedocs.io/en/latest/reference/datasets.html#n-mnist The source for this class will help — jkr, Jul 04 '22 at 22:58
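
The `struct` suggestion from the comments could look roughly like the sketch below. `struct` has no format code for a 3-byte integer, so this version reads the last three bytes as a `3s` field and converts them with `int.from_bytes`. The name `iter_events` is hypothetical, and big-endian byte order is assumed from the MATLAB reader.

```python
import struct


def iter_events(data):
    """Yield (x, y, polarity, timestamp) tuples from raw 5-byte events."""
    # ">BB3s": two unsigned bytes followed by a 3-byte field, 5 bytes in total.
    # iter_unpack raises struct.error if len(data) is not a multiple of 5.
    for x, y, tail in struct.iter_unpack(">BB3s", data):
        word = int.from_bytes(tail, byteorder="big")  # 24 bits
        yield x, y, word >> 23, word & 0x7FFFFF


print(list(iter_events(bytes([10, 20, 0x81, 0x02, 0x03]))))  # [(10, 20, 1, 66051)]
```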
