
I have the following dict which I want to write to a file in binary:

data = {(7, 190, 0): {0: 0, 1: 101, 2: 7, 3: 0, 4: 0}, 
        (7, 189, 0): {0: 10, 1: 132, 2: 17, 3: 20, 4: 40}}

I used the struct module like this:

import struct

packed = []
for ssd, add_val in data.iteritems():
    # 0xcafe is a marker that tells me where each key tuple starts
    pack_ssd = struct.pack('HBHB', 0xcafe, *ssd)
    packed.append(pack_ssd)
    for add, val in add_val.iteritems():
        pack_add_val = struct.pack('HH', add, val)
        packed.append(pack_add_val)

The output of this is packed = ['\xfe\xca\x07\x00\xbe\x00\x00', '\x00\x00\x00\x00', '\x01\x00e\x00', '\x02\x00\x07\x00', '\x03\x00\x00\x00', '\x04\x00\x00\x00', '\xfe\xca\x07\x00\xbd\x00\x00', '\x00\x00\n\x00', '\x01\x00\x84\x00', '\x02\x00\x11\x00', '\x03\x00\x14\x00', '\x04\x00(\x00']

I then write this to a binary file:

with open('test.bin', 'wb') as ifile:
    for pack in packed:
        ifile.write(pack)

Here is what the binary file looks like: '\xfe\xca\x07\x00\xbe\x00\x00\x00\x00\x00\x00\x01\x00e\x00\x02\x00\x07\x00\x03\x00\x00\x00\x04\x00\x00\x00\xfe\xca\x07\x00\xbd\x00\x00\x00\x00\n\x00\x01\x00\x84\x00\x02\x00\x11\x00\x03\x00\x14\x00\x04\x00(\x00'

It was all fine until I tried to unpack the data. I want to read the contents of the binary file and rebuild the dict as it originally looked. This is how I tried to unpack it, but I always get an error:

unpack = []
while True:
    chunk = ifile.read(log_size)
    if len(chunk) == log_size:
        str = struct.unpack('HBHB', chunk)
        unpack.append(str)
        chunk = ifile.read(log1_size)
        str = struct.unpack('HH', chunk)
        unpack.append(str)

Traceback (most recent call last):
File "<interactive input>", line 7, in ?
error: unpack str size does not match format

I realize the way I tried to unpack will always run into problems, but I can't find a good way to unpack the contents of the binary file. Any help is much appreciated.

pcurry
user2345778
  • Unless this is homework, http://stackoverflow.com/questions/8968884/python-serialization-why-pickle http://docs.python.org/2/library/pickle.html http://docs.python.org/2/library/marshal.html – Patashu May 07 '13 at 03:44

1 Answer


If you need to write something custom, I would suggest doing the following:

1) 64-bit integer: number of keys

2) 64-bit integer * 3 * number of keys: key tuple data

Then, for each key i:

3i) 64-bit integer: number of keys in dictionary i

4i) 64-bit integer * 2 * number of keys in dictionary i: key data, value data, key data, value data...

After that, make sure you read and write with the same endianness, and that an invalid length at any point (too high or too low) doesn't crash your program. Then you are good.
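To illustrate the endianness point, here is a small sketch using the struct module. The variable names are only for illustration. Prefixing the format string with '<' or '>' fixes the byte order and turns off native padding, so the sizes are the same on every machine.

```python
import struct

value = 0xcafe

# Explicit byte order: '<' is little-endian, '>' is big-endian.
# 'q' is a signed 64-bit integer, so both results are exactly 8 bytes.
little = struct.pack('<q', value)
big = struct.pack('>q', value)

# With an explicit prefix no padding bytes are inserted:
# H (2) + B (1) + H (2) + B (1) = 6 bytes.
record_size = struct.calcsize('<HBHB')

# Without a prefix, struct uses native size and alignment, which is how
# the 'HBHB' header in the question ended up 7 bytes long (one padding
# byte after the first B). The native size is platform-dependent.
native_size = struct.calcsize('HBHB')
```

Whichever prefix you pick, use the same one in every pack and unpack call.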

The idea is that at every step the unpacker is either expecting a length or reading a known number of values, so it is always unambiguous where each piece starts and ends, as long as you follow the format.
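Here is a minimal sketch of that layout using the struct module, assuming every number fits in a signed 64-bit integer and every key tuple has exactly three elements. The names pack_dict, unpack_dict and _read_ints are made up for illustration, and little-endian ('<q') is an arbitrary choice. It is one way to do it, not the only one.

```python
import struct

INT64 = struct.Struct('<q')  # little-endian signed 64-bit, always 8 bytes


def pack_dict(data):
    """Serialize {(a, b, c): {k: v, ...}} with length prefixes."""
    keys = list(data)
    out = [INT64.pack(len(keys))]            # 1) number of keys
    for key in keys:                         # 2) all key tuples
        out.append(struct.pack('<3q', *key))
    for key in keys:
        inner = data[key]
        out.append(INT64.pack(len(inner)))   # 3i) entries in dict i
        for k, v in inner.items():           # 4i) key, value, key, value...
            out.append(struct.pack('<2q', k, v))
    return b''.join(out)


def _read_ints(blob, offset, count):
    """Read `count` int64 values at `offset`; return (values, new_offset)."""
    if count < 0:
        raise ValueError('negative length in data')
    end = offset + 8 * count
    if end > len(blob):
        raise ValueError('truncated or corrupt data')
    values = struct.unpack_from('<%dq' % count, blob, offset)
    return values, end


def unpack_dict(blob):
    """Inverse of pack_dict; raises ValueError on malformed input."""
    (n_keys,), offset = _read_ints(blob, 0, 1)
    flat, offset = _read_ints(blob, offset, 3 * n_keys)
    keys = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
    result = {}
    for key in keys:
        (n_items,), offset = _read_ints(blob, offset, 1)
        pairs, offset = _read_ints(blob, offset, 2 * n_items)
        result[key] = dict(zip(pairs[0::2], pairs[1::2]))
    if offset != len(blob):
        raise ValueError('trailing bytes after last record')
    return result


data = {(7, 190, 0): {0: 0, 1: 101, 2: 7, 3: 0, 4: 0},
        (7, 189, 0): {0: 10, 1: 132, 2: 17, 3: 20, 4: 40}}

blob = pack_dict(data)
restored = unpack_dict(blob)
```

The bounds check in _read_ints is what handles a bad length: a count that is too large or negative turns into a ValueError before any unpacking is attempted. To store the result, write blob to a file opened with 'wb' and read it back with 'rb'.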

Patashu
  • hi patashu, yes am looking at writing a custom function for this.. do you think you give me an eg on your suggestion based on my data dict above, it looks like sth that will work for me, thk you for ya time! – user2345778 May 07 '13 at 04:16
  • @user2345778 The only hard part should be finding a function that formats a python number as a 64 bit integer binary data (or whatever number of bits you know will not be exceeded). – Patashu May 07 '13 at 04:30