Why is unpack so slow?

Question

I have a simple example:

def __init__(self,string):
    self.string = string

def UI32(self):
    tmp = self.string[:4]
    self.string = self.string[4:]
    return unpack(">I",tmp)[0]

data = file.read()
U = UI(data)
for i in range(60000):
    test = UI32()

Total time: 22 seconds!

`range(60000)` creates a 60,000-element list. Use `xrange(60000)` instead. – ThiefMaster, Jul 29 '11 at 09:31

score 5 · Answer 1 · answered Jul 29 '11 at 09:26

First of all, I cannot reproduce the 22s on my system (Intel Nehalem, 64-bit Ubuntu, Python 2.6.5).

The following takes 1.4s (this is essentially your code with some blanks filled in by me):

import struct

class UI(object):
    def __init__(self,string):
        self.string = string

    def UI32(self):
        tmp = self.string[:4]
        self.string = self.string[4:]
        return struct.unpack(">I",tmp)[0]

U = UI('0' * 240000)
for i in range(60000):
    test = U.UI32()

Now, there are several glaring inefficiencies here, especially around self.string.

I've rewritten your code like so:

import struct

class UI(object):
    def __init__(self,string):
        fmt = '>%dI' % (len(string) // 4)
        self.ints = struct.unpack(fmt, string)
    def __iter__(self):
        return iter(self.ints)

U = UI('0' * 240000)
count = 0
for test in U:
    count += 1
print count

On the same machine it now takes 0.025s.
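To check the difference on your own machine, here is a rough timing sketch. The function names `unpack_by_slicing` and `unpack_in_bulk` are made up for this illustration, and the numbers will vary with hardware and Python version, so treat it as a way to compare the two approaches, not as a benchmark.

```python
import struct
import timeit

def unpack_by_slicing(data):
    # Approach from the question: re-slice the remaining buffer for every value
    out = []
    while data:
        out.append(struct.unpack(">I", data[:4])[0])
        data = data[4:]
    return out

def unpack_in_bulk(data):
    # Rewritten approach: a single unpack call with a repeat count in the format
    return list(struct.unpack(">%dI" % (len(data) // 4), data))

data = b"0" * 240000
slow = timeit.timeit(lambda: unpack_by_slicing(data), number=1)
fast = timeit.timeit(lambda: unpack_in_bulk(data), number=1)
print("slicing: %.3fs, bulk: %.3fs" % (slow, fast))
```

Both functions return the same values; only the number of buffer copies and `unpack` calls differs.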

score 3 · Answer 2 · answered Jul 29 '11 at 09:21

On every one of the 60,000 loop iterations, you copy the entire remaining buffer, so the total work grows with the square of the input size:

self.string = self.string[4:]

It would be more efficient to simply walk through the string using indexes and at the end clear the variable.
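A minimal sketch of that idea, assuming the data is already in memory as a byte string. The `pos` attribute is a name introduced here for illustration; `struct.unpack_from` is the standard-library call that reads at a given offset.

```python
import struct

class UI(object):
    """Reads big-endian unsigned 32-bit integers from a byte string by offset."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # index of the next unread byte (hypothetical attribute name)

    def UI32(self):
        # unpack_from reads at an offset, so the buffer is never re-sliced
        value = struct.unpack_from(">I", self.data, self.pos)[0]
        self.pos += 4
        return value

U = UI(b"0" * 240000)
values = [U.UI32() for _ in range(60000)]
```

This keeps the one-value-at-a-time interface from the question while leaving the buffer untouched; only the integer offset changes between calls.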

Steve-o

And use `xrange` instead of `range`. See [Should you always favor xrange() over range()?](http://stackoverflow.com/questions/135041/should-you-always-favor-xrange-over-range) – Paolo Moretti Jul 29 '11 at 09:28
