55

Using StringIO as a string buffer is slower than using a list as a buffer.

So when should StringIO be used?

from io import StringIO


def meth1(string):
    a = []
    for i in range(100):
        a.append(string)
    return ''.join(a)

def meth2(string):
    a = StringIO()
    for i in range(100):
        a.write(string)
    return a.getvalue()


if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())

Results:

16.7872819901
18.7160351276
Brian Burns
simha

4 Answers

35

The main advantage of StringIO is that it can be used wherever a file is expected. For example, in Python 2 you can capture everything printed to stdout:

import sys
import StringIO

out = StringIO.StringIO()
sys.stdout = out
print "hi, I'm going out"
sys.stdout = sys.__stdout__
print out.getvalue()
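
For readers on Python 3, the same idea can be sketched with io.StringIO and contextlib.redirect_stdout. This is only an illustration, not part of the original answer: capture_output and greet are hypothetical names made up for the example.

```python
import contextlib
import io


def capture_output(func, *args, **kwargs):
    """Run func and return whatever it printed to stdout (hypothetical helper)."""
    buffer = io.StringIO()
    # redirect_stdout puts the previous sys.stdout back when the block exits,
    # even if func raises an exception.
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def greet():
    print("hi, I'm going out")


captured = capture_output(greet)
print(captured, end="")
```

Compared with assigning to sys.stdout by hand, the context manager makes it harder to forget the restore step.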
Brian Burns
TryPyPy
  • Can it be used with `with` in python 2 ? From what I see here no: http://bugs.python.org/issue1286 – Mr_and_Mrs_D Dec 29 '14 at 22:19
  • @Mr_and_Mrs_D please see [http://bugs.python.org/issue1286#msg176512](http://bugs.python.org/issue1286#msg176512) which states that it will work from 2.5 up. What more do you want, blood on it? :D – Mark Lawrence Jul 02 '16 at 22:45
  • @MarkLawrence: no it won't - reread the comment you linked - you have to roll _your own_ context manager – Mr_and_Mrs_D Jul 03 '16 at 13:19
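
On the `with` question, a minimal sketch in Python 3, where io.StringIO is a context manager on its own. The contextlib.closing wrapper shown second works with any object that has a close() method, which is one possible way to avoid rolling your own context manager for the Python 2 StringIO.StringIO class; that it is what the linked issue had in mind is an assumption.

```python
import contextlib
import io

# io.StringIO supports the with statement directly in Python 3.
with io.StringIO() as buf:
    buf.write("first line\n")
    direct = buf.getvalue()  # read the value before the block closes the buffer

# contextlib.closing() calls close() on exit for any object that has one.
with contextlib.closing(io.StringIO()) as buf2:
    buf2.write("second line\n")
    wrapped = buf2.getvalue()

# Both buffers are closed once their blocks end.
both_closed = buf.closed and buf2.closed
```

Calling getvalue() after the block would raise ValueError, because the buffer is closed at that point.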
27

If you are measuring speed, you should use cStringIO.

From the docs:

The module cStringIO provides an interface similar to that of the StringIO module. Heavy use of StringIO.StringIO objects can be made more efficient by using the function StringIO() from this module instead.

But the point of StringIO is to be a file-like object, for when something expects a file and you don't want to use actual files.
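
A small sketch of that use case, assuming Python 3: the csv module wants a file object to write to, and io.StringIO stands in for one. The helper name rows_to_csv and the sample rows are invented for the illustration.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows to CSV text without touching the filesystem (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)  # csv.writer only needs an object with write()
    writer.writerows(rows)
    return buffer.getvalue()


text = rows_to_csv([["name", "score"], ["ada", 1]])

# Reading works the same way: wrap the text and hand it to csv.reader.
parsed = list(csv.reader(io.StringIO(text)))
```

Note that csv.reader returns every field as a string, so the integer 1 comes back as "1".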

Edit: I noticed you use from io import StringIO, so you are probably on Python 3, or at least 2.6. The separate StringIO and cStringIO modules are gone in Python 3. I am not sure which implementation backs io.StringIO. There is also io.BytesIO.
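
To illustrate the difference between the two io classes, here is a minimal Python 3 sketch: io.StringIO holds text (str), io.BytesIO holds binary data (bytes), and neither accepts the other type. The variable names are only for the example.

```python
import io

text_buf = io.StringIO()
text_buf.write("caf\u00e9")  # StringIO accepts str only

bytes_buf = io.BytesIO()
bytes_buf.write("caf\u00e9".encode("utf-8"))  # BytesIO accepts bytes-like objects only

text_value = text_buf.getvalue()    # a str of 4 characters
bytes_value = bytes_buf.getvalue()  # a bytes object of 5 bytes (the accented e takes two in UTF-8)

# Mixing the types raises TypeError instead of converting silently.
try:
    text_buf.write(b"raw")
except TypeError:
    mixed_write_allowed = False
else:
    mixed_write_allowed = True
```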

plundra
  • Try it with `cStringIO`. Results: List: 17, cString: 33. – user225312 Jan 19 '11 at 09:59
  • io.StringIO is a C implementation, if that exists on your platform. If not it uses a Python implementation fallback. The reason it's slower is because he is doing something that he doesn't need StringIO for in the first place. – Lennart Regebro Jan 19 '11 at 10:14
  • The module `cStringIO` has been [removed](https://docs.python.org/3/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit) in Python 3. – Jeyekomon Aug 02 '21 at 08:47
18

Well, I don't know if I would call that using it as a "buffer". You are just repeating a string 100 times, in two complicated ways. Here is an uncomplicated way:

def meth3(string):
    return string * 100

If we add that to your test:

if __name__ == '__main__':

    from timeit import Timer
    string = "This is test string"
    # Make sure all three methods produce the same result:
    assert meth1(string) == meth3(string)
    assert meth2(string) == meth3(string)
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
    print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())

It turns out to be way faster as a bonus:

21.0300650597
22.4869811535
0.811429977417

If you want to create a bunch of strings, and then join them, meth1() is the correct way. There is no point in writing it to StringIO, which is something completely different, namely a string with a file-like stream interface.
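
As a rough illustration of what "a string with a file-like stream interface" means in practice, the following Python 3 sketch reads from an io.StringIO the way one would read from an open text file. The sample content is made up.

```python
import io

stream = io.StringIO("alpha\nbeta\ngamma\n")

first = stream.readline()  # reads up to and including the first newline
rest = stream.read()       # reads from the current position to the end

stream.seek(0)             # rewind, exactly as with a real file
lines = [line.rstrip("\n") for line in stream]  # iterate line by line
```

None of these operations exist on a plain str or on a list of strings, which is why StringIO is the tool for code that expects a file and not for plain concatenation.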

Lennart Regebro
-1

Another approach, based on Lennart Regebro's. It is faster than the list method (meth1):

def meth4(string):
    a = StringIO(string * 100)
    contents = a.getvalue()
    a.close()
    return contents

if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
    print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())
    print(Timer("meth4(string)", "from __main__ import meth4, string").timeit())

Results (in seconds):

meth1 = 7.731315963647944
meth2 = 9.609279402186985
meth3 = 0.26534052061106195
meth4 = 2.915035489152274

Community
  • I don't think there could ever be any use for wrapping a string in `StringIO` just to immediately convert it back to a string again and discard the `StringIO` object, especially if you care about runtime. – Nathan Jan 29 '19 at 20:05