3

What would be "better", making multiple calls to write on file with small strings or building a string to make one call to write on file at the end?

# multiple small strings
with open('example1.txt', 'w') as f:
    for i in some_data:
        s = some_function(i)
        f.write(s)
        if some_condition(i):
            f.write(str(i))
        f.write('\n')

# one long string
res = ''
for i in some_data:
    s = some_function(i)
    res += s
    if some_condition(i):
        res += str(i)
    res += '\n'
with open('example2.txt', 'w') as f:        
    f.write(res)

Assuming the chunks of data i are float.

izxle
  • 386
  • 1
  • 3
  • 19
  • How are you measuring "better"? – Scott Hunter May 15 '17 at 23:09
  • 1
    Furthermore, string concatenation with `+` performance [depends on Python version](http://stackoverflow.com/a/24718551/1270789), it seems. My question is, though, "Does it really matter?" Personally, I'd write out one string at a time until it was proven that this was an issue. – Ken Y-N May 15 '17 at 23:28

1 Answers1

0

If you're asking which method is "better" I would answer with these questions:

  1. Which is easier to read?
  2. Which is easier to write?
  3. Which is easier to maintain?

Very very very rarely, you will come across a problem that needs to have another question answered:

  1. Which is more performant?

This is less than 1% of the time. Do not assume that this is your situation unless you have a very very VERY good reason to believe it is. In which case, it still probably isn't your situation. On the offhand chance that it is your situation, you're going to need to run some tests. There are a lot of things that can impact your answer which can include the amount of memory you have, the speed of your memory, the speed of your CPU, the size of your CPU cache, the size of your L2 cache, the performance of your hard drive. Two different machines could very likely perform differently for these two different approaches.

You will need to run tests by implementing both versions and running it through hundreds, thousands, millions of iterations (depending on your situation)

You could also have a situation where you are writing a very large file. If your text file is 15GB and you only have 4GB of RAM on your device, it doesn't matter how much faster it might be to build up the string in memory first, you will need to stream it out line by line (or chunk by chunk) to the disk.

In general, go for the simple solution whenever possible. Don't waste your time asking yourself if it might be a bit more efficient one way vs the other.

And of course, when in doubt, you can always run import this

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
ajb
  • 479
  • 5
  • 8