
I've been surprised by the results provided by %timeit for these two implementations:

def f1():                                                                       
    s = ''                                                                    
    for i in range(len(values)):                                                
        s += str(values[i][0])                                                
        s += '\t'                                                             
        s += str(values[i][1])                                    
        s += '\r\n'                                                           
    return s  

and

def f2():
    return ''.join((
        str(ts) + '\t' + str(v) + '\r\n'
        for ts, v in values
    ))

knowing that values is a list of approx. 2400 tuples. f1() is the original code I found in a script written by a colleague who was more used to C/C++ than to Python at the time he wrote it, and f2() is IMHO the more Pythonic style I would have written for the same processing.

I would have expected f2 to be a lot faster than f1, mainly because f1 performs a lot of concatenations and string re-allocations, but %timeit reports the same order of magnitude for both (approx. 18 ns), and more surprisingly, f1 sometimes comes out about 1 ns faster.

What could be the explanation for such a result?

[EDITED Jul 14] Fixed f1, where the built-in str was shadowed by a local variable of the same name. This error was not present in the profiled code, however.
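
For reference, here is a minimal sketch of the kind of setup I'm timing; the random integer pairs are only an illustration (an assumption), the real values come from the actual script:

import random

# Illustrative data only: ~2400 (timestamp, value) tuples of integers.
# The real `values` list is produced elsewhere in the original script.
values = [(random.randint(0, 100000), random.randint(0, 1000)) for _ in range(2400)]

# Timed in IPython with:
#   %timeit f1()
#   %timeit f2()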

Eric PASCUAL
  • @ayhan I did some searching on SO before posting but could not find answers directly related to this example. Could you please tell me which questions this one is a duplicate of, so that I could study their answers? – Eric PASCUAL Jul 13 '17 at 16:10
  • I believe [user2357112's answer](https://stackoverflow.com/a/34008199/2285236) explains the issue. If you think there is anything left unexplained, I can reopen the question. – ayhan Jul 13 '17 at 16:19
  • Thanks for the reference, but the conclusions there seem to advocate for `+=` being less efficient than `join`, which is not the behavior reported by my tests. This contradiction (and not the "+= vs. join") was the point of my question. So I'm not sure that my question is a duplicate of the one you refer to. – Eric PASCUAL Jul 13 '17 at 16:44
  • Doesn't that "implementation detail" explain the contradiction? It is normally O(n^2) but resizing the string in-place might make it O(n) which is the complexity of join? – ayhan Jul 13 '17 at 16:48
  • Anyway, I am not an expert so I'll leave the decision to others. – ayhan Jul 13 '17 at 16:49
  • It's also worth mentioning the first code bit is not valid python. `str` is reassigned at the top and from that point on `str()` is invalid. – TemporalWolf Jul 13 '17 at 17:27
  • 1
  • @TemporalWolf thanks for pointing me to this. My bad: wrong editing of the original code, from which I had removed extra stuff before inclusion in the post for simplicity's sake. Anyway, the function I profiled with %timeit does not include this error. – Eric PASCUAL Jul 14 '17 at 14:10

3 Answers


The f2 code is still bound by string concatenation due to

str(ts) + '\t' + str(v) + '\r\n'

The fact that it is worse than the original version, which is also based on string concatenation, is likely due to the implementation detail mentioned in the other question.
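
For context, the implementation detail in question is CPython-specific: when the string on the left-hand side of += has no other references, the interpreter may resize it in place instead of building a new string, which keeps repeated concatenation roughly linear instead of quadratic. A rough sketch of the effect (assuming CPython; other implementations need not do this at all):

import timeit

def concat_plain(n):
    s = ''
    for _ in range(n):
        s += 'x'      # single reference to s: CPython can often resize it in place
    return s

def concat_aliased(n):
    s = ''
    for _ in range(n):
        alias = s     # the extra reference defeats the in-place resize
        s += 'x'
    return s

# The aliased version is expected to slow down dramatically as n grows.
print(timeit.timeit(lambda: concat_plain(20000), number=10))
print(timeit.timeit(lambda: concat_aliased(20000), number=10))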

If you change the inner concatenations to also use join, you should get better performance.

def f2(values):
    return '\r\n'.join(
        '\t'.join([str(ts), str(v)])
        for ts, v in values
    )
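
For illustration, a quick check with a tiny made-up values list (note that, unlike the original f1/f2, the outer join does not append a trailing '\r\n' after the last row):

values = [(1499900000, 42), (1499900060, 17)]   # hypothetical sample data
print(repr(f2(values)))
# '1499900000\t42\r\n1499900060\t17'
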
danny
  • The inner call to `join` is overkill for such a short list; just use `'{}\t{}'.format(ts, v)`, or overall `return '\r\n'.join(['{}\t{}'.format(*p) for p in values])`. – chepner Jul 13 '17 at 17:20
  • Personally, I'd probably use `return "\r\n".join("%s\t%s" % (a, b) for (a, b) in values)` as it's a bit more readable to me, but it's essentially the same form as @chepner's and should be comparable from a big-O POV. It does also error (`ValueError: too many values to unpack`) if an incorrectly sized tuple is used, whereas `format(*p) for p in values` will ignore the additional terms. – TemporalWolf Jul 13 '17 at 17:38
  • Using string interpolation gives results about 5% slower than brute-force concatenation. The format method version is more or less the same speed as concatenation. However, my problem is not about building the inner items, but about building the overall result: the join approach does not give the expected performance benefit over the "accumulated" concatenations. I'm surely missing something obvious here. – Eric PASCUAL Jul 14 '17 at 14:19

I can be fairly confident your test methodology is invalid, as can be demonstrated on repl.it for both Py2.7 and Py3. It's the same code, as shown below, but the results vary:

`f1` is your f1 function
`f2` is your f2 function
`f3` is your f2 function using C-style `%` string formatting
`f4` is your f2 function using `.format()`

Results:

Python 2.7.10 (default, Jul 14 2015, 19:46:27)
[GCC 4.8.2] on linux

f1: 1.67547893524
f2: 1.33767485619
f3: 0.72606086731
f4: 1.32540607452

There are some differences, but in no case does f1 outperform any of the other methods.

Python 3.6.1 (default, Dec 2015, 13:05:11)
[GCC 4.8.2] on linux

f1: 3.0050943629757967
f2: 2.016791722999187
f3: 0.9476796620001551
f4: 1.9396837950043846

In both cases, C-style string formatting is more than twice as fast.

The functions used:

def f1():
    s = ''
    for i in range(len(values)):
        s += str(values[i][0])
        s += '\t'
        s += str(values[i][1])
        s += '\r\n'
    return s

def f2():    
    return ''.join((                        
        str(ts) + '\t' + str(v) + '\r\n'                
        for ts, v in values))

def f3():           
    return ''.join((
        "%s\t%s\r\n" % (ts, v)  
        for ts, v in values))

def f4():
    return ''.join((
        "{}\t{}\r\n".format(ts, v)
        for ts, v in values))
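
The timing harness itself isn't shown above; a sketch along the following lines (the data generation and repetition count are assumptions, not the exact setup used) reproduces the relative ordering:

import random
import timeit

# Hypothetical test data, mirroring the ~2400 tuples mentioned in the question.
values = [(random.randint(0, 100000), random.randint(0, 1000)) for _ in range(2400)]

for func in (f1, f2, f3, f4):
    # timeit accepts a callable; each figure is the total time for 1000 calls.
    print(func.__name__, timeit.timeit(func, number=1000))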

Interestingly, by making a small change to your f1 function, we can attain a decent speed-up by exploiting the implementation detail referenced by danny:

def f1opt():
    s = ''
    for i in range(len(values)):
        s += str(values[i][0]) + '\t' + str(values[i][1]) + '\r\n'
    return s

yields

Python 2.7.10 (default, Jul 14 2015, 19:46:27)
[GCC 4.8.2] on linux

f1()         1.68486714363
f1opt()      0.999644994736
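
To see what actually changes per iteration, the dis module can be used (just a sketch; opcode names vary across Python versions):

import dis

# f1 performs four INPLACE_ADD operations on the accumulator per iteration,
# whereas f1opt builds each row with intermediate BINARY_ADDs and does a single
# INPLACE_ADD on the accumulator (on Python 3.11+ both show up as BINARY_OP).
dis.dis(f1)
dis.dis(f1opt)
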
TemporalWolf
  • Thanks TemporalWolf for taking the time to investigate on your side. Two questions: 1/ why are the backslashes doubled in your code? Doesn't this lead to the literal '\t' (for instance) being added to the string instead of the tab character, as expected? 2/ I don't understand the logic of f1opt: it seems to append either one part of the tuple or the other, depending on whether we are on an odd or even item, but not both as intended. Maybe this explains the gain in performance. – Eric PASCUAL Jul 14 '17 at 21:45
  • Understood the double '\\': it comes from the fact that the code excerpt was extracted from the string version of the functions' source, and not from a script. – Eric PASCUAL Jul 14 '17 at 21:54
  • @EricPASCUAL Yeah whoops, I'll pull those out :) – TemporalWolf Jul 14 '17 at 21:58
  • @EricPASCUAL I just tested it again in replit on [Py2.7](https://repl.it/J2db/1) & [Py3](https://repl.it/J2dg/0). Both work as expected. As the `if/else` [binds weakest](https://docs.python.org/2/reference/expressions.html#operator-precedence), you don't need any parentheses for it to figure it out. – TemporalWolf Jul 14 '17 at 22:01
  • There is something strange WRT f1 and f1opt: when I execute them, they don't give the same result, with f1opt "skipping" parts as the logic suggests. First chars of the result for f1(): 58059 153 57053 515 67898 879 72154 327; same for f1opt(): 58059 515 67898 327 95661 863 69801 548 – Eric PASCUAL Jul 15 '17 at 21:07
    "I can be fairly confident your test methodology is invalid" : and you are right of course. I'm feeling a bit stupid now because the problem was that when working in iPython, I was not timing the various implementations, but instead... an exception raise caused by test data not being generated correctly. I'm really sorry for having polluted SO with a useless question. Just hoping the answers given by contributors could have brought some info to other readers WRT the performances of various code patterns. Thanks everybody for your patience. – Eric PASCUAL Jul 16 '17 at 21:43

Since the observed results were a bit surprising, I redid the profiling differently, using the following script:

import random
import timeit

data = [(random.randint(0, 100000), random.randint(0, 1000)) for _ in range(0, 2500)]

def f1():
    return ''.join(('{}\t{}\r\n'.format(ts, v) for ts, v in data))

def f2():
    s = ''
    for i in range(len(data)):
        s += str(data[i][0])
        s += '\t'
        s += str(data[i][1])
        s += '\r\n'

    return s


if __name__ == '__main__':
    repeat = 10000

    for f in ['f1', 'f2']:
        t = timeit.timeit(
            '%s()' % f, number=repeat, setup="from __main__ import %s" % f
        )
        print(
            "%s : avg time per loop = %f ms" % (f, t * 1000 / repeat)
        )

The output is now:

    f1 : avg time per loop = 0.779966 ms
    f2 : avg time per loop = 1.144340 ms

Which is more in line with the expected results.

I'll investigate further to understand the difference in behaviour between the two tests.
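
As a guard against the methodology problem mentioned in the comments (accidentally timing code that raises), it may also help to run and compare the implementations once outside the timer, e.g.:

# A sketch of a sanity check: call each implementation once before profiling,
# so the timing measures real work rather than an early exception.
reference = f1()
assert f2() == reference, "implementations disagree"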

Eric PASCUAL
  • As detailed in a comment on the previous answer, the problem was a bug in the testing method used when experimenting with the algorithms in IPython. My bad :( – Eric PASCUAL Jul 16 '17 at 21:54