Given two arbitrary strings:

a = 'start'
b = ' end'

When concatenated, they produce 'start end'.

What is the fastest way to concatenate the two strings?

VoidTwo

1 Answer

Methods:

Method 1:

a += b

Method 2:

a + b

Method 3:

''.join((a, b))

Method 4:

'{0}{1}'.format(a, b)

Method 5:

f'{a}{b}'

Method 6:

'%s%s' % (a, b)
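
As a quick sanity check, the following sketch confirms that all six methods produce the same string. The `results` dictionary and its labels are illustrative names, not part of the original answer.

```python
a = 'start'
b = ' end'

# Method 1 rebinds its target, so work on a copy to leave `a` untouched.
c = a
c += b

# Map a label for each method to the string it produces.
results = {
    'a += b': c,
    'a + b': a + b,
    "''.join((a, b))": ''.join((a, b)),
    "'{0}{1}'.format(a, b)": '{0}{1}'.format(a, b),
    "f'{a}{b}'": f'{a}{b}',
    "'%s%s' % (a, b)": '%s%s' % (a, b),
}

for label, value in results.items():
    print(f'{label:<24} -> {value!r}')
```

Since the outputs are identical, the only difference worth measuring is speed.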

Test Program:

We can use the timeit module to time each method.

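import timeit

# Setup code runs once per timing run and defines the two operands.
setup: str = "a = 'start'\nb = ' end'"
# One statement per concatenation method, in the order listed above.
tests: tuple = (
    "a += b",
    "a + b",
    "''.join((a, b))",
    "'{0}{1}'.format(a, b)",
    "f'{a}{b}'",
    "'%s%s' % (a, b)",
)
number: int = 10000  # executions per timing run

print(f'Setup:\n{setup}')
times: list = []
for test in tests:
    print(f'\nTest: {test}')
    # timeit() returns the total time in seconds for all `number` executions.
    elapsed: float = timeit.Timer(test, setup=setup).timeit(number=number)
    times.append(str(elapsed))
    print(f'Time: {elapsed}')
print('\n\nTimes Side by Side:\n{}'.format('\n'.join(times)))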

With this example code, we run each method 10,000 times and print the total time, in seconds, that the 10,000 executions take.
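
`timeit` returns the total time in seconds for all `number` executions, so dividing by `number` gives a per-call figure. Here is a minimal sketch, assuming we also want to reduce noise by repeating each measurement and keeping the fastest run. The helper `per_call_seconds` and the `repeat=5` default are illustrative choices, not part of the original test program.

```python
import timeit

setup = "a = 'start'\nb = ' end'"
number = 10000


def per_call_seconds(stmt: str, repeat: int = 5) -> float:
    """Return the best per-execution time in seconds for `stmt`."""
    # Each entry in `totals` is the total time for `number` executions.
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    # The minimum is the run least disturbed by other system activity.
    return min(totals) / number


for stmt in ('a + b', "f'{a}{b}'"):
    print(f'{stmt:<12} {per_call_seconds(stmt) * 1e9:.1f} ns per call')
```

Taking the minimum rather than the mean follows the usual advice for micro-benchmarks: slower runs mostly reflect interference from other processes, not the statement itself.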

Results:

Setup:
a = 'start'
b = ' end'

Test: a += b
Time: 0.0009844999999999993

Test: a + b
Time: 0.00042200000000003346

Test: ''.join((a, b))
Time: 0.0008056000000000174

Test: '{0}{1}'.format(a, b)
Time: 0.002130199999999971

Test: f'{a}{b}'
Time: 0.00047549999999996206

Test: '%s%s' % (a, b)
Time: 0.0015849000000000002


Times Side by Side:
0.0009844999999999993
0.00042200000000003346
0.0008056000000000174
0.002130199999999971
0.00047549999999996206
0.0015849000000000002

In this run, Method 2 (a + b) was the fastest, with Method 5 (f'{a}{b}') a close second.

VoidTwo
  • Looks like the last ten or so digits in your values are from [floating-point inaccuracies](https://stackoverflow.com/q/588004/1364007). Your answer would benefit if some rounding were applied to those numbers. – Wai Ha Lee Nov 24 '20 at 05:56
  • @WaiHaLee I am aware of floats being inaccurate. But I didn't want to make it seem like I was modifying the results in any fashion. – VoidTwo Nov 24 '20 at 05:59
  • The precision of the timer in your computer means that those last digits are meaningless anyway - running the same program again I'm sure you will get very different values beyond the first three or four significant figures. – Wai Ha Lee Nov 24 '20 at 06:03
  • The timing on `a += b` is also not useful; unlike the others, it's changing `a`, so each subsequent concatenation is operating on a longer `a` (honestly, I'm surprised it's as fast as it is, given that). In practice, the CPython interpreter uses the same optimization for `a += b` as it does for `a = a + b` (it's a fragile optimization though, so changing it to `a = a + b + c` will not invoke it, while `a += b + c` or `a = a + (b + c)` will), so the performance, in identical use cases would be roughly identical. – ShadowRanger Nov 30 '20 at 15:06
  • Interestingly, I can't reproduce your timings. On CPython 3.9, for Linux x64, the `+=` actually wins (barely), even with 10,000 repetitions causing it to grow to a multi-KB string; the CPython optimization for string concatenation means most of the concatenations just expand the used allocation without copying, making it actually cost less overall for concatenating with reassignment than concatenating without reassigning the `a` (which prevents the concatenation optimization from kicking in). – ShadowRanger Nov 30 '20 at 15:17
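
The growth described in the comments above can be illustrated with a small sketch. Because `a += b` rebinds `a`, every execution inside one timing run works on a longer string than the last. The helper `length_after` and the statement `c = a; c += b` are illustrative, not part of the original answer. The second form keeps the operands the same size on every execution, at the cost of one extra assignment.

```python
import timeit


def length_after(number: int) -> int:
    """Return the length of `a` after `number` executions of `a += b`."""
    a = 'start'
    b = ' end'
    for _ in range(number):
        a += b
    return len(a)


print(length_after(10000))  # 5 + 4 * 10000 characters

setup = "a = 'start'\nb = ' end'"
# `a` grows on every execution within the timing run.
growing = timeit.timeit('a += b', setup=setup, number=10000)
# `c` is reset from `a` each time, so the operands never grow.
fixed = timeit.timeit('c = a; c += b', setup=setup, number=10000)
print(f'growing operand: {growing:.6f} s')
print(f'fixed operand:   {fixed:.6f} s')
```

Which of the two is faster may depend on the interpreter and platform, as the differing results reported in the comments suggest.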