
I am trying to perform multiple operations on each element of a large list (>10,000 elements). For example, my list L1 contains x, y, z coordinates:

L1 = [[1.23,4.55,5.66],[3.23,-8.55,3.66],[5.73,2.35,55.16]]

I wish to convert each element into a single string formed by concatenating the three coordinates, each rounded to one decimal place. So for the above list, I wish to create a new list L2:

L2 = ['1.24.65.7','3.2-8.63.7','5.72.455.2']

I tried the following two obvious approaches, a plain for loop and a list comprehension. Both took more than 8 minutes to run. I am posting this question to ask for a much faster approach.

# Method 1: plain for loop
final = []
for point in points:  # points is the full list of [x, y, z] coordinates
    x, y, z = point[0], point[1], point[2]
    final.append(str(round(x, 1)) + str(round(y, 1)) + str(round(z, 1)))

# Method 2: list comprehension
final = [str(round(i[0], 1)) + str(round(i[1], 1)) + str(round(i[2], 1))
         for i in points]

5 Answers


Maybe string formatting will be faster.

final = ["%.1f%.1f%.1f" % tuple(i) for i in points]

or f-strings:

final = [f"{x:.1f}{y:.1f}{z:.1f}" for x, y, z in points]
Barmar

You can also compare the methods using timeit. It is a good way to quickly get an idea of their relative efficiency.

Here is a quick summary comparing your code and the answers:

# Import timeit
import timeit

L1 = [[1.23, 4.55, 5.66], [3.23, -8.55, 3.66], [5.73, 2.35, 55.16]]

# Lengthen the list
L1 = L1 * 1000


def method_1(L1):
    def func():
        final = []
        for point in L1:
            x, y, z = point[0], point[1], point[2]

            final.append(str(round(x, 1))+str(round(y, 1))+str(round(z, 1)))
    return func


def method_2(L1):
    def func():
        final = [str(round(i[0], 1))+str(round(i[1], 1)) +
                 str(round(i[2], 1)) for i in L1]
    return func


def sol_1_1(L1):
    def func():
        final = ["%.1f%.1f%.1f" % (x, y, z) for x, y, z in L1]
    return func


def sol_1_2(L1):
    def func():
        final = [f"{x:.1f}{y:.1f}{z:.1f}" for x, y, z in L1]
    return func


def sol_2(L1):
    def func():
        final = [''.join(map(str, (round(e, 1) for e in l))) for l in L1]
    return func


t = timeit.Timer(method_1(L1))
print("Method 1: ", t.timeit(50))

t = timeit.Timer(method_2(L1))
print("Method 2: ", t.timeit(50))

t = timeit.Timer(sol_1_1(L1))
print("Answer 1_1: ", t.timeit(50))

t = timeit.Timer(sol_1_2(L1))
print("Answer 1_2: ", t.timeit(50))

t = timeit.Timer(sol_2(L1))
print("Answer 2: ", t.timeit(50))

Output:

Method 1:  0.5920865
Method 2:  0.6394685
Answer 1_1:  0.15333640000000015
Answer 1_2:  0.20070460000000034
Answer 2:  0.6677959000000002

So the results suggest that the solution provided by @Barmar is the fastest. Hope that helps!
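
If you want more stable numbers, the same callables can also be passed to timeit.repeat and the minimum taken. A minimal sketch, assuming the functions defined above are still in scope:

# Repeat each measurement and take the minimum, which is less sensitive
# to background load than a single run.
for name, factory in [("Method 1", method_1), ("Method 2", method_2),
                      ("Answer 1_1", sol_1_1), ("Answer 1_2", sol_1_2),
                      ("Answer 2", sol_2)]:
    best = min(timeit.repeat(factory(L1), number=50, repeat=3))
    print(name, ":", best)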

Alexandre B.

Just round the values, convert them to strings and then join them together:

>>> L1 = [[1.23,4.55,5.66],[3.23,-8.55,3.66],[5.73,2.35,55.16]]
>>> L2 = [''.join(map(str, (round(e, 1) for e in l))) for l in L1]
>>> print (L2)
['1.24.55.7', '3.2-8.63.7', '5.72.455.2']
Sunitha

Using itertools.starmap together with str.format is a little bit faster, but not by much.

from itertools import starmap
result = [*starmap(("{:.1f}"*3).format,l)]

The following is the test code for the different methods (run in a Jupyter notebook).

from itertools import starmap
l = [[1.23,4.55,5.66],[3.23,-8.55,3.66],[5.73,2.35,55.16]] * 10000

%timeit [''.join(map(str, (round(e, 1) for e in sl))) for sl in l]
%timeit ["%.1f%.1f%.1f" % tuple(i) for i in l]
%timeit [f"{x:.1f}{y:.1f}{z:.1f}" for x, y, z in l]
%timeit [("{:.1f}"*3).format(*i) for i in l]
%timeit f=("{:.1f}"*3).format;[f(*i) for i in l]
%timeit [*starmap(("{:.1f}"*3).format,l)]

Output:

116 ms ± 758 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
27.5 ms ± 190 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
37.6 ms ± 236 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
28 ms ± 379 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
27 ms ± 426 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
25.7 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
GZ0
  • you should use `list(starmap(...))` i bet it's faster than using unpacking `[*starmap(...)]` – juanpa.arrivillaga Aug 02 '19 at 23:57
  • Explicit `list()` calls are slightly slower than `[*...]` in all my experiments before (maybe partially due to the overhead of looking up the global `list` function). It is the same for this one. – GZ0 Aug 03 '19 at 00:02
  • Almost certainly not. in any case, you aren't using a list comprehension. – juanpa.arrivillaga Aug 03 '19 at 00:05
  • I mean unpacking `[*...]`, not list comprehensions. I corrected the comment above. Do you have a test case where unpacking is slower? – GZ0 Aug 03 '19 at 00:08
  • anyway, I'm getting `list` as a clear winner: `timeit.timeit('list(x for x in range(10000))', number=1000)` vs `timeit.timeit('[x for x in (y for y in range(10000))]', number=1000)`, but actually the list-literal unpacking seems to be about the same as or slightly faster `list`, `timeit.timeit('[*(y for y in range(10000))]', number=1000)` – juanpa.arrivillaga Aug 03 '19 at 00:09
  • The difference between unpacking and `list` calls is insignificant in most cases. You can only see it more clearly when the number of items is relatively small (e.g. `list(range(50))` vs. `[*range(50)]`). – GZ0 Aug 03 '19 at 00:19
  • If you look at the disassembled code of `[*a]` you will see the unpacking is only one instruction. For `list(a)` one extra instruction is needed to look up the `list` function. – GZ0 Aug 03 '19 at 00:22
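
The bytecode difference mentioned in the last comment can be checked with the dis module. A minimal sketch (exact instruction names vary between CPython versions):

import dis

# The literal form builds the list directly; the list(...) form needs an
# extra LOAD_NAME to look up the global `list` before the call.
dis.dis(compile("[*range(50)]", "<demo>", "eval"))
dis.dis(compile("list(range(50))", "<demo>", "eval"))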

Also, try pre-allocating the list:

final = [None] * len(points)

and then, instead of calling .append(...), assign each result at the relevant index, as in the sketch below.

PS: This is more of a hunch. It might or might not help you.
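
A minimal sketch of what that would look like, assuming points is the full coordinate list:

final = [None] * len(points)  # pre-allocate the result list
for i, (x, y, z) in enumerate(points):
    # write into the pre-allocated slot instead of appending
    final[i] = str(round(x, 1)) + str(round(y, 1)) + str(round(z, 1))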

UltraInstinct
  • I wonder if list comprehension is smart enough to preallocate the result list if there are no conditionals. – Barmar Aug 02 '19 at 22:54
  • I was thinking just that right now. Hopefully the OP makes the relevant changes and tests the theory. I'm looking forward to the results :) – UltraInstinct Aug 02 '19 at 22:55
  • https://stackoverflow.com/a/40018719/1491895 says that list comprehensions can't determine the size of the result. – Barmar Aug 02 '19 at 22:58
  • This is definitely not the bottleneck, even with .append growing a list is linear time in Python, and for something on the order 10s or even 100s of thousands of elements will be a tiny fraction of 8 minutes – juanpa.arrivillaga Aug 02 '19 at 23:50
  • @Barmar correct, the list comprehension uses `.append` underneath the hood, although, it is marginally faster because it doesn't have to resolve the `some_list.append` attribute, attribute resolution being expensive in python. in any case, pre-allocating a `list` in python is generally not a good way to squeeze more performance out, *on average*, `some_list.append(some_value)` and `some_list[i] = some_value` will not be much different in terms of time. – juanpa.arrivillaga Aug 02 '19 at 23:52
  • @Barmar's idea worked very well. Runtime reduced from 8 minutes to under 30 seconds. – pythonprotein Aug 03 '19 at 19:00