I've got the following brute-force approach that lets me iterate over points:

# [x1, y1, x2, y2, ..., xn, yn]
coords = [1, 1, 2, 2, 3, 3]
# The goal is to operate with (x, y) within for loop
for (x, y) in zip(coords[::2], coords[1::2]):
    # do something with (x, y) as a point
    print(x, y)

Is there a more concise / efficient way to do it?

Alex
  • 1
    When you say *"brute force"* are you thinking there might be a way to achieve your aims *without* iterating over all of the pairs? What's the *context*? Is there a specific problem with the code you've posted? In terms of efficiency, [`itertools.islice`](https://docs.python.org/3/library/itertools.html#itertools.islice) might be better than creating two new lists, but that's maybe *less* concise. – jonrsharpe Apr 03 '19 at 08:49

2 Answers

(I use items in place of coords below.)

Short Answer

If you want your items grouped in pairs, then

zip(items[::2], items[1::2])

is one of the best compromises between speed and clarity. If you can afford an extra line, you can be somewhat more efficient (a lot more, for larger inputs) by using a single shared iterator:

it = iter(items)
zip(it, it)
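
Applied to the loop from the question, this might look like the sketch below. The `points` list is not part of the original code; it is only there to make the result visible.

```python
coords = [1, 1, 2, 2, 3, 3]

# A single iterator is passed to zip() twice, so each tuple
# consumes two consecutive items from the list.
it = iter(coords)
points = []  # hypothetical accumulator, used only for illustration
for x, y in zip(it, it):
    points.append((x, y))

print(points)  # [(1, 1), (2, 2), (3, 3)]
```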

Long Answer

(EDIT: added a method that avoids zip())

You could achieve this in a number of ways. For convenience, I write them as functions so that they can be benchmarked. I also leave the size of the group as a parameter n (which, in your case, is 2).

import itertools


def grouping1(items, n=2):
    return zip(*tuple(items[i::n] for i in range(n)))


def grouping2(items, n=2):
    return zip(*tuple(itertools.islice(items, i, None, n) for i in range(n)))


def grouping3(items, n=2):
    # j indexes the groups, so the j-th group starts at item j * n
    for j in range(len(items) // n):
        yield items[j * n:(j + 1) * n]


def grouping4(items, n=2):
    return zip(*([iter(items)] * n))


def grouping5(items, n=2):
    it = iter(items) 
    while True: 
        result = [] 
        for _ in range(n): 
            try: 
                tmp = next(it) 
            except StopIteration: 
                break 
            else: 
                result.append(tmp) 
        if len(result) == n: 
            yield result 
        else: 
            break

Benchmarking these with a relatively short list gives:

short = list(range(10))

%timeit [x for x in grouping1(short)]
# 1.33 µs ± 9.82 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit [x for x in grouping2(short)]
# 1.51 µs ± 16.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit [x for x in grouping3(short)]
# 1.14 µs ± 28.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit [x for x in grouping4(short)]
# 639 ns ± 7.56 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit [x for x in grouping5(short)]
# 3.37 µs ± 16.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

For medium-sized inputs:

medium = list(range(1000))

%timeit [x for x in grouping1(medium)]
# 21.9 µs ± 466 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit [x for x in grouping2(medium)]
# 25.2 µs ± 257 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit [x for x in grouping3(medium)]
# 65.6 µs ± 233 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit [x for x in grouping4(medium)]
# 18.3 µs ± 114 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit [x for x in grouping5(medium)]
# 257 µs ± 2.88 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

For larger inputs:

large = list(range(1000000))

%timeit [x for x in grouping1(large)]
# 49.7 ms ± 840 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit [x for x in grouping2(large)]
# 37.5 ms ± 42.4 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit [x for x in grouping3(large)]
# 84.4 ms ± 736 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit [x for x in grouping4(large)]
# 31.6 ms ± 85.7 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit [x for x in grouping5(large)]
# 274 ms ± 2.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
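
The `%timeit` lines above are IPython magics. Outside IPython, a rough equivalent with the standard `timeit` module could look like the sketch below. Only grouping4() is repeated here to keep it self-contained, and the `number` and `repeat` values are arbitrary choices of mine, so the absolute numbers will differ from those above.

```python
import timeit


def grouping4(items, n=2):
    return zip(*([iter(items)] * n))


medium = list(range(1000))

# Run the call 1000 times per measurement, 5 measurements in total.
runs = timeit.repeat(lambda: list(grouping4(medium)), number=1000, repeat=5)
best = min(runs) / 1000  # seconds per call in the fastest measurement
print(f"grouping4: {best * 1e6:.1f} us per call")
```

Taking the minimum of the repeats is the usual way to reduce noise from other processes.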

In terms of efficiency, grouping4() is the fastest at every size tested. The runner-up depends on the size of the input: grouping3() for the short list, grouping1() for the medium one, and grouping2() for the large one.

In your case, grouping1() (which is the slicing you already use) seems a good compromise between speed and clarity, unless you are willing to wrap the iterator-based version in a function.

Note that grouping4() relies on passing the same iterator object to zip() multiple times, so:

zip(iter(items), iter(items))

would NOT work.
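
To see why, compare the two calls on a small list (a minimal illustration; the variable names are mine). With two separate iterators, each one starts from the beginning, so every item ends up paired with itself.

```python
items = [1, 10, 2, 20, 3, 30]

# Two independent iterators: both walk the whole list in lockstep.
separate = list(zip(iter(items), iter(items)))
print(separate)
# [(1, 1), (10, 10), (2, 2), (20, 20), (3, 3), (30, 30)]

# One shared iterator: zip() pulls from it twice per tuple.
it = iter(items)
shared = list(zip(it, it))
print(shared)
# [(1, 10), (2, 20), (3, 30)]
```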

If you want more control over uneven grouping, i.e. when len(items) is not divisible by n, you could replace zip() with itertools.zip_longest() from the standard library, which pads the last group with a fill value instead of dropping it.
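
As a rough sketch of the difference, here is a hypothetical `grouping_longest()` variant of grouping4() (the name and the `fillvalue` parameter are my own additions) next to the plain zip() version on a list of odd length:

```python
import itertools


def grouping_longest(items, n=2, fillvalue=None):
    # Hypothetical variant of grouping4() that pads the last group
    # with fillvalue instead of dropping it.
    return itertools.zip_longest(*([iter(items)] * n), fillvalue=fillvalue)


odd = [1, 1, 2, 2, 3]

truncated = list(zip(*([iter(odd)] * 2)))
padded = list(grouping_longest(odd))

print(truncated)  # [(1, 1), (2, 2)]
print(padded)     # [(1, 1), (2, 2), (3, None)]
```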

Note also that grouping4() is essentially the grouper() recipe from the official itertools documentation.

norok2

You can use iter(object) and next(iterator, default) with a known default to leave your loop:

coords = [1, 1, 2, 2, 3, 3]
it = iter(coords)

while True:  # an iterator object is always truthy, so the loop ends via break
    x = next(it, None)
    y = next(it, None)
    if x is None or y is None:
        break

    # do something with your pairs
    print(x, y)

Output:

1 1
2 2
3 3
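
One caveat with None as the default: a list that legitimately contains None would end the loop early. A possible variation, assuming that matters for your data, is a unique sentinel object. The `_MISSING` and `pairs` names are made up for this sketch.

```python
coords = [1, None, 2, 2, 3, 3]  # None is a real value here
_MISSING = object()  # unique sentinel that cannot appear in the data
pairs = []

it = iter(coords)
while True:
    x = next(it, _MISSING)
    y = next(it, _MISSING)
    # Stop as soon as the iterator runs out, even mid-pair.
    if x is _MISSING or y is _MISSING:
        break
    pairs.append((x, y))

print(pairs)  # [(1, None), (2, 2), (3, 3)]
```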
Patrick Artner
  • 1
    ...and is this more concise and/or more efficient? – norok2 Apr 03 '19 at 10:30
  • @norok It is more resource-friendly: you iterate over the same list and do not create two intermediate lists just to zip them, so it _is_ better in that regard. This might be important if your lists are rather big (100k+) or the list items are rather expensive to construct/copy (not the case for integers) – Patrick Artner Apr 03 '19 at 14:25
  • I do not think that `zip()` itself creates / requires two intermediate lists... also: slicing will not create copies of the list, see [here](https://stackoverflow.com/questions/5131538/slicing-a-list-in-python-without-generating-a-copy) – norok2 Apr 03 '19 at 14:44
  • @norok slicing will make copies of the references to the integers, which take _8 bytes on a 64-bit machine_ (from your link). I am not sure how big an iterator is, but it is essentially a pointer, so it should be far smaller than all those copied references. – Patrick Artner Apr 03 '19 at 14:52
  • 1
    yes, but then a zip of the same iterator is much better than this approach with explicit and slow looping... see `grouping5` from my answer – norok2 Apr 03 '19 at 17:08