2

I would like to split an int list l in two small lists l1, l2 (I know the split point n).

I am already able to perform the splitting by copying l2 elements in another list and then removing them from l, but this requires to have space for at least n + n/2 elements in memory and this is not affordable since l is big.

Does someone has a solution?

aretor
  • 2,379
  • 2
  • 22
  • 38

5 Answers5

1

You can slice the list with itertools.islice at n. The slice objects are lazy and are only loaded into memory when you iterate on them:

from itertools import islice

def split_list(lst, n):
    return islice(lst, n), islice(lst, n, None)

A, B = split_list(range(10), 5)

print(list(A))
# [0, 1, 2, 3, 4]
print(list(B))
# [5, 6, 7, 8, 9]
Moses Koledoye
  • 77,341
  • 8
  • 133
  • 139
1

If you do not want to spend additional memory on the smaller lists, you have two possibilities:

  1. Either you can destroy/reduce the original list as you create the smaller lists. You could use collections.deque, providing O(1) removal and insertion at both ends:

    >>> from collections import deque
    >>> deq = deque(range(20))
    >>> front = deque(deq.popleft() for _ in range(10))
    >>> front
    deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    >>> deq  # original list reduced, can be used as 2nd list
    deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
    
  2. Or you can create two views on parts of the smaller lists, meaning that the original list would be altered if the smaller lists are modified, and vice versa. For instance, use numpy.array for your numbers and create slices (being views) on that array:

    >>> import numpy as np
    >>> arr = np.array(range(20))
    >>> front = arr[:10]
    >>> back = arr[10:]
    >>> front[3] = 100
    >>> arr  # original list modified
    array([  0,   1,   2, 100,   4,   5,   6,   7,   8,   9,  10,  11,  12,
            13,  14,  15,  16,  17,  18,  19])
    

If you have to use plain Python list, you could also use list.pop. However, as explained in the documentation for deque, you should not use pop(0), as this will have to re-organize the entire list each time you pop an element, giving you O(n²) for extracting half of the list. Instead, use pop() to pop from the end of the list. To restore the original order, you could first pop into a temporary list, and then pop from that list, reversing it twice.

>>> lst = list(range(10))
>>> tmp = [lst.pop() for _ in range(5)]
>>> front, back = lst, [tmp.pop() for _ in range(len(tmp))]
>>> front, back
([0, 1, 2, 3, 4], [5, 6, 7, 8, 9])
tobias_k
  • 81,265
  • 12
  • 120
  • 179
0

How about trying this

class mylist(list):
    def split_by_position_n(self, n):
        return self[:int(n)], self[int(n):]

then, l = mylist(range(1,10))

l.split_by_position_n(4)
>>>([1, 2, 3, 4], [5, 6, 7, 8, 9])
Vatsal Parekh
  • 248
  • 4
  • 12
0

I came with a simple solution. What about

l1 = [l.pop(0) for i in range(n)]

It seems to work and it should not require additional memory.

aretor
  • 2,379
  • 2
  • 22
  • 38
  • This is similar to my first solution, but note that on regular lists, `pop(0)` is _very_ wasteful. You do not need additional memory, but with each `pop`, all the elements have to be repositioned. Better use a `deque`. In fact, the [documentation on `deque`](https://docs.python.org/3/library/collections.html#collections.deque) even mentions O(n) _memory_ cost for `pop(0)` on lists, which is exactly what you are trying to avoid. – tobias_k Feb 08 '17 at 20:00
  • (Misread the documentation, it's O(n) cost for moving memory, not O(n) memory cost, but still quadratic complexity for popping half the list, which is quite bad if the list is long.) – tobias_k Feb 08 '17 at 20:09
  • @tobias_k what about `pop()` only? Does it require O(n) cost for moving elements? I don't really need to keep the elements ordered. – aretor Feb 08 '17 at 20:57
  • `pop()` should not be as bad; it will probably still do some under-the-hood data reorganization from time to time as the list shrinks, and the elements in `li` will be reversed, but otherwise... to be sure you'll just have to try with your big list. Still, what's wrong with using a `deque`? – tobias_k Feb 08 '17 at 21:27
  • well `deque` is not `list` or a subclass of `list` I would be bound to use `deque` class without all the features of the `list` class, or do something like `deq = deque(l)` and then `l = list(deq)`. Would it be helpful to put my option as the 3rd choice in your answer, along with these considerations? – aretor Feb 09 '17 at 13:34
-1

Helo AreTor,

This is my solution:

list = [1,2,3,4,5,6,7,8,9]
l1 = list[0:n]
l2 = list[n:]
  • list is the list of elements.
  • n is the number of elements you want in the first list (the split point index).

list[0:n] will return the first n items of you list.

list[n:] will return the rest of the list.

Hope it helps you

EDIT: If you worry about memory problems, you can delete the list once you split it.. Even you can use list[0:n] / list[n:] in your code and you dont use more memory..

Ika8
  • 391
  • 1
  • 12