
I have a big array with many subarrays inside and am trying to join all of the inner ones into one. I know how to concatenate arrays, but since the number of inner arrays varies, I don't know how to write one function that concatenates them all. I understand that I'll need one or possibly more loops, but I'm not sure how to do it. So far I have been doing it manually like this, continuing until I reach the last index:

ldata = ldata[0] + ldata[1] + ldata[2] + ldata[3] + ldata[4]

where ldata is the outer list and each index holds one of the inner lists. How do I do this?

Edit: Here is an example

a = [[1,2],[3,4],[5,6]]
Athreya Daniel
  • Could you add an example? – Dani Mesejo Nov 09 '18 at 00:25
  • Look at `reduce()` or `functools.reduce()` (depending on Python version) – Michael Butscher Nov 09 '18 at 00:26
  • Thanks for all the answers! I would accept more if I could but Hochl's one was the simplest and I saw it first. – Athreya Daniel Nov 09 '18 at 00:42
  • @AthreyaDaniel: The superficial simplicity of `sum` hides a quadratic amount of copying happening under the hood. You say you have "a big array with many subarrays inside"; with a large amount of subarrays, you could literally spend hours to weeks waiting for `sum` when a more efficient implementation would finish in under a second. – user2357112 Nov 09 '18 at 00:55
  • I suggest you accept the other answer instead of mine. – hochl Nov 09 '18 at 11:19

4 Answers

3

You could use chain.from_iterable:

from itertools import chain

a = [[0, 1], [2, 3], [4, 5], [6, 7]]
result = list(chain.from_iterable(a))

print(result)

Output

[0, 1, 2, 3, 4, 5, 6, 7]
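Since the question says the number of inner lists varies, here is a small sketch of the same call on uneven input. The sample values in `ldata` and `nested` are made up for illustration; the point is that `chain.from_iterable` does not care how many inner lists there are or how long each one is, and that it only flattens one level.

```python
from itertools import chain

# Hypothetical sample data: inner lists of different lengths, one of them empty.
ldata = [[1, 2], [], [3], [4, 5, 6]]

# chain.from_iterable walks the inner lists one after another, lazily;
# list() collects the items into a new list.
flat = list(chain.from_iterable(ldata))
print(flat)  # [1, 2, 3, 4, 5, 6]

# Only one level is flattened: deeper nesting is left as it is.
nested = [[1, [2, 3]], [4]]
one_level = list(chain.from_iterable(nested))
print(one_level)  # [1, [2, 3], 4]
```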
Dani Mesejo
1

You can grab each sublist and add to a new list.

new_ldata = []
for sublist in ldata:
    new_ldata += sublist
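The question asks for one function, so the loop above could be wrapped up roughly like this. The name `flatten` is just a placeholder I picked, and `extend` is used as the method spelling of `+=` on a list.

```python
def flatten(list_of_lists):
    """Return a new list with the items of every inner list, in order."""
    result = []
    for sublist in list_of_lists:
        # extend() appends in place; same effect as `result += sublist`
        result.extend(sublist)
    return result

ldata = [[1, 2], [3, 4], [5, 6]]
print(flatten(ldata))  # [1, 2, 3, 4, 5, 6]
```

Because the loop grows a single result list in place, the outer list is left unchanged and no intermediate lists are built.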
0

If your lists are not too long, keep it simple:

>>> a
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> sum(a, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]

I made some timing measurements:

>>> import timeit
>>> timeit.timeit('sum([[1,2,3],[4,5,6],[7,8,9]], [])')
6.547808872535825
>>> timeit.timeit('reduce(lambda a, c: a + c, [[1,2,3],[4,5,6],[7,8,9]], [])', setup="from functools import reduce")
10.435796303674579

The more lists there are and the longer they get, the better the solution with chain performs compared to sum:

import itertools
import timeit

# 30 inner lists of 20 items each
a = [list(range(20)) for x in range(30)]

def test_sum():
    return sum(a, [])

def test_chain():
    return list(itertools.chain.from_iterable(a))

def test_add():
    result = []
    for i in a:
        result += i
    return result

def test_list_comprehension():
    return [x for l in a for x in l]

# timeit.timeit calls each function 1,000,000 times by default
print(timeit.timeit(test_sum), timeit.timeit(test_chain), timeit.timeit(test_add), timeit.timeit(test_list_comprehension))

yields

18.778313734044787 7.5882537689758465 2.5082976589910686 13.912770285038278

In this measurement the short function that adds up the lists with += was actually the fastest, chain came second, and sum was the slowest.
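To get a feel for why sum falls behind, here is a rough back-of-the-envelope model rather than a measurement. It assumes that each step of `sum(a, [])` evaluates `accumulated + sublist`, which builds a new list and copies the elements of both operands. The helper `copies_made_by_sum` is a made-up name for this sketch.

```python
from itertools import chain

def copies_made_by_sum(list_of_lists):
    """Estimate the element copies performed by sum(list_of_lists, []).

    Model: every step creates a new list holding everything accumulated
    so far plus the current sublist, so all of those elements are copied.
    """
    copies = 0
    acc_len = 0
    for sublist in list_of_lists:
        acc_len += len(sublist)
        copies += acc_len  # size of the new list built in this step
    return copies

a = [list(range(20)) for _ in range(30)]
total = sum(len(sub) for sub in a)
print(total)                  # 600 elements in the final result
print(copies_made_by_sum(a))  # 9300 element copies along the way

# Both approaches produce the same list; only the amount of work differs.
assert sum(a, []) == list(chain.from_iterable(a))
```

Under this model the number of copies grows with the square of the number of inner lists, while a loop with += or chain touches each element roughly once.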

hochl
  • Nice! Although the documentation states "To concatenate a series of iterables, consider using itertools.chain()" (https://docs.python.org/3/library/functions.html#sum) – slider Nov 09 '18 at 00:35
  • Thank you, this was a simple and effective fix! – Athreya Daniel Nov 09 '18 at 00:37
  • "why use a module if the builtin works equally well" - because it doesn't. Try to join a million lists with `itertools.chain`, and it won't even take a second. Try to do that with `sum`, and you'll have to come back in a few hours. – user2357112 Nov 09 '18 at 00:51
  • Also see https://stackoverflow.com/questions/41772054/why-sum-on-lists-is-sometimes-faster-than-itertools-chain – user2357112 Nov 09 '18 at 00:52
  • actually it kicks in much sooner, I didn't anticipate that sum was so bad ... – hochl Nov 09 '18 at 11:18
0

You can use NumPy's concatenate for this:

import numpy as np
x = [[1,1],[2,2,2],[3],[4,4,4,4]]
concated_x = np.concatenate(x) # now in numpy array form
concated_x = concated_x.tolist() # if you want it back as a list of plain Python ints
James Fulton