110

With SQLite, a select .. from command returns its results in output, which prints:

>>print output
[(12.2817, 12.2817), (0, 0), (8.52, 8.52)]

It seems to be a list of tuples. I would like to either convert output to a simple list:

[12.2817, 12.2817, 0, 0, 8.52, 8.52]

or a 3x2 matrix:

12.2817  12.2817
0        0
8.52     8.52

to be read via output[i][j]

The flatten command does not do the job for the 1st option, and I have no idea for the second one...

A fast solution would be appreciated, as the real data is much bigger.

Neuron
garth
  • 2
    `[(12.2817, 12.2817), (0, 0), (8.52, 8.52)]` is already a 3x2 matrix !? or did i miss something ? – mouad May 17 '12 at 09:24
  • See [this question](http://stackoverflow.com/questions/2158395/flatten-an-irregular-list-of-lists-in-python) – Joel Cornett May 17 '12 at 09:26
  • 1
    for the flatten function check itertools module recipes there is already a flatten function example: http://docs.python.org/library/itertools.html#recipes – mouad May 17 '12 at 09:30
  • 4
    [`[item for sublist in output for item in sublist]`](http://stackoverflow.com/a/952952/1243951) works perfectly and has the advantage that your inner tuples could also be lists; more generally any combination of inner and outer iterable works – Kyss Tao May 05 '13 at 18:59

11 Answers

155

By far the fastest (and shortest) solution posted:

list(sum(output, ()))

About 50% faster than the itertools solution, and about 70% faster than the map solution.
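To see why the empty tuple matters: sum starts from its second argument (default 0) and adds each element in turn, so passing () turns the call into repeated tuple concatenation. A minimal sketch using the sample data from the question (the names flat, manual and error_message are illustrative only):

```python
output = [(12.2817, 12.2817), (0, 0), (8.52, 8.52)]

# sum(output, ()) evaluates () + output[0] + output[1] + output[2]
flat = list(sum(output, ()))
manual = list(() + output[0] + output[1] + output[2])

print(flat)  # [12.2817, 12.2817, 0, 0, 8.52, 8.52]

# Without the start value, sum begins at 0 and fails on 0 + tuple
try:
    sum(output)
except TypeError as exc:
    error_message = str(exc)
```

Each + builds a new tuple, so the total work grows quadratically with the number of rows. That is fine for short lists but worth keeping in mind for large result sets.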

Joel Cornett
  • 11
    @Joel nice, but I wonder how it works? `list(output[0]+output[1]+output[2])` gives the desired result but `list(sum(output))` not. Why? What "magic" does the () do? – Kyss Tao May 05 '13 at 18:39
  • 10
    OK, I should have read the manual *g*. It seems that `sum(sequence[, start])` adds `start`, which defaults to `0`, rather than just starting from `sequence[0]` if it exists and then adding the rest of the elements. Sorry for bothering you. – Kyss Tao May 05 '13 at 18:44
  • 4
    This is a well-known anti-pattern: don't use `sum` to concatenate sequences, it results in a quadratic time algorithm. Indeed, the `sum` function will complain if you try to do this with strings! – juanpa.arrivillaga Apr 11 '18 at 22:42
  • @juanpa.arrivillaga: agreed. There are very few use cases where this would be preferable. – Joel Cornett Apr 11 '18 at 23:19
  • 18
    Yes, fast but completely obtuse. You'd have to leave a comment as to what it is actually doing :( – CpILL May 22 '18 at 21:30
  • See my answer for a comparison of this to other techniques which are faster and more Pythonic IMO. – Gman Feb 11 '19 at 21:52
72

List comprehension approach that works with Iterable types and is faster than other methods shown here.

flattened = [item for sublist in l for item in sublist]

l is the list to flatten (called output in the OP's case)
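The order of the for clauses can look backwards at first. It may help to read the comprehension as the nested loops it stands for, written in the same left-to-right order. A small sketch (the sample list and the name expanded are made up for illustration):

```python
l = [(1, 2), (3, 4), (5, 6)]  # illustrative data

# Comprehension form: outer loop first, inner loop second
flattened = [item for sublist in l for item in sublist]

# Equivalent explicit loops, in the same order as the for clauses
expanded = []
for sublist in l:
    for item in sublist:
        expanded.append(item)

print(flattened)  # [1, 2, 3, 4, 5, 6]
```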


timeit tests:

l = list(zip(range(99), range(99)))  # list of tuples to flatten

List comprehension

[item for sublist in l for item in sublist]

timeit result = 7.67 µs ± 129 ns per loop

List extend() method

flattened = []
list(flattened.extend(item) for item in l)

timeit result = 11 µs ± 433 ns per loop

sum()

list(sum(l, ()))

timeit result = 24.2 µs ± 269 ns per loop
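These figures depend on the machine and the Python version. A rough sketch of how such a comparison could be reproduced with the standard timeit module follows. The helper names and the repetition count are arbitrary choices, and with_extend uses a plain loop rather than the generator form timed above, so it is not an exact replay of the original measurements:

```python
import timeit

l = list(zip(range(99), range(99)))  # same test data as above

def with_comprehension():
    return [item for sublist in l for item in sublist]

def with_extend():
    flattened = []
    for item in l:
        flattened.extend(item)
    return flattened

def with_sum():
    return list(sum(l, ()))

# All three approaches must agree before their timings are compared
assert with_comprehension() == with_extend() == with_sum()

for func in (with_comprehension, with_extend, with_sum):
    seconds = timeit.timeit(func, number=10_000)
    print(f"{func.__name__}: {seconds / 10_000 * 1e6:.2f} µs per call")
```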

Gman
  • 2
    I had to use this on a large dataset, the list comprehension method was by far the fastest! – nbeuchat Aug 16 '18 at 06:36
  • I did a little change to the .extend solution and now performs a bit better. check it out on your timeit to compare – Totoro Nov 20 '18 at 17:00
  • this is very confusing and I don't understand the syntax here at all. General syntax for list comprehension is `expression for item in list` like `x*2 for x in listONumbers`. So for flattening you would expect an expression like `num for num in sublist for sublist in list` not `num for sublist in list for num in sublist`. how is in the comprehension broken down here? – Cheruvim Mar 10 '23 at 01:06
31

In Python 2.7 and all versions of Python 3, you can use itertools.chain to flatten a list of iterables, either with the * unpacking syntax or with the chain.from_iterable class method.

>>> t = [ (1,2), (3,4), (5,6) ]
>>> t
[(1, 2), (3, 4), (5, 6)]
>>> import itertools
>>> list(itertools.chain(*t))
[1, 2, 3, 4, 5, 6]
>>> list(itertools.chain.from_iterable(t))
[1, 2, 3, 4, 5, 6]
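One practical difference between the two forms: chain(*t) has to unpack all of t before the call, while chain.from_iterable(t) pulls items from t lazily, so it can start yielding before the source is exhausted. A small sketch (the rows generator is a made-up stand-in for any lazy source):

```python
import itertools

def rows():
    # Hypothetical lazy source of tuples, e.g. rows streamed one at a time
    for i in range(3):
        yield (i, i * 10)

flat = list(itertools.chain.from_iterable(rows()))
print(flat)  # [0, 0, 1, 10, 2, 20]
```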
Thruston
16

Update: Flattening using extend but without comprehension and without using list as iterator (fastest)

After seeing the next answer, which gives a faster solution via a list comprehension with two for clauses, I made a small tweak and it now performs better. First, the call to list(...) was taking a large share of the time; then, replacing the comprehension with a plain loop shaved off a bit more.

The new solution is:

l = []
for row in output: l.extend(row)

The old one, with list(...) replaced by [...] (a bit slower, but not by much):

[l.extend(row) for row in output]

Older (slower):

Flattening with list comprehension

l = []
list(l.extend(row) for row in output)

Some timeit runs for the new extend loop, and the improvement gained just by swapping list(...) for [...]:

import timeit
t = timeit.timeit
o = "output=list(zip(range(10000000), range(10000000))); l=[]"  # zip stops at the shorter range, so both are set to 10 million
steps_ext = "for row in output: l.extend(row)"
steps_ext_old = "list(l.extend(row) for row in output)"
steps_ext_remove_list = "[l.extend(row) for row in output]"
steps_com = "[item for sublist in output for item in sublist]"

print(f"{steps_ext}\n>>>{t(steps_ext, setup=o, number=10)}")
print(f"{steps_ext_remove_list}\n>>>{t(steps_ext_remove_list, setup=o, number=10)}")
print(f"{steps_com}\n>>>{t(steps_com, setup=o, number=10)}")
print(f"{steps_ext_old}\n>>>{t(steps_ext_old, setup=o, number=10)}")

timeit results:

for row in output: l.extend(row)                  
>>> 7.022608777000187

[l.extend(row) for row in output]
>>> 9.155910597999991

[item for sublist in output for item in sublist]
>>> 9.920002304000036

list(l.extend(row) for row in output)
>>> 10.703829122000116
Totoro
9
>>> flat_list = []
>>> nested_list = [(1, 2, 4), (0, 9)]
>>> for a_tuple in nested_list:
...     flat_list.extend(list(a_tuple))
... 
>>> flat_list
[1, 2, 4, 0, 9]
>>> 

You can easily go from a list of tuples to a single flat list, as shown above.

cobie
9

Use itertools.chain:

>>> import itertools
>>> list(itertools.chain.from_iterable([(12.2817, 12.2817), (0, 0), (8.52, 8.52)]))
[12.2817, 12.2817, 0, 0, 8.52, 8.52]
Charles Beattie
7

Or you can flatten the list like this:

from functools import reduce  # required in Python 3, where reduce is no longer a builtin
reduce(lambda x,y:x+y, map(list, output))
Maria Zverina
  • `reduce(lambda x,y:x+y, output)` seems to work directly converting to a long tuple (which can be converted to a list). Why use `map(list, output)` inside the `reduce()` call? Maybe It's more in line with the fact that [tuples are immutable, lists are mutable](https://stackoverflow.com/a/626871/2641825). – Paul Rougieux Mar 20 '19 at 14:47
5

This is what NumPy was made for, from both a data-structure and a speed perspective.

import numpy as np

output = [(12.2817, 12.2817), (0, 0), (8.52, 8.52)]
output_ary = np.array(output)   # this is your matrix 
output_vec = output_ary.ravel() # this is your 1d-array
Joshua Cook
3

For arbitrarily nested lists (just in case):

def flatten(lst):
    result = []
    for element in lst: 
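        if hasattr(element, '__iter__') and not isinstance(element, (str, bytes)):  # strings are iterable in Python 3; treat them as single values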
            result.extend(flatten(element))
        else:
            result.append(element)
    return result

>>> flatten(output)
[12.2817, 12.2817, 0, 0, 8.52, 8.52]
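Note that in Python 3 strings also have `__iter__`, and iterating a one-character string yields that same string again, so a check based only on `__iter__` recurses without end once a string appears. A hedged sketch of a lazy, generator-based variant that treats strings as single values (the name flatten_nested and the sample data are illustrative):

```python
from collections.abc import Iterable

def flatten_nested(items):
    """Yield the leaves of an arbitrarily nested iterable, depth first."""
    for element in items:
        # Strings and bytes are iterable but should be treated as single values
        if isinstance(element, Iterable) and not isinstance(element, (str, bytes)):
            yield from flatten_nested(element)
        else:
            yield element

nested = [(12.2817, 12.2817), [0, (0, "abc")], 8.52]
print(list(flatten_nested(nested)))  # [12.2817, 12.2817, 0, 0, 'abc', 8.52]
```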
cval
3
def flatten_tuple_list(tuples):
    return list(sum(tuples, ()))


tuples = [(5, 6), (6, 7, 8, 9), (3,)]
print(flatten_tuple_list(tuples))
Neuron
  • 5
    Thank you for contributing an answer. Would you kindly edit your answer to include an explanation of your code? That will help future readers better understand what is going on, and especially those members of the community who are new to the language and struggling to understand the concepts. That’s especially useful here, where your answer is competing for attention with nine other answers. What distinguishes yours? When might this be preferred over well-established answers above? – Jeremy Caney Feb 06 '21 at 01:02
  • 1
    Ok sure I will do that – SATYAM TRIPATHI Feb 07 '21 at 04:54
1

The question mentions that the list of tuples (output) is returned by an SQLite select .. from command.

Instead of flattening the returned output, you could adjust how the sqlite3 connection returns rows by setting row_factory, so that you get a matrix (a list of lists) with numeric values instead of a list of tuples:

import sqlite3 as db

conn = db.connect('...')
conn.row_factory = lambda cursor, row: list(row) # This will convert the tuple to list.
c = conn.cursor()
output = c.execute('SELECT ... FROM ...').fetchall()
print(output)
# Should print [[12.2817, 12.2817], [0, 0], [8.52, 8.52]]
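A self-contained way to try this out is an in-memory database. The table name readings and its columns a and b are invented for the example; the database path ":memory:" and the calls to connect, execute, executemany and fetchall are standard sqlite3 usage:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = lambda cursor, row: list(row)  # each row becomes a list

# Hypothetical table and data, mirroring the values in the question
conn.execute("CREATE TABLE readings (a REAL, b REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(12.2817, 12.2817), (0, 0), (8.52, 8.52)],
)

output = conn.execute("SELECT a, b FROM readings ORDER BY rowid").fetchall()
conn.close()

print(output[0][1])  # element access by row and column
```

With this in place, output[i][j] reads a single cell directly. Because the columns are declared REAL, SQLite returns the zeros as 0.0 rather than 0.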
nataliamo