
I have the following dataset, which is a series of rows stored as nested lists:

[['John', '35', 'UK'],
['Emma', '43', 'UK'],
['Lucy', '25', 'AU']]

(rows are always the same length)

I need to return 'UK', 'AU' as an iterable (indifferent to ordering).

Is there a one-liner that returns the unique values contained in the third column, and which is simpler than this?

set(list(map(list, zip(*l)))[2])

(Ref: Transpose list of lists)

– bsuire

5 Answers


A change to your own code:

Python 3.x:

set(list(zip(*l))[2])

Python 2.x:

set(zip(*l)[2])

Demo:

>>> l = [['John', '35', 'UK'], ['Emma', '43', 'UK'], ['Lucy', '25', 'AU']]
>>> set(list(zip(*l))[2])
{'AU', 'UK'}
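
A brief note of my own on why the list() call matters here: in Python 3, zip() returns a lazy iterator that cannot be indexed, so it has to be materialised into a list before taking element [2]. A minimal sketch:

l = [['John', '35', 'UK'], ['Emma', '43', 'UK'], ['Lucy', '25', 'AU']]

cols = zip(*l)             # Python 3: a zip iterator, not a list
# cols[2]                  # would raise TypeError: 'zip' object is not subscriptable
third_col = list(cols)[2]  # materialise, then index the third column
print(set(third_col))      # {'AU', 'UK'} (set ordering may vary)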
– The6thSense

You can use list comprehension:

>>> L = [['John', '35', 'UK'],
...      ['Emma', '43', 'UK'],
...      ['Lucy', '25', 'AU']]
>>> set([i[2] for i in L])
set(['AU', 'UK'])
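
A small variant (an addition of mine, not from the original answer): a set comprehension gives the same result without building an intermediate list; shown here with the Python 3 set repr:

>>> {i[2] for i in L}
{'AU', 'UK'}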
– Joe T. Boka
>>> l = [['John', '35', 'UK'],
...      ['Emma', '43', 'UK'],
...      ['Lucy', '25', 'AU']]
>>> set(element[-1] for element in l)
{'AU', 'UK'}
– Christian Witts

You can use numpy:

import numpy as np

arr = np.array([['John', '35', 'UK'],
                ['Emma', '43', 'UK'],
                ['Lucy', '25', 'AU']])

unique = np.unique(arr[:,2])
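
For reference (a note of my own, not from the original answer): np.unique returns a sorted NumPy array, so the result here looks like this:

print(unique)           # ['AU' 'UK']
print(unique.tolist())  # ['AU', 'UK'] as plain Python strings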
– areuexperienced

I think bsuire's actual requirements are more complicated in practice, so I recommend using pandas to handle them; it is more powerful and flexible.

So, here is how to use pandas in this case:

In [17]: import pandas as pd

In [18]: a = [['John', '35', 'UK'],
   ....: ['Emma', '43', 'UK'],
   ....: ['Lucy', '25', 'AU']]

In [19]: b = pd.DataFrame(a)

In [20]: b
Out[20]:
      0   1   2
0  John  35  UK
1  Emma  43  UK
2  Lucy  25  AU

In [21]: b[2].unique()
Out[21]: array(['UK', 'AU'], dtype=object)

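A follow-up sketch of my own (the column names 'name', 'age', 'country' are hypothetical labels, not part of the original data): giving the DataFrame explicit columns makes the selection read more naturally:

import pandas as pd

a = [['John', '35', 'UK'], ['Emma', '43', 'UK'], ['Lucy', '25', 'AU']]
df = pd.DataFrame(a, columns=['name', 'age', 'country'])  # hypothetical column labels
print(df['country'].unique())  # ['UK' 'AU']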
– taotao.li
  • thanks! I should really look into pandas to gain flexibility in manipulating data. Is pandas also the go-to Python module for performing SQL-like aggregation queries? – bsuire Sep 23 '15 at 12:06
  • For one, pandas supports reading SQL queries; for another, pandas really does support SQL-like aggregation queries, see here: http://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html . Trust me, if you use Python to do data science related work, IPython, pandas, NumPy, SciPy, matplotlib, seaborn, mpld3, etc. are the best tools you need – taotao.li Sep 24 '15 at 08:48
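
To illustrate the SQL-like aggregation mentioned in the comments, here is a minimal sketch (again using the hypothetical column names from the snippet above):

import pandas as pd

a = [['John', '35', 'UK'], ['Emma', '43', 'UK'], ['Lucy', '25', 'AU']]
df = pd.DataFrame(a, columns=['name', 'age', 'country'])

# roughly: SELECT country, COUNT(*) FROM people GROUP BY country
print(df.groupby('country').size())
# country
# AU    1
# UK    2
# dtype: int64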