
I have a data set that looks like this:

data = [ [[a,b],[a,d]], [[e,f],[g,h]], [[i,j],[k,j]] ]

And I want to unzip it so I have:

[[a,a], [e,g], [i,k]] and [[b,d], [f,h], [j,j]]

Along the same lines, is there a way to get the length of each sublist without counting duplicates based on one value? For example, using the first list above, I want to count the number of lists in each sublist, but count lists that share the same second value only once. So I want to get:

[2, 2, 1]

I'm able to get a result of [2, 2, 2] using:

count = [len(i) for i in data]

but since I can't separate the values, there is no way to check for duplicates in the second value alone.

j2120
  • Are you trying to flatten your list and count the occurrences? See http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python – user1767754 Oct 20 '15 at 18:32

3 Answers

>>> d = [ [[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]] ]
>>> list(zip(*[list(zip(*x)) for x in d]))
[((1, 3), (5, 7), (9, 11)), ((2, 4), (6, 8), (10, 12))]

Or with your example:

>>> d = [[['a', 'b'], ['a', 'd']], [['e', 'f'], ['g', 'h']], [['i', 'j'], ['k', 'j']]]
>>> list(zip(*[list(zip(*x)) for x in d]))
[(('a', 'a'), ('e', 'g'), ('i', 'k')), (('b', 'd'), ('f', 'h'), ('j', 'j'))]

As for your counting, since you only want to look at the second values, you can extract just those, build a set from them to get rid of duplicate values, and then take the length of that set:

>>> [len(set(x[1] for x in y)) for y in d]
[2, 2, 1]
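
To see why the nested zip works, it can help to split it into two steps. The following is a minimal sketch rather than part of the original one-liner; the names `transposed_pairs`, `firsts`, and `seconds` are made up for illustration.

```python
d = [[['a', 'b'], ['a', 'd']], [['e', 'f'], ['g', 'h']], [['i', 'j'], ['k', 'j']]]

# Step 1: transpose each sublist on its own, so the pairs
# [['a', 'b'], ['a', 'd']] become [('a', 'a'), ('b', 'd')].
transposed_pairs = [list(zip(*sub)) for sub in d]

# Step 2: transpose the outer list, which groups all the "first value"
# tuples together and all the "second value" tuples together.
firsts, seconds = zip(*transposed_pairs)

# zip yields tuples; convert to lists if the exact list-of-lists shape matters.
firsts = [list(t) for t in firsts]
seconds = [list(t) for t in seconds]

print(firsts)   # [['a', 'a'], ['e', 'g'], ['i', 'k']]
print(seconds)  # [['b', 'd'], ['f', 'h'], ['j', 'j']]
```

The two-name unpacking in step 2 assumes every inner list holds exactly two values, as in the question.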
poke
  • I basically want to count the number of lists inside each list. Right now, each list has two lists inside it (e.g. [ [1,2], [3,4] ] would give 2). However, I would like duplicates in the second value not to be counted (e.g. [ [1,2], [3,2] ] would give 1). Also, I understand what should be happening in the code, but I'm getting the error "ValueError: need more than 0 values to unpack" – j2120 Oct 20 '15 at 18:53
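
One possible cause of that error, offered as a guess since the failing code isn't shown: unpacking the result of `zip(*...)` into two names fails when the input list is empty, because zip then yields nothing at all. Python 2 reports this as "need more than 0 values to unpack"; Python 3 words it as "not enough values to unpack (expected 2, got 0)". Below is a minimal sketch with a hypothetical helper, `unzip_pairs`, that guards against the empty case.

```python
def unzip_pairs(data):
    """Split nested [first, second] pairs into two lists of lists.

    Hypothetical helper: returns two empty lists for empty input
    instead of raising ValueError when unpacking. It only guards
    against an empty outer list; an empty sublist inside `data`
    would still make the unpacking fail.
    """
    if not data:
        return [], []
    firsts, seconds = zip(*[zip(*sub) for sub in data])
    return [list(t) for t in firsts], [list(t) for t in seconds]


print(unzip_pairs([]))
print(unzip_pairs([[['a', 'b'], ['a', 'd']], [['e', 'f'], ['g', 'h']]]))
```

If the real data is built up elsewhere, printing it just before the unpacking line is a quick way to check whether it is empty at that point.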

To transpose your sublists:

data = [ [["a","b"],["a","d"]], [["e","f"],["g","h"]], [["i","j"],["k","j"]] ]

a, b = map(list, zip(*(map(list, zip(*sub)) for sub in data)))

print(a,b)
[['a', 'a'], ['e', 'g'], ['i', 'k']] [['b', 'd'], ['f', 'h'], ['j', 'j']]

To get the counts you could use a set:

from operator import itemgetter

print([len(set(map(itemgetter(1), sub))) for sub in data])

[2, 2, 1]

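Note that a set counts the distinct second values. If you instead want to count only the second values that appear exactly once, a set won't work once there are more than two sublists, i.e.: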

data = [[["a", "b"], ["a", "d"]], [["e", "f"], ["g", "h"]], [["i", "j"], ["k", "j"], ["A", "K"], ["B", "K"]]]

from collections import Counter
from operator import itemgetter

print([sum(v == 1 for v in Counter(map(itemgetter(1), sub)).values()) for sub in data])
[2, 2, 0]

If you used a set with the last data you would get [2, 2, 2], which I imagine would be wrong, since none of the second values in the last sublist is unique.
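
To make the difference between the two counting rules concrete, here is a small side-by-side sketch. The function names `count_distinct` and `count_unrepeated` are made up for illustration; which rule is the right one depends on what "without counting duplicates" is meant to cover.

```python
from collections import Counter


def count_distinct(sub):
    """Number of different second values (duplicates collapse to one)."""
    return len({pair[1] for pair in sub})


def count_unrepeated(sub):
    """Number of second values that occur exactly once."""
    counts = Counter(pair[1] for pair in sub)
    return sum(1 for n in counts.values() if n == 1)


data = [[["a", "b"], ["a", "d"]], [["e", "f"], ["g", "h"]],
        [["i", "j"], ["k", "j"], ["A", "K"], ["B", "K"]]]

print([count_distinct(sub) for sub in data])    # [2, 2, 2]
print([count_unrepeated(sub) for sub in data])  # [2, 2, 0]
```

On the data from the question, `count_distinct` gives [2, 2, 1], which is the result asked for there, while `count_unrepeated` gives [2, 2, 0].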

Padraic Cunningham

This answer doesn't use map or list comprehensions; it does the same thing with plain for loops, which may be easier to follow.

data = [ [["a","b"],["a","d"]], [["e","f"],["g","h"]], [["i","j"],["k","j"]] ]

zip0 = []
zip1 = []
sub0=[]
sub1=[]

for x in data:
    for y in x:
        sub0.append(y[0])
        sub1.append(y[1])
    zip0.append(sub0)
    zip1.append(sub1)
    sub0 = []
    sub1 = []

print(zip0)
print(zip1)
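
The same loop can be wrapped in a function so the temporary lists don't have to be reset by hand. This is a sketch only; the name `unzip_with_loops` is made up for illustration.

```python
def unzip_with_loops(data):
    """Split nested [first, second] pairs into two lists of lists."""
    firsts, seconds = [], []
    for sub in data:
        # Fresh accumulators for each sublist, so nothing leaks between them.
        sub_firsts, sub_seconds = [], []
        for pair in sub:
            sub_firsts.append(pair[0])
            sub_seconds.append(pair[1])
        firsts.append(sub_firsts)
        seconds.append(sub_seconds)
    return firsts, seconds


data = [[["a", "b"], ["a", "d"]], [["e", "f"], ["g", "h"]], [["i", "j"], ["k", "j"]]]
firsts, seconds = unzip_with_loops(data)
print(firsts)   # [['a', 'a'], ['e', 'g'], ['i', 'k']]
print(seconds)  # [['b', 'd'], ['f', 'h'], ['j', 'j']]
```

Because nothing is unpacked from a zip here, an empty `data` simply returns two empty lists.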
jdf