
I want to check whether the values x exist in the values y of a list; if they do, I want to print the total number of such values.

z=0
for x,y in zip(labels,n):
    if x in y:
        z=z+1
print(z)

This is what labels looks like:

[['20011', '20048'],
 ['20011', '20048'],
 ['20011', '20048'],
 ['20011', '20048']]

And this is what n looks like:

['20011', '20048' ,'20011', '20048']

Printing z gives me zero. What am I doing wrong? If I don't define z, I get an error saying z is not defined.

R__raki__
minks

2 Answers


If you want to count how many elements of l2 appear anywhere in the sublists, take the union of all the sublists and count the elements of l2 that are in it:

l = [['20011', '20048'],
     ['20011', '20048'],
     ['20011', '20048'],
     ['20011', '20048']]

l2 = ['20011', '20048', '20011', '20048']
union = set.union(*map(set, l))
print(sum(ele in union for ele in l2)) # ->  4

If you want each distinct element counted only once, take the intersection:

l = [['20011', '20048'],
     ['20011', '20048'],
     ['20011', '20048'],
     ['20011', '20048']]

l2 = ['20011', '20048', '20011', '20048']
inter = set.union(*map(set, l)).intersection(l2)

print(len(inter)) # ->  2

If you instead want to count the elements of the sublists that appear in l2:

l = [['20011', '20048'],
     ['20011', '20048'],
     ['20011', '20048'],
     ['20011', '20048']]

l2 = ['20011', '20048', '20011', '20048']
from itertools import chain

st = set(l2)
# Flatten the sublists and count every element that is also in l2.
print(sum(ele in st for ele in chain.from_iterable(l))) # -> 8

To count the sublists that share at least one element with n, you can use set.isdisjoint: if a sublist has any element in common with st, not st.isdisjoint(sub) is True:

l = [['20011', '20048'],
     ['20011', '20048'],
     ['20011', '20048'],
     ['20011', '20048']]

l2 = ['20011', '20048', '20011', '20048']
st = set(l2)
print(sum(not st.isdisjoint(sub) for sub in l)) # -> 4
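
For comparison with the loop in the question, here is a minimal sketch of a position-by-position count. It assumes the intent is to check each element of n against the sublist at the same index, which is only one possible reading of the question; the function name count_positional is hypothetical. In the question's loop, x is the sublist and y is the string, so the membership test needs its operands the other way around.

```python
labels = [['20011', '20048'],
          ['20011', '20048'],
          ['20011', '20048'],
          ['20011', '20048']]
n = ['20011', '20048', '20011', '20048']

def count_positional(sublists, values):
    """Count positions i where values[i] occurs in sublists[i] (hypothetical helper)."""
    # zip pairs each sublist with the value at the same index;
    # the value goes on the left of `in`, the sublist on the right.
    return sum(value in sub for sub, value in zip(sublists, values))

print(count_positional(labels, n))  # -> 4
```

Unlike the set-based versions above, this count depends on the order of both lists, and zip stops at the shorter of the two.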
Padraic Cunningham

numpy.intersect1d returns the intersection of two arrays; you can then use size to get the number of unique elements common to both arrays:

import numpy as np
labels = np.array([[20011,20048],[20011,20048],[20011,20048],[20011,20048]])
n = np.array([20011,20048,20011,20048])
z = np.intersect1d(n,labels).size
print(z) # counts 2

numpy.in1d checks whether each element of a 1-D array is present in a second array; you can then convert the result to a list and count the True items (duplicates in n are counted separately):

z = np.in1d(n,labels).tolist().count(True)
print(z) # counts 4
cromod
  • This gives me a 0. Is it because labels is a list of lists while n is just a list? – minks Feb 02 '16 at 16:08
  • I get 2 with your example – cromod Feb 02 '16 at 16:17
  • Hi, this works perfectly. I had some issue in my code. My bad. Thanks a lot! – minks Feb 02 '16 at 16:26
  • You also have numpy.in1d. It returns a boolean array indicating whether each element of n is in labels. In this case, you'll count 4. – cromod Feb 02 '16 at 16:44
  • Why does the difference exist in the z values then? One gives 2 and the other 4? Which one is the right case? – minks Feb 02 '16 at 16:54
  • It depends on your need. With intersect1d you'll get the number of unique elements that are in both n and labels. With in1d you'll get the number of elements of n that are in labels. – cromod Feb 02 '16 at 17:01
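
The difference between the two counts discussed in these comments can also be illustrated without NumPy. This is only a sketch using the standard library, not the NumPy implementation; the helper names count_unique_common and count_members are made up for illustration.

```python
from itertools import chain

labels = [[20011, 20048], [20011, 20048], [20011, 20048], [20011, 20048]]
n = [20011, 20048, 20011, 20048]

# Flatten the nested list into a set of distinct label values.
pool = set(chain.from_iterable(labels))

def count_unique_common(values, pool):
    """Number of distinct values present in both (like intersect1d(...).size)."""
    return len(set(values) & pool)

def count_members(values, pool):
    """Number of items in values that are in pool, duplicates included (like in1d)."""
    return sum(v in pool for v in values)

print(count_unique_common(n, pool))  # -> 2
print(count_members(n, pool))        # -> 4
```

The first count collapses duplicates in n before comparing, while the second tests every item of n individually, which is why the results differ.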