1

I have two lists of elements:

x = [1,3,5,7,9]
y = [2,4,6,8,0]

Now I want an operation that returns whichever list contains "MOST OF" the elements of a third list z:

z = [2,3,5,7] #primes

I want the list containing "MOST OF" the items in z to be returned, not just any list that contains some element of z.

If this is not possible with lists, I am happy to work with tuples or sets as well.

EDIT:

sample:

mostOf(z) -> x

since x contains most of the values in z

tenstar
  • Is most-of defined as `len(z)//2 <= x < len(z)` or something similar? It isn't really clear when you say *not the list will all* – jamylak Jun 06 '13 at 14:02
  • Can you provide sample output, i.e. what you want in the end. "Most of" is not clear to me. – Mike Müller Jun 06 '13 at 14:07
  • @user2433215 See [this](http://stackoverflow.com/questions/183853/in-python-what-is-the-difference-between-and-when-used-for-division). – Aya Jun 06 '13 at 14:08
  • @jamylak Sorry, my mistake. A list containing all the elements of z is the best case. – tenstar Jun 06 '13 at 14:09
  • `mostOf(z) -> x` still doesn't make much sense. One of `x.containsMostOf(z) -> True`, or `z.whichContainsMoreOf(x, y) -> x` would be more meaningful. With the latter, though, what if both contain the same number of equal elements? – Aya Jun 06 '13 at 14:14
  • `z.whichContainsMoreOf(x, y) -> x` is what i'm looking for @Aya – tenstar Jun 06 '13 at 14:17

1 Answer

4

Working with sets, you can compare the sizes of the intersections:

 zset = set(z)
 if len(zset.intersection(x)) > len(zset.intersection(y)):
     ...
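
As a minimal sketch, here is the same comparison run on the lists from the question. The names `shared_with_x`, `shared_with_y` and `winner` are only illustrative, and note that on a tie this simple version falls back to `y`:

```python
x = [1, 3, 5, 7, 9]
y = [2, 4, 6, 8, 0]
z = [2, 3, 5, 7]  # primes

zset = set(z)
shared_with_x = zset.intersection(x)  # elements of z that also appear in x
shared_with_y = zset.intersection(y)  # elements of z that also appear in y

# Pick the list with the larger overlap (y is chosen if the sizes are equal)
winner = x if len(shared_with_x) > len(shared_with_y) else y
print(winner)
```

Here `x` shares 3, 5 and 7 with `z`, while `y` shares only 2, so `x` is selected.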

If you have an iterable of lists to check:

iterable = (x,y)

You can get the list with the biggest intersection as follows (see note 1):

def cmp_key(lst):
    # Rank by overlap with zset; on ties, prefer the shorter list
    intersect_size = len(zset.intersection(lst))
    return intersect_size, -len(lst)

list_with_biggest_intersection = max(iterable, key=cmp_key)

Note 1: Stolen from the now-deleted answer by Jamylak.
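
The comments below ask for a single function that accepts any number of candidate lists. One possible way to package the approach above is sketched here; the name `most_of` and the `query` data are assumptions for illustration, not part of the original answer:

```python
def most_of(z, *candidates):
    """Return the candidate that shares the most elements with z.

    Ties on overlap go to the shorter candidate; if candidates are still
    tied, max() returns the first one it encountered.
    """
    zset = set(z)

    def cmp_key(lst):
        # Larger overlap wins; -len(lst) makes shorter lists rank higher
        intersect_size = len(zset.intersection(lst))
        return intersect_size, -len(lst)

    # max() raises ValueError if no candidates are passed
    return max(candidates, key=cmp_key)


x = [1, 3, 5, 7, 9]
y = [2, 4, 6, 8, 0]
z = [2, 3, 5, 7]
print(most_of(z, x, y))

# Hypothetical query: two candidates tie on overlap, so the shorter one wins
query = ['atomic', 'orbital']
docs = (['aloo', 'bonda'],
        ['atomic', 'orbital', 'bonda', 'aloo'],
        ['atomic', 'orbital'])
print(most_of(query, *docs))
```

If the candidates are already collected in a tuple, they can be unpacked with `most_of(z, *candidates)`, as in the second call.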

mgilson
  • YES! It works. Can you please give me a procedure that works with any number of sets and finds the one in which most of the elements of the set z exist? If so, I'll accept your answer @mgilson – tenstar Jun 06 '13 at 14:13
  • This doesn't make much sense. Surely `len(xset.intersection(y))` is always zero? – Aya Jun 06 '13 at 14:18
  • @Aya -- It was a typo -- `xset` would have raised a `NameError` – mgilson Jun 06 '13 at 14:18
  • Please give me a procedure `z.whichContainsMoreOf(x, y) -> x` which can take any number of parameters and still give me the set which contains the most. – tenstar Jun 06 '13 at 14:19
  • @mgilson Ah. Now it makes sense. :) – Aya Jun 06 '13 at 14:19
  • @user2433215 What should it return if, say, both `x` and `y` contain the same number of shared elements with `z`? – Aya Jun 06 '13 at 14:20
  • Then any of them at random, just like SQL. – tenstar Jun 06 '13 at 14:21
  • @mgilson So you're saying that I can make a function `mostOf(z,l)`, where l is the tuple containing all the sets to search in, right? – tenstar Jun 06 '13 at 14:22
  • @user2433215 In that case, the second part of this answer ought to work. – Aya Jun 06 '13 at 14:23
  • @user2433215 -- Sure. That would be called `mostOf(z,(x,y))`. If you wanted to call it as `mostOf(z,x,y)` you could have the signature be `mostOf(z,*args)`. – mgilson Jun 06 '13 at 14:24
  • Yes, it works, and I have accepted your answer. One final note: what are the cost issues? Is it fast enough for search engines? @mgilson – tenstar Jun 06 '13 at 14:28
  • IIRC, `set.intersection(list)` will be O(n), such that 'n' is `len(list)`. – Aya Jun 06 '13 at 14:31
  • @user2433215 -- I have no idea if it's fast for search engines ... Each intersection should be O(N) (where N is the size of the list being intersected with `zset`). – mgilson Jun 06 '13 at 14:32
  • I'm sorry, I should have asked this earlier. When my tuple is `(['atomic','orbital'],['aloo','bonda'],['atomic','orbital','bonda','aloo'])` it gives me `['atomic','orbital']`, but when the tuple is `(['aloo','bonda'],['atomic','orbital','bonda','aloo'],['atomic','orbital'])` I get `['atomic','orbital','bonda','aloo']`. If x and y in my question's example share the same number of elements with z, how can I make it return the one (out of x and y) with the fewest elements? @mgilson – tenstar Jun 07 '13 at 11:15
  • @user2433215 -- You just need to modify the comparison key to return a `tuple` with `(sizeof_overlap,-len(lst))`. I've modified my answer accordingly. – mgilson Jun 07 '13 at 12:20
  • @mgilson It says: `TypeError: len() takes exactly one argument (2 given)` – tenstar Jun 07 '13 at 15:55
  • @user2433215 -- Hmm ... Must have gotten my parentheses wrong. At this point, the lambda function is getting a little too big to be easy to read. I factored it out into a separate function to make it easier to see what's going on. – mgilson Jun 07 '13 at 17:22
  • Okay, thanks a lot! You've solved my search engine problem. I wanted to find the cost; as you said, each intersection should be O(N), where N is the size of the list being intersected with zset, right? So I was just wondering what O() is and how to compute it... @mgilson – tenstar Jun 08 '13 at 05:22
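
On the O(N) remarks above: big-O notation describes how the amount of work grows with the input size. The sketch below is a hand-written model of intersecting a set with a list, not the actual built-in implementation; `count_membership_checks` is a hypothetical helper that counts one set lookup per list element, which is why the work grows linearly with the length of the list:

```python
def count_membership_checks(zset, lst):
    """Intersect by hand, counting one set lookup per list element."""
    checks = 0
    shared = set()
    for item in lst:
        checks += 1          # one (average constant-time) lookup in zset
        if item in zset:
            shared.add(item)
    return shared, checks


zset = {2, 3, 5, 7}
for n in (10, 100, 1000):
    shared, checks = count_membership_checks(zset, list(range(n)))
    # The number of checks equals n, so doubling the list doubles the work
    print(n, checks, sorted(shared))
```

Under this model, checking several candidate lists costs roughly the sum of their lengths, plus the one-time cost of building `zset`.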