
I've been trying to work with two lists in Python 2.7. I've come part way, but spending some time searching hasn't brought up much in the way of results.

List1: A list of specific number sequences that I am searching for within List2. (e.g.) ['209583', '185372', '684392', '995423']

List2: Contains variations of the numbers from List1. (e.g.) ['209583_345829', '57185372', '853921864']

I can match and pull the matching items with the code below, but I am also looking for the inverse: set a variable to all the numbers in List1 that are not in List2.

matching = [s for s in list2 if any(xs in s for xs in list1)]

So what should be left in a non matching variable would be '995423'. I've tried reworking the code above but I feel like it's right under my nose.

Also, would it not be beneficial to just use an If/Else statement for performance reasons? E.g. If matching do this, else not matching do this... That way it is only running once vs twice.

This is a simple example, but the lists for both could push over 10,000 lines per.

Thanks!

sdavis891
  • so... just reverse the condition, I guess? `non_matching = [s for s in list1 if not any(xs in s for xs in list2)]` (<- if **not** any) – yedpodtrzitko Mar 15 '16 at 12:37
  • I had already tried this. This returns those from list1 that do not have any reference from list2. I'm looking for those from list2 that do not have any reference from list1. – sdavis891 Mar 15 '16 at 13:07
  • This is significantly more complicated in reverse because the number sequences in 1 can be located anywhere in the numbers of 2. Question: Are the numbers in list 1 reliably six digits long? – MutantOctopus Mar 15 '16 at 13:10
  • My purposes right now... They can vary between 6-7 digits in length. But I was hoping this could be re-purposed in the future for other uses. Such as other lists with alpha instead of numeric. – sdavis891 Mar 15 '16 at 13:14
  • oddly when I run your list comprehension in the python 3 interactive it returns an empty list, even though it doesn't seem like anything involved there is different between 2.7 and 3... Edit: I think you want to invert the variables: `s in xs` – MutantOctopus Mar 15 '16 at 13:20
  • Why not use set operations? – Ali SAID OMAR Mar 15 '16 at 13:54
  • @AliSAIDOMAR Would you be able to provide an example? I'm not too familiar with set operations. – sdavis891 Mar 15 '16 at 14:25
  • set(list1) - set(matching), item in list 1 but not matching your criteria – Ali SAID OMAR Mar 15 '16 at 14:52
  • @AliSAIDOMAR This provided the same blank list as Bhustus' below. – sdavis891 Mar 15 '16 at 15:20
  • Are you using the same test lists that you posted here? Because both my (ultimately incorrect) answer *and* @AliSAIDOMAR's set technique should at least return *something* in the resultant list/set. – MutantOctopus Mar 15 '16 at 15:37
  • My apologies @BHustus and AliSAIDOMAR, I had flipped the order of my lists in the original formula listed. I have corrected them now... It does return results, but the result returned is List1 regurgitated. – sdavis891 Mar 15 '16 at 19:17
  • Well now it makes even less sense, the new one is pulling items from List2, not list1. Like I said before, I think your previous algorithm was right, but you wanted `s in xs` (List1 item in List2 item), not `xs in s` (List2 item in List1). – MutantOctopus Mar 15 '16 at 19:29
  • I was under the impression that you want elements from List1 that have matches in List2, but I realize now that you never specified in the question; Do you want the List1 items or List2? And more relative to the answer, do you want the non-matches from List1 or List2? – MutantOctopus Mar 15 '16 at 19:41

2 Answers


Your "matching" as written gives the values from list2, not from list1: ['209583_345829', '57185372']

That's why the 'set' approach as described didn't work. You need to rewrite the matching so that it returns the items from list1 that have some corresponding value in list2.

Given the description of your problem, this should work:

non_match = [xs for xs in list1 if not any(xs in s for s in list2)]

However, that returns ['684392', '995423']. I don't see 684392 in list2 anywhere; have you edited the lists at some point, or are you looking for anything in list2 containing all of the digits of the item from list1 rather than just the item itself?
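To make the result above easy to reproduce, here is a minimal sketch that runs both the matching and non-matching comprehensions against the example lists from the question. The name `match` is only illustrative; the expected output follows from plain substring checks.

```python
list1 = ['209583', '185372', '684392', '995423']
list2 = ['209583_345829', '57185372', '853921864']

# Items from list1 that appear as a substring of at least one list2 entry.
match = [xs for xs in list1 if any(xs in s for s in list2)]

# Items from list1 that appear in no list2 entry at all.
non_match = [xs for xs in list1 if not any(xs in s for s in list2)]

print(match)      # ['209583', '185372']
print(non_match)  # ['684392', '995423']
```

'684392' lands in `non_match` because no entry of list2 contains that exact sequence of characters.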

  • Thanks for the answer, @A. Leistra. Bhustus had managed to help me out. This is exactly what I went with, just with different identifiers: notmatching = [c for c in list2 if not any(c in xc for xc in matching)]. *My lists were different for my final outcome; that's why List1 and List2 are switched. – sdavis891 Mar 16 '16 at 14:05

First things first: The list comprehension you have at hand is faulty. To accomplish a list full of items in List1 that have matches in List2, you want to use this:

All items FROM List1 WITH matches in List2

matches = [item for item in List1 if any(item in compared for compared in List2)]

To explain:
[s for s in List1 if any(xs in s for xs in List2)] - Your original algorithm was pulling elements s from List1 and elements xs from List2, and trying to see if xs was contained in s, which is inherently the opposite of what we want to do.

[s for s in list2 if any(xs in s for xs in list1)] - Your new algorithm has inverted the wrong variables. Now it is pulling s from List2 and xs from List1 and checking if xs is in s - which is closer to the original idea. The only problem is, the way your algorithm is set up, it will place the items from List2 into the list if they have a match in List1 (which might be what you want after all?)

[item for item in List1 if any(item in compared for compared in List2)] - Made a bit more verbose for easy reading, this algorithm will pull out items from List1, check if they have a 'container' in List2, and add them to the list if they do. (Side note: an alternative list comprehension is [item for item in List1 for compared in List2 if item in compared], which some find a bit more intuitive to read. It returns the same results for the test lists here, but it adds an item once for every List2 entry that contains it, so it can produce duplicates where the any() version cannot.)
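One caveat about the nested-`for` form is worth checking before relying on it: it emits an item once per containing entry, while the `any()` form emits it at most once. The sketch below uses hypothetical data (not the lists from the question) in which one key occurs inside two List2 entries.

```python
# Hypothetical data: '123' occurs inside two different List2 entries.
List1 = ['123', '999']
List2 = ['a123', 'b123', 'c456']

# any() stops at the first containing entry, so each item appears at most once.
with_any = [item for item in List1 if any(item in compared for compared in List2)]

# The nested loop yields the item once for every containing entry.
nested = [item for item in List1 for compared in List2 if item in compared]

print(with_any)  # ['123']
print(nested)    # ['123', '123']
```

If the nested form is used anyway, wrapping the result in `set()` removes the repeats, at the cost of losing the original order.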

With that out of the way: If you want to get every item from List1 that does not have a match in List2, you can use the algorithm I specified above to gain the matches list, and then, as Ali SAID OMAR stated in a comment, use set operations:

All items IN List1 WITHOUT matches in List2 - Set operation

nomatches = set(List1) - set(matches)

This will take all unique elements of List1, remove the matched elements, and return a set object with all the unmatched elements remaining. Alternatively, if you want a solution in one statement:

All items IN List1 WITHOUT matches in List2 - List comprehension

nomatches = [item for item in List1 if not any(item in compared for compared in List2)]

To give credit where credit is due, this is identical to yedpodtrzitko's solution in the post comments.
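The two approaches to finding unmatched List1 items can be compared side by side. This is a sketch on the question's example lists; the names `nomatches_set` and `nomatches_list` are only there to keep the two results apart.

```python
List1 = ['209583', '185372', '684392', '995423']
List2 = ['209583_345829', '57185372', '853921864']

matches = [item for item in List1 if any(item in compared for compared in List2)]

# Set difference: unique unmatched items, with no guaranteed order.
nomatches_set = set(List1) - set(matches)

# List comprehension: unmatched items in their original List1 order.
nomatches_list = [item for item in List1
                  if not any(item in compared for compared in List2)]

print(sorted(nomatches_set))  # ['684392', '995423']
print(nomatches_list)         # ['684392', '995423']
```

The set version drops duplicates and ordering, so the list comprehension is probably the safer choice when either of those matters.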

Since it's hard to tell what you're asking, though, and in comments you have flip-flopped what you're asking at least once, I will add two more algorithms:

All items IN List2 WITH matches in List1

matches2 = [item for item in List2 for key in List1 if key in item]

All items IN List2 WITHOUT matches in List1 - List Comprehension

nomatches2 = [item for item in List2 if not any(key in item for key in List1)]

All items IN List2 WITHOUT matches in List1 - Set Operation

nomatches2 = set(List2) - set(matches2)
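For the List2 direction, a short sketch on the question's example lists shows what to expect. Note one deliberate swap: `matches2` here is written with `any()` rather than the nested loop above, which is an assumption made so that an entry containing several keys is not listed more than once.

```python
List1 = ['209583', '185372', '684392', '995423']
List2 = ['209583_345829', '57185372', '853921864']

# List2 entries that contain at least one key from List1.
matches2 = [item for item in List2 if any(key in item for key in List1)]

# List2 entries that contain no key from List1.
nomatches2 = [item for item in List2 if not any(key in item for key in List1)]

# The same non-matches via a set operation (order not guaranteed).
nomatches2_set = set(List2) - set(matches2)

print(matches2)                # ['209583_345829', '57185372']
print(nomatches2)              # ['853921864']
print(sorted(nomatches2_set))  # ['853921864']
```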

Each of these has been tested through your test case described in your post, and returned the expected results. If these algorithms don't do what you need them to, please double-check that it isn't a problem on your end, and if this doesn't answer your question, please make sure you are clear with what you're asking. Thanks.
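On the if/else idea raised in the question: a single loop can fill both lists at once, so the containment test runs once per List1 item instead of once per comprehension. The following is a minimal sketch under that assumption; `partition_by_match` and its parameter names are hypothetical, not something from the original post.

```python
def partition_by_match(keys, candidates):
    """Split keys into (matched, unmatched) in one pass over keys."""
    matched, unmatched = [], []
    for key in keys:
        # One containment scan per key decides which list it goes into.
        if any(key in candidate for candidate in candidates):
            matched.append(key)
        else:
            unmatched.append(key)
    return matched, unmatched


list1 = ['209583', '185372', '684392', '995423']
list2 = ['209583_345829', '57185372', '853921864']

matched, unmatched = partition_by_match(list1, list2)
print(matched)    # ['209583', '185372']
print(unmatched)  # ['684392', '995423']
```

This roughly halves the work compared with running two separate comprehensions, but every key may still be checked against every candidate, so with around 10,000 entries per list it is worth timing on real data.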

MutantOctopus
  • Because you explained all the different scenarios, I was able to mix and match what you posted to do what I wanted. My original code did what it was supposed to. However, the lists I had posted were reversed relative to what I was using. So, from the example posted: matching = [s for s in list2 if any(xs in s for xs in list1)]. I was looking for ALL in list1 that match something from list2. Then I used this to find all from list1 that did not match anything from my original results: notmatching = [c for c in list1 if not any(c in xc for xc in matching)]. I hope this makes sense! – sdavis891 Mar 16 '16 at 14:00
  • I'm still not entirely certain I get which list of things you wanted, but I'm glad I managed to get you to your solution. :) – MutantOctopus Mar 16 '16 at 14:02
  • Believe me I was confusing myself haha. Python is still new to me, and the only way I seem to learn is by doing. Thanks a lot! – sdavis891 Mar 16 '16 at 14:03
  • Word of advice for the future: when posting a question, you might find it helpful to use more descriptive variables. It might help you and readers determine more wholly what you're trying to accomplish. – MutantOctopus Mar 16 '16 at 14:04