How can I compare two lists and create an output list in which the common items keep the index they have in the main list? The main list is created once and stays the same throughout the script.

The changing list may contain items that do not exist in the main list. I'd like to collect these items in a separate list.

Example:

main_list = ['apple', 'orange', 'banana', 'pear', 'mango', 'peach', 'strawberry']
changing_list = ['apple', 'banana', 'cucumber', 'peach', 'pear', 'fish']

output = ['apple', 'NA', 'banana', 'pear', 'NA', 'peach', 'NA']
added_output = ['cucumber', 'fish']

Calling sorted() on each list before the comparison may be of some use, but I can't work out how to indicate that 'orange', for example, is missing (preferably with NA or X). I am aware of sets and the '&' operator, but the result does not show which item is missing at which position (the NA part).

denov
camerond12
  • Possible duplicate of [Ordered intersection of two lists in Python](https://stackoverflow.com/questions/23529001/ordered-intersection-of-two-lists-in-python) – denov Jan 30 '18 at 03:35

3 Answers


You can do this with sets and list comprehensions:

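def ordered_intersection(main_list, changing_list):
    # Keep main_list order; replace items missing from changing_list with 'NA'.
    changing_set = set(changing_list)
    output = [x if x in changing_set else 'NA' for x in main_list]

    # Test against main_list itself, not output, so the 'NA' placeholder
    # cannot mask a genuine 'NA' item in changing_list.
    main_set = set(main_list)
    added_output = [x for x in changing_list if x not in main_set]

    return output, added_output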

Which works as follows:

>>> main_list = ['apple', 'orange', 'banana', 'pear', 'mango', 'peach', 'strawberry']
>>> changing_list = ['apple', 'banana', 'cucumber', 'peach', 'pear', 'fish']
>>> ordered_intersection(main_list, changing_list)
(['apple', 'NA', 'banana', 'pear', 'NA', 'peach', 'NA'], ['cucumber', 'fish'])

Explanation of above code:

  • First, convert changing_list to a set, since a set membership test takes constant time on average, whereas a list membership test takes linear time.
  • To keep the order of main_list in output, traverse every element of that list and check whether it exists in changing_set. Looking up each element in a set instead of a list avoids quadratic time overall, so the whole pass is linear.
  • The same logic is applied to build added_output.
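The difference between list and set lookups can be checked with a rough timing comparison. This is only a sketch: the helper names align_with_list and align_with_set and the synthetic data are made up for illustration, and the measured times will vary by machine.

```python
import timeit

def align_with_list(main_list, changing_list):
    # Membership test against a list: linear in len(changing_list) per lookup.
    return [x if x in changing_list else 'NA' for x in main_list]

def align_with_set(main_list, changing_list):
    # Membership test against a set: constant time on average per lookup.
    changing_set = set(changing_list)
    return [x if x in changing_set else 'NA' for x in main_list]

# Synthetic data, invented for this comparison.
main_list = [f"item{i}" for i in range(1000)]
changing_list = [f"item{i}" for i in range(0, 2000, 2)]

# Both variants build the same output; only the running time differs.
assert align_with_list(main_list, changing_list) == align_with_set(main_list, changing_list)

list_time = timeit.timeit(lambda: align_with_list(main_list, changing_list), number=3)
set_time = timeit.timeit(lambda: align_with_set(main_list, changing_list), number=3)
print(f"list lookups: {list_time:.4f}s, set lookups: {set_time:.4f}s")
```

On lists of this size the set-based version should be noticeably faster, and the gap grows as the lists get longer.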
RoadRunner

Assuming that you don't care about duplicates, you can use sets to find the differences efficiently:

output = []
main_set, changing_set = set(main_list), set(changing_list)
for i in main_list:
    # Keep items present in both lists; mark the rest as missing.
    output.append(i if i in changing_set else "NA")
# Items only in changing_list (a set, so unordered).
added_output = changing_set - main_set
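Note that the set difference returns a set, so added_output has no guaranteed order. If the extra items should keep the order they have in changing_list and appear only once each, something like the following may work. It is a minimal sketch: added_items is a hypothetical helper name, the repeated 'cucumber' is added here purely for illustration, and it relies on dicts preserving insertion order (Python 3.7+).

```python
def added_items(main_list, changing_list):
    main_set = set(main_list)
    # dict.fromkeys keeps first-seen order and drops repeated items.
    return [x for x in dict.fromkeys(changing_list) if x not in main_set]

main_list = ['apple', 'orange', 'banana', 'pear', 'mango', 'peach', 'strawberry']
changing_list = ['apple', 'banana', 'cucumber', 'peach', 'pear', 'fish', 'cucumber']
print(added_items(main_list, changing_list))  # ['cucumber', 'fish']
```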
Turn

The following approach matches the two lists by index and name:

>>> main_list = ['apple', 'orange', 'banana', 'pear', 'mango', 'peach', 'strawberry']
>>> changing_list = ['apple', 'banana', 'cucumber', 'peach', 'pear', 'fish']
>>> output = []
>>> for word in main_list:
...     if word in changing_list:
...         output.append(word)
...     else:
...         output.append('NA')
...
>>> output
['apple', 'NA', 'banana', 'pear', 'NA', 'peach', 'NA']

>>> added_output = []
>>> for word in changing_list:
...     if word not in main_list:
...         added_output.append(word)
...
>>> added_output
['cucumber', 'fish']
Ashok Kumar Jayaraman