0

Suppose I have two lists of indices

letters = ['a', 'c']
numbers = ['1','2','6']

These lists are generated based on an interactive web interface element and is a necessary part of this scenario.

What is the most computationally efficient way in python I can use these two lists to search through the third list below for items?

list3 = ['pa1','pa2','pa3','pa4','pa5','pa6',
         'pb1','pb2','pb3','pb4','pb5','pb6',
         'pc1','pc2','pc3','pc4','pc5','pc6',
         'pd1','pd2','pd3','pd4','pd5','pd6']

Using the letters and numbers lists, I want to search through list3 and return this list

sublist = ['pa1', 'pa2, 'pa6', 'pc1', 'pc2', 'pc6']

I could do something like this:

sublist = []
for tag in list3:
    for l in letters:
        for n in numbers:
            if l in tag and n in tag:
                sublist.append(tag)

But I'm wondering if there's a better or more recommended way?

Han Zhang
  • 3
  • 3
  • 1
    Take a look at [itertools.product](https://docs.python.org/3/library/itertools.html#itertools.product) and [sets](https://www.w3schools.com/python/python_sets.asp). Itertools product can quickly generate the combination `pa1`, `pa2` etc ... and the sets can be used to quickly test membership (is an item contained in a set). The main update is that `set` memberships tests are O(1), take only 1 cycle, while `list` membership take O(n), the number of elements in the list, cycles. – Thymen May 25 '21 at 22:51
  • Thank you for this insight. Followup question, why is it that set membership tests take O(1) whereas list membership tests take O(n)? – Han Zhang May 27 '21 at 07:22
  • See the SO questions: [What makes sets faster than lists](https://stackoverflow.com/questions/8929284/what-makes-sets-faster-than-lists/8929445). – Thymen May 27 '21 at 10:12

2 Answers2

0

Most of all, do not iterate through the character lists. Instead, use simple in operations; use any or all operations where necessary. In this case, since your tags are all of the form p[a-d][0-9], you can directly check the appropriate character:

for tag in list3:
    if tag[1] in "ac" and tag[2] in "126":
        sublist.append(tag)

For many uses or a generalized case, replace the strings with sets for O(1) time:

letter = set('a', 'c')
number = set('1', '2', '6')
for tag in list3:
    if tag[1] in letter and tag[2] in number:
        sublist.append(tag)

Next, get rid of the O(n^2) append series (adding to a longer list each time). Replace it with a list comprehension.

sublist = [tag for tag in list3 
               if tag[1] in letter and tag[2] in number]

If the letters and numbers can appear anywhere in the list, then you need a general search for each: look for an overlap in character sets:

sublist = [tag for tag in list3 
               if any(char in letter for char in tag) and
                  any(char in number for char in tag)
          ]

or with sets:

sublist = [tag for tag in list3 
               if set(tag).intersection(letter) and
                  set(tag).intersection(number)
          ]
Prune
  • 76,765
  • 14
  • 60
  • 81
  • Thank you. Forgot I could pick out chars from strings using list notation in python. This is an excellent answer and well explained. – Han Zhang May 27 '21 at 07:20
0

Try this, in simple way

letters = ['a', 'c']
numbers = ['1','2','6']
sublist = []
result = []

list3 = ['pa1','pa2','pa3','pa4','pa5','pa6',
         'pb1','pb2','pb3','pb4','pb5','pb6',
         'pc1','pc2','pc3','pc4','pc5','pc6',
         'pd1','pd2','pd3','pd4','pd5','pd6']

for letter in letters:
    sublist.extend(list(filter(lambda tag: letter in tag, list3)))

for number in numbers:
    result.extend(list(filter(lambda tag: number in tag, sublist)))

print(result)