I have two lists:
list1 = ['abc-21-6/7', 'abc-56-9/10', 'def-89-7/3', 'hij-2-4/9', 'hij-75-1/7']
list2 = ['abc', 'hij']
I would like to subset list1 such that: 1) only elements containing a substring that matches an element of list2 are retained, and 2) when several retained elements match the same list2 substring, only one of them, chosen at random, is kept. For this specific example, one possible result would be:
['abc-21-6/7', 'hij-75-1/7']
I have worked out code to meet my first requirement:
[ele for ele in list1 for x in list2 if x in ele]
For my specific example, this returns the following:
['abc-21-6/7', 'abc-56-9/10', 'hij-2-4/9', 'hij-75-1/7']
But I am stuck on the second step: how to randomly retain only one element when several elements match the same substring. Can the random.choice function be incorporated into this problem somehow? Any advice will be greatly appreciated!
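For reference, here is a minimal sketch of what the second step might look like. It assumes that "duplicates" means elements matching the same entry of list2. The helper name pick_one_per_key and its rng parameter are hypothetical, introduced only for illustration. It groups the candidates for each substring and calls random.choice on each non-empty group:

```python
import random

list1 = ['abc-21-6/7', 'abc-56-9/10', 'def-89-7/3', 'hij-2-4/9', 'hij-75-1/7']
list2 = ['abc', 'hij']

def pick_one_per_key(items, keys, rng=random):
    """Return one randomly chosen item for each key that has a match.

    Hypothetical helper: `rng` is any object with a `choice` method,
    by default the standard `random` module.
    """
    result = []
    for key in keys:
        # All items containing this key as a substring
        candidates = [item for item in items if key in item]
        # random.choice raises IndexError on an empty sequence, so skip empty groups
        if candidates:
            result.append(rng.choice(candidates))
    return result

subset = pick_one_per_key(list1, list2)
print(subset)  # one 'abc-...' element followed by one 'hij-...' element
```

Note that under this approach an element containing more than one list2 substring could be picked once per substring, so it may need extra handling if such overlaps can occur in the real data.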