Part 1: getting C_list
You will have to create the nested lists yourself and append them to C_list.
Since an item from A_list can be either a string or a list of strings, you have 2 cases.
def get_A_in_B(a_list:"list[str|list[str]]",b_list:"list[str]"):
    c_list = [] # global within this function
    # for neatness
    def process_base_item(a_str:"str",out_list:"list"):
        matches = sorted([b_str for b_str in b_list if b_str.startswith(a_str)])
        out_list.extend(matches)
    for a_item in a_list:
        if type(a_item) is list: # case 1 - is list, extend nested
            sublist = a_item
            nested_list = []
            for sub_item in sublist:
                process_base_item(sub_item,nested_list)
            if nested_list:
                c_list.append(nested_list)
        else: # case 2 - is string, extend c list
            process_base_item(a_item,c_list)
    return c_list
usage:
A_list = ['A', 'B', ['C', 'D'] ]
B_list = ['A1', 'W5', 'X6', 'A2', 'A3', 'T5', 'B0', 'Z9', 'C1', 'W3', 'D1']
C_list = get_A_in_B(A_list,B_list)
output:
['A1', 'A2', 'A3', 'B0', ['C1', 'D1']]
Part 2: formatting
this will work if 2 assumptions are upheld:
- there is only one of each letter in a format string
- if a nested list is uneven, you want to cycle through all the combinations
e.g. ["C1", "C2", "D1"] => "C1"+"D1", "C2"+"D1"
This was the really tricky part. I matched each item's first letter to the format string with a substring check (" in " followed by the letter).
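A minimal sketch of that matching step on its own. The function below does it with a plain substring test; `matches_letter` is a hypothetical helper name used only for this illustration.

```python
def matches_letter(c_str, fmt_str):
    # Hypothetical helper: True when the format string contains
    # " in " followed by the first letter of c_str.
    return f" in {c_str[0]}" in fmt_str

string_list = ["{0} in Alpha", "{0} in Apple", "{0} in Bee"]

# Keep only the format strings whose word starts with the item's letter
a_matches = [s for s in string_list if matches_letter("A1", s)]
b_matches = [s for s in string_list if matches_letter("B0", s)]
print(a_matches)  # → ['{0} in Alpha', '{0} in Apple']
print(b_matches)  # → ['{0} in Bee']
```

Note that this depends on the " in X" wording of the format strings, so differently worded format strings would need a different check.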
For C_list's nested lists, I split each one into sublists by first letter, then took the cartesian product of those sublists and passed each combination as the arguments to the format string.
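Here is the split-by-letter and cartesian product step in isolation, as a rough sketch run on the uneven example ["C1", "C2", "D1"]. The names `groups`, `combos` and `results` are just for illustration and are not used in the function itself.

```python
import itertools

c_sublist = ["C1", "C2", "D1"]
fmt_str = "{0} in Cheese and {1} in Dice"

# Group the strings by first letter; sorting first keeps the letters in order,
# so group 0 lines up with {0}, group 1 with {1}, and so on
groups = {}
for c_str in sorted(c_sublist):
    groups.setdefault(c_str[0], []).append(c_str)
# groups is now {'C': ['C1', 'C2'], 'D': ['D1']}

# One combination per way of picking a single string from each group
combos = list(itertools.product(*groups.values()))
results = [fmt_str.format(*combo) for combo in combos]
print(results)
# → ['C1 in Cheese and D1 in Dice', 'C2 in Cheese and D1 in Dice']
```

Because the positional arguments follow sorted letter order, this only lines up when the placeholders in the format string appear in that same order.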
and same as before, you have 2 cases.
import itertools

def format_string_list(c_list,string_list):
    formatted_string_list = []
    for c_item in c_list:
        for fmt_str in string_list:
            if type(c_item) is list: # case 1 - is list, match multiple
                c_sublist = c_item
                # assumption 1: letters are unique
                first_letters = sorted(set([c_str[0] for c_str in c_sublist]))
                matched_letters = []
                for letter in first_letters:
                    pat = f" in {letter}"
                    if pat in fmt_str:
                        matched_letters.append(letter)
                # only format if every letter in the sublist appears in the format string
                if first_letters==matched_letters:
                    # get dictionary of lists, indexed by first letter
                    c_str_d = {}
                    for letter in first_letters:
                        c_str_d[letter] = [c_str for c_str in c_sublist if c_str[0]==letter]
                    # assumption 2: get all combinations
                    for c_str_list in itertools.product(*c_str_d.values()):
                        c_fmtted = fmt_str.format(*c_str_list)
                        formatted_string_list.append(c_fmtted)
            else: # case 2 - is string, match single
                c_str = c_item
                first_letter = c_str[0]
                pat = f" in {first_letter}"
                if pat in fmt_str:
                    c_fmtted = fmt_str.format(c_str)
                    formatted_string_list.append(c_fmtted)
    return formatted_string_list
usage:
C_list = ['A1', 'A2', 'A3', 'B0', ['C1', 'D1'] ]
string_list = ["{0} in Alpha", "{0} in Apple", "{0} in Bee", "{0} in Cheese and {1} in Dice"]
formatted_string_list = format_string_list(C_list,string_list)
# print output
print("\n".join(formatted_string_list))
output:
A1 in Alpha
A1 in Apple
A2 in Alpha
A2 in Apple
A3 in Alpha
A3 in Apple
B0 in Bee
C1 in Cheese and D1 in Dice
This works on more complex cases too.
It doesn't go beyond one level of nesting, but I don't think you need that for your case.
A_list = ['A', 'B', ['C', 'D', 'E']]
B_list = ['A1', 'W5', 'X6', 'D2', 'E1', 'A2', 'A3', 'T5', 'E2', 'B0', 'Z9', 'C1', 'W3', 'D1']
string_list = ["{0} in Alpha", "{0} in Apple", "{0} in Bee", "{0} in Cheese and {1} in Dice {2} in Egg"]
output:
['A1', 'A2', 'A3', 'B0', ['C1', 'D1', 'D2', 'E1', 'E2']]
A1 in Alpha
A1 in Apple
A2 in Alpha
A2 in Apple
A3 in Alpha
A3 in Apple
B0 in Bee
C1 in Cheese and D1 in Dice E1 in Egg
C1 in Cheese and D1 in Dice E2 in Egg
C1 in Cheese and D2 in Dice E1 in Egg
C1 in Cheese and D2 in Dice E2 in Egg