1

I am looking for an algorithm that reduces a list of tuples cyclical by reconstructing a given set as pattern.

Each tuple contains an id and a set, like (1, {'xy'}).

Example

query = {'xyz'}

my_dict = [(1, {'x'}), (2, {'yx'}), (3, {'yz'}),
           (4, {'z'}), (5, {'x'}), (6, {'y'}), 
           (7, {'xyz'}), (8, {'xy'}), (9, {'x'}),]

The goal is to recreate the pattern xyz as often as possible given the second value of the tuples in my_dict. Remaining elements from which the query set can not be completely reconstructed shall be cut off, hence 'reduce'.

my_dict contains in total: 6 times x, 5 times y, 3 times z.

Considering the my_dict, valid solutions would be for example:

result_1 = [(7, {'xyz'}), (8, {'xy'}), (4, {'z'}), (1, {'x'}), (3, {'yz'})]
result_2 = [(7, {'xyz'}), (2, {'yx'}), (4, {'z'}), (1, {'x'}), (3, {'yz'})]
result_3 = [(7, {'xyz'}), (9, {'x'}), (6, {'y'}), (4, {'z'}), (1, {'x'}), (3, {'yz'})]

The order of the tuples in the list is NOT important, i sorted them in the order of the query pattern xyz for the purpose of illustration.

Goal

The goal is to have a list of tuples where the total number of occurrences of the elements from the query set is most optimal evenly distributed.

result_1, result_2 and result_3 all contain in total: 3 times x, 3 times y, 3 times z.

Does anyone know a way/ approach how to do this?

Thanks for your help!

1 Answers1

1

Depending on your application context, a naive brute-force approach might be enough: using the powerset function from this SO answer,

def find_solutions(query, supply):
    for subset in powerset(supply):   
        if is_solution(query, subset):
            yield subset

You would need to implement a function is_solution(query, subset) that returns True when the given subset of the supply (my_dict.values()) is a valid solution for the given query.

Florian Brucker
  • 9,621
  • 3
  • 48
  • 81