4

I like programming challenges and trying to solve them, however, this one has me a bit stumped and I could really use a point in the right direction.

There is a set of numbers, 1 to N that represent females and N+1 to 2N that represent male. (So if N = 5 then that means 1,2,3,4,5 = female and 6,7,8,9,10 = male.) I am trying to find the maximum amount of matches possible. Here's the catch though, the female numbers are picky and will only pair with numbers on their line. So it looks like this:

19
1: 25
2: 20 25 28
3: 27 32 37
4: 22
5: 32 38
6: 32 34 35
7: 22 34 37
8: 30 35 38
9: 20 23
10: 24 29
11: 29 32
12: 23 26 31
13: 21 25 34
14: 21 27
15: 20
16: 23 31 38
17: 22 27 28
18: 35
19: 24 25

So where "#: " is the girl, you'd say that girl #1 will ONLY pair with guy #25. Likewise, girl #2 will ONLY pair with guys #20,25, or 28.

My solution to this was to make each line into an array and then also make an empty array of values already used. I started with girls who would only pair with one person and gave them the pair. Then I went onto girls who would only pair with someone out of a group of 2. Then finally 3.

This worked decently, however, this meant some people didn't get paired when they could have been, thus my maximum amount of pairs possible wasn't reached due to the first-come-first-serve nature of it.

Using the example above, I got 16 out of 19 (where the maximum possible is 17.) with the 3 non-paired girls being #13, 17, and 19.

Here's is my current code:

def findPairs(women):
    pairs = []
    taken = []
    for woman in range(women):
        data = raw_input()

        #Remove the "#: " in each line and format numbers into ['1','2','3']
        data = (data[len(str(woman))+2:]).split()
        pairs.append(data)

    for entry in pairs:
        if len(entry) == 1:
            if entry[0] not in taken:
                taken.append(entry[0])
            else:
                print("No match for " + str(entry))

    for entry in pairs:
        if len(entry) == 2:
            if entry[0] not in taken:
                taken.append(entry[0])
            elif entry[1] not in taken:
                taken.append(entry[1])
            else:
                print("No match for " + str(entry))

    for entry in pairs:
        if len(entry) == 3:
            if entry[0] not in taken:
                taken.append(entry[0])
            elif entry[1] not in taken:
                taken.append(entry[1])
            elif entry[2] not in taken:
                taken.append(entry[2])
            else:
                print("No match for " + str(entry))

    print(len(taken))
findPairs(input())

Does anyone have any idea how I can better optimize my pair matching so that I can reach maximum pairs? (Preferably a way that can expand as well, finding a hack-ish way to find the answer for these 19 isn't too rewarding if it doesn't work for a different set of say 50 girls. It's fine to assume however the format will stay the same (ie: 3 guys max per line, lesser to greater order, etc..) although it would be cool to be able to make it as expandable as possible, that's besides the point and beyond the reach of this question.)

thefourtheye
  • 233,700
  • 52
  • 457
  • 497
Vale
  • 1,003
  • 1
  • 9
  • 22
  • 1
    Hint: try flipping the question around and looking at guy+girl pairs; but as you are not facing a problem but rather need optimization hints - this is better suited for codereview.stackexchange.com – Burhan Khalid Feb 22 '15 at 04:33
  • 2
    @BurhanKhalid: This code doesn't work. It doesn't find a maximum matching. – user2357112 Feb 22 '15 at 04:35
  • 1
    Check out: http://www-math.mit.edu/~goemans/18433S09/matching-notes.pdf – Nir Alfasi Feb 22 '15 at 04:36
  • The code works, it just needs optimization. Optimization questions are best asked on codereview.stackexchange.com - because the question boils down to - "I have working code, I need to make it better". – Burhan Khalid Feb 22 '15 at 04:36
  • 2
    @BurhanKhalid: No, look again. The OP is just using the word "optimize" weirdly. – user2357112 Feb 22 '15 at 04:38
  • 2
    @BurhanKhalid user235... is right. The OP's "algorithm" doesn't work - it doesn't find maximum matching: "I can better optimize my pair matching **so that I can reach maximum pairs**?" - indeed, that's an invalid use of the word "optimize"... :) – Nir Alfasi Feb 22 '15 at 04:41
  • 2
    See Wikipedia's article on [matchings](http://en.wikipedia.org/wiki/Matching_%28graph_theory%29), particularly the section on [finding maximal matchings in unweighted bipartite graphs](http://en.wikipedia.org/wiki/Matching_%28graph_theory%29#In_unweighted_bipartite_graphs). – user2357112 Feb 22 '15 at 04:56
  • Ignore that, i just counted again. – Mark Nicolle Feb 22 '15 at 05:12
  • @MarkNicolle I can assure you, there is only 17. Can you post your solution though? I'd love to see a new approach to the problem. – Vale Feb 22 '15 at 05:14
  • How did you come up with 18, there are only 19 females and 19 males but the males numbered 33 and 36 are not in the list of possible male matches, so maximum is 17. – AChampion Feb 22 '15 at 05:15
  • 1
    Yeah i was thinking there was 20 ha ha i'm working on something now for you – Mark Nicolle Feb 22 '15 at 05:16
  • [This question](http://stackoverflow.com/questions/4697228/hopcroft-karp-algorithm-in-python) implements the Hopcroft-Karp algorithm in Python. It uses `networkx`, but it looks like it should be fairly easy to modify it to just use standard modules and your own custom implementation for the bipartite graph. – PM 2Ring Feb 22 '15 at 06:17

2 Answers2

3

The following is an unoptimized implementation of finding a maximum-matching in a bipartite graph that iterates all the unmatched women and tries to change the current matching by pairing each of them with one of her candidates as follows:

  • if there's a "free" candidate - add the match to the graph
  • if all the candidates are already paired, "un-pair" one of them and match him to her, then start an iterative process of going to the woman that this man was already paired with, marking her as the "unmatched" and call recursively.

I named this process "relax" because it somewhat reminds the recursive relaxation step in Dijkstra's algorithm.

Here's the code:


import random

def read_file():
    res = {}
    start = True
    with open('pairs.txt', 'r') as f:
        for line in f.readlines():
            if start:
                start = False
                continue
            woman, matches = line.strip().split(': ')
            woman = int(woman)
            matches = map(int, matches.split(' '))
            res[woman] = matches
    return res


def build_random_match(graph):
    edges = {}
    for woman in graph:
        for man in graph[woman]:
            if already_in_edges(man, edges):
                continue
            else:
                edges[woman] = man
                break
    return edges


def already_in_edges(man, edges):
    for woman in edges:
        if edges[woman] == man:
            return True
    else:
        return False


def get_unmatched_women(match, graph):
    return  [woman for woman in graph.keys() if woman not in match.keys()]


def not_in_match(man, match):
    for woman in match:
        if match[woman] == man:
            return False
    else:
        return True


def find_unmatched_man(graph, match, woman):    
    potentials = graph[woman]
    for man in potentials:
        if not_in_match(man, match):
            return man
    else:
        return False


def remove_man_from_match(man, unmatched_woman, match, graph):  
    # find the woman that this man is currently matched with
    # and cancel this matching
    for woman in match:
        if match[woman] == man:
            match_to_del = woman
            break   
    del match[match_to_del]
    # also remove the man from the orig woman (graph) 
    # to prevent infinite loop
    men = graph[unmatched_woman]
    men.remove(man)
    graph[unmatched_woman] = men

    return match_to_del


def relax(unmatched_woman, match, graph):   
    unmatched_man = find_unmatched_man(graph, match, unmatched_woman)
    if unmatched_man:
        match[unmatched_woman] = unmatched_man      
    elif len(graph[unmatched_woman]) == 0:
        return match
    else:
        # grab one of the possible matchings randomly
        rand_index = random.randint(0, len(graph[unmatched_woman])-1)
        man = graph[unmatched_woman][rand_index]
        new_unmatched_woman = remove_man_from_match(man, unmatched_woman, match, graph)
        match[unmatched_woman] = man
        match = relax(new_unmatched_woman, match, graph)

    return match


def improve_match(match, graph):
    if len(match) == len(graph):
        return match

    unmatched_women = get_unmatched_women(match, graph) 
    for woman in unmatched_women:
        copy_graph = graph.copy()
        suggested = relax(woman, match, copy_graph)
        if len(suggested) > len(match):
            return suggested
        else:
            suggested = match
    else:
        return suggested


def main():
    graph = read_file()
    match = build_random_match(graph)   
    if len(match) == len(graph):
        print 'Got a perfect match:', match
    else:
        match_size = 0
        while match_size < len(match):
            match_size = len(match)
            match = improve_match(match, graph)

    return match

if __name__ == '__main__':
    res = main()    
    print "Size of match:", len(res)
    print "Match:", res

OUTPUT:

Size of match: 17
Match: {2: 28, 3: 32, 4: 22, 5: 38, 6: 34, 7: 37, 8: 30, 9: 23, 10: 24, 11: 29, 12: 26, 13: 21, 15: 20, 16: 31, 17: 27, 18: 35, 19: 25}
Nir Alfasi
  • 53,191
  • 11
  • 86
  • 129
0

I'm not a Python developer but this is the way i work it out.

Bring in the data as a list of lists. The use the number of the girl as the index. ie girl 1 is index 0, within that index store the list of guys. Create a giant list of lists for the data.

Iterate through the list of list to find the single matchups where the girl is matched up with only one guy. Store the single matchups (the guy) in a list to iterate over. Then Iterate through the list of list removing the single matchup guys from the list of list. Then start again. What you're aiming for is reducing the list to a single element and keeping track of where you are in the list..

Mark Nicolle
  • 307
  • 1
  • 11