matching algorithm

Question

I'm writing an application which divides a population of users into pairs for the purpose of performing a task together. Each user can specify various preferences about their partner, e.g.

gender
language
age
location (typically, within X miles/kilometers from where the user lives)

Ideally, I would like the user to be able to specify whether each of these preferences is a "nice to have" or a "must have", e.g. "I would prefer to be matched with a native English speaker, but I must not be matched with a female".

My objective is to maximise the overall average quality of the matches. For example, assume there are 4 users in the system, A, B, C, D. These users can be matched in 3 ways:

Option 1     Match Score
A-B           5
C-D           4
---
Average       4.5

Option 2     Match Score
A-C           2
B-D           3
---
Average       2.5

Option 3     Match Score
A-D           1
B-C           9
---
Average       5

So in this contrived example, the 3rd option would be chosen because it has the highest overall match quality, even though A and D are not very well matched at all.

Is there an algorithm that can help me to:

calculate the "match scores" shown above
choose the pairings that will maximise the average match score (while respecting each user's absolute constraints)

It is not absolutely necessary that each user is matched, so given a choice between significantly lowering the overall quality of the matches, and leaving a few users without a match, I would choose the latter.

Obviously, I would like the algorithm that calculates the matches to complete as quickly as possible, because the number of users in the system could be quite large.

Finally, this system of computing match scores and maximising the overall average is just a heurisitic I've come up with myself. If there's a much better way to calculate the pairings, please let me know.

Update

The problem I've described seems to be a similar to the stable marriage problem for which there is a well-known solution. However, in this problem I do not require the chosen pairs to be stable. My goal is to choose the pairs so that the average "match score" is maximized

Could you define "quite large" is it hundreds, thousands, millions? — Ivan, Jan 31 '11 at 11:16
@Ivan ideally, I would like the system to be able to perform the matching algorithm for a few million users in a couple of hours. — Dónal, Jan 31 '11 at 12:01
I'm approaching a similar problem (with a difference where my "men" and "women" must be paired with the opposite gender, and there isn't a balanced population) - I see that you've not accepted a solution - which approach did you go down? — Rowland Shaw, Jan 22 '16 at 15:44

Pontus Gagge · Answer 1 · 2011-01-31T10:59:39.110

2

What maximum match algorithms have you been looking at? I read your question too hastily at first: it seems you don't necessarily restrict yourself to a bipartite graph. This seems trickier.

edited Jan 31 '11 at 10:59

answered Jan 31 '11 at 10:52

Pontus Gagge

17,166
1
38
51

In particular, a min-cost max flow problem might be useful here, where the cost of a match could be inversely proportional to its goodness. – templatetypedef Jan 31 '11 at 11:02

score 1 · Answer 2 · edited Jan 31 '11 at 12:12

1

To find a maximum matching in an arbitrary graph there is a weighted variant of Edmond's matching algorithm:

http://en.wikipedia.org/wiki/Edmonds's_matching_algorithm#Weighted_matching

See the footnotes there.

edited Jan 31 '11 at 12:12

Nikita Rybak

67,365
22
157
181

answered Jan 31 '11 at 11:12

Nubok

3,502
7
27
47

score 1 · Answer 3 · answered Jan 31 '11 at 11:21

1

I believe this problem could be represented as a linear programming problem. And then you can use Simplex method to solve it.

answered Jan 31 '11 at 11:21

Ivan

3,567
17
25

This problem can only be characterized as a linear programming problem if the "Match Score" is a linear function of each user preference. – Dónal Jan 31 '11 at 11:58

score 0 · Answer 4 · edited May 23 '17 at 11:44

I provided a possible solution to a similar problem here. It's an algorithm for measuring dissimilarity--the more similar measured data is to expected data, the smaller your resulting number will be.

For your application you would set a person's preferences as the expected data and each other person you compare against would be the measured data. You would want to filter the 'measured data' to eliminate those cases like "must not be matched with a female", that you mention in your original question, before running the comparison.

Another option could be using a Chi-Square algorithm.

score 0 · Answer 5 · answered Feb 01 '11 at 19:55

By the looks of it your problem is not bipartite, therefore it would seem to me that you are looking for a maximum weight matching in a general graph. I don't envy the task of writing this as Edmond's blossum shrinking algorithm is not easy to understand or implement efficiently. There are implementations of this algorithm out there, one such example being the C++ library LEMON (http://lemon.cs.elte.hu/trac/lemon). If you want a maximum cardinality maximum weight matching you will have to use the maximum weight matching algorithm and add a large weight (sum of all the weights) to each edge to force maximum cardinality as the first priority.

Alternatively as you mentioned in one of the comments above that your match terms are not linear and so linear programming is out, you could always take a constraint programming approach which does not require that the terms be linear.

matching algorithm

Update

5 Answers5