8

I'm writing an application which divides a population of users into pairs for the purpose of performing a task together. Each user can specify various preferences about their partner, e.g.

  • gender
  • language
  • age
  • location (typically, within X miles/kilometers from where the user lives)

Ideally, I would like the user to be able to specify whether each of these preferences is a "nice to have" or a "must have", e.g. "I would prefer to be matched with a native English speaker, but I must not be matched with a female".

My objective is to maximise the overall average quality of the matches. For example, assume there are 4 users in the system, A, B, C, D. These users can be matched in 3 ways:

Option 1     Match Score
A-B           5
C-D           4
---
Average       4.5

Option 2     Match Score
A-C           2
B-D           3
---
Average       2.5

Option 3     Match Score
A-D           1
B-C           9
---
Average       5

So in this contrived example, the 3rd option would be chosen because it has the highest overall match quality, even though A and D are not very well matched at all.

Is there an algorithm that can help me to:

  • calculate the "match scores" shown above
  • choose the pairings that will maximise the average match score (while respecting each user's absolute constraints)

It is not absolutely necessary that each user is matched, so given a choice between significantly lowering the overall quality of the matches, and leaving a few users without a match, I would choose the latter.

Obviously, I would like the algorithm that calculates the matches to complete as quickly as possible, because the number of users in the system could be quite large.

Finally, this system of computing match scores and maximising the overall average is just a heurisitic I've come up with myself. If there's a much better way to calculate the pairings, please let me know.

Update

The problem I've described seems to be a similar to the stable marriage problem for which there is a well-known solution. However, in this problem I do not require the chosen pairs to be stable. My goal is to choose the pairs so that the average "match score" is maximized

Nubok
  • 3,502
  • 7
  • 27
  • 47
Dónal
  • 185,044
  • 174
  • 569
  • 824
  • Could you define "quite large" is it hundreds, thousands, millions? – Ivan Jan 31 '11 at 11:16
  • @Ivan ideally, I would like the system to be able to perform the matching algorithm for a few million users in a couple of hours. – Dónal Jan 31 '11 at 12:01
  • I'm approaching a similar problem (with a difference where my "men" and "women" must be paired with the opposite gender, and there isn't a balanced population) - I see that you've not accepted a solution - which approach did you go down? – Rowland Shaw Jan 22 '16 at 15:44

5 Answers5

2

What maximum match algorithms have you been looking at? I read your question too hastily at first: it seems you don't necessarily restrict yourself to a bipartite graph. This seems trickier.

Pontus Gagge
  • 17,166
  • 1
  • 38
  • 51
  • In particular, a min-cost max flow problem might be useful here, where the cost of a match could be inversely proportional to its goodness. – templatetypedef Jan 31 '11 at 11:02
1

To find a maximum matching in an arbitrary graph there is a weighted variant of Edmond's matching algorithm:

http://en.wikipedia.org/wiki/Edmonds's_matching_algorithm#Weighted_matching

See the footnotes there.

Nikita Rybak
  • 67,365
  • 22
  • 157
  • 181
Nubok
  • 3,502
  • 7
  • 27
  • 47
1

I believe this problem could be represented as a linear programming problem. And then you can use Simplex method to solve it.

Ivan
  • 3,567
  • 17
  • 25
  • This problem can only be characterized as a linear programming problem if the "Match Score" is a linear function of each user preference. – Dónal Jan 31 '11 at 11:58
0

I provided a possible solution to a similar problem here. It's an algorithm for measuring dissimilarity--the more similar measured data is to expected data, the smaller your resulting number will be.

For your application you would set a person's preferences as the expected data and each other person you compare against would be the measured data. You would want to filter the 'measured data' to eliminate those cases like "must not be matched with a female", that you mention in your original question, before running the comparison.

Another option could be using a Chi-Square algorithm.

Community
  • 1
  • 1
oosterwal
  • 1,479
  • 8
  • 16
0

By the looks of it your problem is not bipartite, therefore it would seem to me that you are looking for a maximum weight matching in a general graph. I don't envy the task of writing this as Edmond's blossum shrinking algorithm is not easy to understand or implement efficiently. There are implementations of this algorithm out there, one such example being the C++ library LEMON (http://lemon.cs.elte.hu/trac/lemon). If you want a maximum cardinality maximum weight matching you will have to use the maximum weight matching algorithm and add a large weight (sum of all the weights) to each edge to force maximum cardinality as the first priority.

Alternatively as you mentioned in one of the comments above that your match terms are not linear and so linear programming is out, you could always take a constraint programming approach which does not require that the terms be linear.

equivalence
  • 21
  • 1
  • 3