Can all objects be paired in a sequence?

Question

First of all, this is not a homework or something like that, this is a suggestive problem from my last question Candidate in finding majority element in an array.

There are n kinds of objects O₁, O₂, ..., O_n, and there is an array F[1...n], F[i] is the number of Oi(i.e. there are F[i] O_is, array F[1...n] is given), and every F[i] > 0 .

Now use the following rule to pair objects:

if i != j, O_i can be paired with O_j,
else if i == j, O_i can not be paired with O_j.

i.e., only two different kinds of objects can be paired with each other.

Whether or not a valid pairing method may exist for the input array F[1...n]? Give out an algorithm with best time complexity to tell true or false and proves its correctness.
If there exist, output one valid paring sequence. Give out your algorithm and time/space complexity analysis.

For example, input F[] = {10, 2, 4, 4};

Then there exists at least one valid pairing method:

2 O₁s and 2 O₂s paired, 4 O₁s and 4 O₃s paired, 4 O₁s and 4 O₄s paired.

One valid pairing sequence is:

(1,2) (1,2) (1,3) (1,3) (1,3) (1,3) (1,4) (1,4) (1,4) (1,4)

@KyleStrand You don not need to calculate the array, the array is given. Array `F[]` is the input of your algorithm. — imsrch, Apr 19 '13 at 05:41
I meant "what does the array have to do with the question," but your first edit clarified it. — Kyle Strand, Apr 19 '13 at 05:42
@KyleStrand I have re-edited the problem and given an example. — imsrch, Apr 19 '13 at 05:54
This miiiiiight be a question better suited for theoretical comp sci: http://cstheory.stackexchange.com/ — Kyle Strand, Apr 19 '13 at 06:00

tom · Accepted Answer · 2013-04-21T01:40:34.037

Checking if a solution exists in O(n)

Let s be the sum of F.

If s is odd there is no solution (intuitive)
If there exists an i such that F[i] > s/2 there is no solution (intuitive)
Otherwise, a solution exists (proof by construction follows)

Finding a solution in O(n)

# Find s
s = 0
for i in 1..n:
    s += F[i]

# Find m such that F[m] is maximal
m = 1
for i in 1..n:
    if F[i] > F[m]:
         m = i

if s % 2 != 0 or F[m] > s/2:
    fail

a = 1
b = 1

# Pair off arbitrary objects (except those of type m) until F[m] = s/2    
while s/2 > F[m]:
    # Find a type with a non-zero frequency
    until F[a] > 0 and a != m:
        a = a + 1
    # Find another type with a non-zero frequency
    until F[b] > 0 and b != m and b != a:
        b = b + 1

    count = min(F[a], F[b], s/2 - F[m])
    pair(a, b, count)

# Pair off objects of type m with objects of different types
while F[m] > 0:
    # Find a type with a non-zero frequency
    until F[a] > 0 and a != m:
        a = a + 1
    pair(a, m, F[a])

end of algorithm

def pair(a, b, count):
    # Pairs off 'count' pairs of types a and b
    F[a] = F[a] - count
    F[b] = F[b] - count
    s = s - (2 * count)
    output "Pairing off $a and $b ($count times)"

The two while loops are both linear. The first while loop increments either a or b by at least one at each iteration, because after matching up count pairs either F[a] is zero, or F[b] is zero, or s/2 = F[m] and the loop terminates. a and b can be incremented at most n times each before all the elements have been visited. The second while loop is also linear since it increments a at by at least one during each iteration.

The key invariants are
(1) F[m] is the largest element of F
(2) F[m] <= s/2
I think both are fairly obvious upon inspection.

With the inner loop, as long as s/2 > F[m] there must be at least two other object types with non-zero frequencies. If there was only one, say a, then
F[a] + F[m] = s
F[a] = s - F[m] > s - s/2 (from the loop condition)
F[a] > s/2
F[a] > F[m]
which is a contradiction of invariant (1).

Since there are at least two types with non-zero frequencies (besides m) the loop will be able to find types a and b and pair off objects until s/2 = F[m].

The second loop is trivial. Since exactly half the objects are of type m, each object of type m can be paired with an object of a different type.

Here's an example where your algorithm fails: Items p,q,r,s have frequencies 7,6,3,2 listed in that order in F. The “Pair off arbitrary objects” first pairs q & r with count=3, reducing F[1], F[2] to 3, 0. Then s=12 so the “Pair off objects of type m” loop operates and fails. — James Waldby - jwpat7, Apr 19 '13 at 15:35
@jwpat7 it will pair off q & r only 2 times, since `min(F[a], F[b], s/2 - F[m]) = min(6, 3, 18/2 - 7) = 2` — tom, Apr 19 '13 at 20:57
@Tom: thanks for providing a counterexample eliciting a crucial flaw in the reasoning for my max-flow algorithm (which, in a nutshell, was lacking a sufficient number of constraints to properly limit the occurrences of individual objects in pairings). it seems that this deficiency cannot be remedied (easily). therefore i have retracted my answer. sorry to everybody who has been led onto the garden path. — collapsar, Apr 21 '13 at 12:45

score 1 · Answer 2 · answered Apr 19 '13 at 05:58

Here's one suggestion. I'm not sure if it succeeds for every possible case, though, or if it's the most efficient possible algorithm.

Let n be the total number of indices. Construct a highest-value priority queue of the numbers of object types, where each object type is its index i. In other words, make a priority queue where the sorting values in the queue are the values of F. Associate each node with the list of all objects of that type. This will take O(n log(n)) time.

For each object type, starting with the type that has the most duplicates and proceeding to the type with the fewest, pair one object from that "class" of objects with one object for each of the other classes that still have objects remaining in them, and remove that object from that node in the queue. Since every queue item except the top one will have one fewer object in it, most of the queue will still be in priority-queue order; the top node, however, will have n-1 fewer items (or it will be empty), so heapify down to maintain the queue. Also, remove nodes with no objects left. Repeat this process with the new highest queue value until all nodes are paired. This will take O(n log(n) + k) time, where k is the total number of items. Assuming that k is significantly larger than n, the total time complexity is O(k).

Again, I'm not quite sure that this will always find a solution if one is possible. My intuition is that if you re-heapify (if necessary) after every pairing, you'll ensure that if a full pairing is possible it will be found, but (1) this would be much less efficient, and (2) I'm not sure what cases it would succeed at that the original algorithm wouldn't, and (3) I'm not entirely certain even that would work every time.

As for what values of F have no solution, obviously if there exists one class of objects that has more elements than all the other classes combined, no pairing is possible. Beyond that...I'm not really sure. It would be interesting to investigate whether my "improved" algorithm correctly evaluates every case or not.

Can all objects be paired in a sequence?

2 Answers2

Checking if a solution exists in O(n)

Finding a solution in O(n)