4

First of all, this is not a homework or something like that, this is a suggestive problem from my last question Candidate in finding majority element in an array.

There are n kinds of objects O1, O2, ..., On, and there is an array F[1...n], F[i] is the number of Oi(i.e. there are F[i] Ois, array F[1...n] is given), and every F[i] > 0 .

Now use the following rule to pair objects:

if i != j, Oi can be paired with Oj,
else if i == j, Oi can not be paired with Oj.

i.e., only two different kinds of objects can be paired with each other.

  1. Whether or not a valid pairing method may exist for the input array F[1...n]? Give out an algorithm with best time complexity to tell true or false and proves its correctness.

  2. If there exist, output one valid paring sequence. Give out your algorithm and time/space complexity analysis.

For example, input F[] = {10, 2, 4, 4};

Then there exists at least one valid pairing method:

2 O1s and 2 O2s paired, 4 O1s and 4 O3s paired, 4 O1s and 4 O4s paired.

One valid pairing sequence is:

(1,2) (1,2) (1,3) (1,3) (1,3) (1,3) (1,4) (1,4) (1,4) (1,4)

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
imsrch
  • 1,152
  • 2
  • 11
  • 24

2 Answers2

5

Checking if a solution exists in O(n)

Let s be the sum of F.

  • If s is odd there is no solution (intuitive)
  • If there exists an i such that F[i] > s/2 there is no solution (intuitive)
  • Otherwise, a solution exists (proof by construction follows)

Finding a solution in O(n)

# Find s
s = 0
for i in 1..n:
    s += F[i]

# Find m such that F[m] is maximal
m = 1
for i in 1..n:
    if F[i] > F[m]:
         m = i

if s % 2 != 0 or F[m] > s/2:
    fail

a = 1
b = 1

# Pair off arbitrary objects (except those of type m) until F[m] = s/2    
while s/2 > F[m]:
    # Find a type with a non-zero frequency
    until F[a] > 0 and a != m:
        a = a + 1
    # Find another type with a non-zero frequency
    until F[b] > 0 and b != m and b != a:
        b = b + 1

    count = min(F[a], F[b], s/2 - F[m])
    pair(a, b, count)

# Pair off objects of type m with objects of different types
while F[m] > 0:
    # Find a type with a non-zero frequency
    until F[a] > 0 and a != m:
        a = a + 1
    pair(a, m, F[a])

end of algorithm

def pair(a, b, count):
    # Pairs off 'count' pairs of types a and b
    F[a] = F[a] - count
    F[b] = F[b] - count
    s = s - (2 * count)
    output "Pairing off $a and $b ($count times)"

The two while loops are both linear. The first while loop increments either a or b by at least one at each iteration, because after matching up count pairs either F[a] is zero, or F[b] is zero, or s/2 = F[m] and the loop terminates. a and b can be incremented at most n times each before all the elements have been visited. The second while loop is also linear since it increments a at by at least one during each iteration.

The key invariants are
(1) F[m] is the largest element of F
(2) F[m] <= s/2
I think both are fairly obvious upon inspection.

With the inner loop, as long as s/2 > F[m] there must be at least two other object types with non-zero frequencies. If there was only one, say a, then
F[a] + F[m] = s
F[a] = s - F[m] > s - s/2 (from the loop condition)
F[a] > s/2
F[a] > F[m]
which is a contradiction of invariant (1).

Since there are at least two types with non-zero frequencies (besides m) the loop will be able to find types a and b and pair off objects until s/2 = F[m].

The second loop is trivial. Since exactly half the objects are of type m, each object of type m can be paired with an object of a different type.

tom
  • 21,844
  • 6
  • 43
  • 36
  • Here's an example where your algorithm fails: Items p,q,r,s have frequencies 7,6,3,2 listed in that order in F. The “Pair off arbitrary objects” first pairs q & r with count=3, reducing F[1], F[2] to 3, 0. Then s=12 so the “Pair off objects of type m” loop operates and fails. – James Waldby - jwpat7 Apr 19 '13 at 15:35
  • @jwpat7 it will pair off q & r only 2 times, since `min(F[a], F[b], s/2 - F[m]) = min(6, 3, 18/2 - 7) = 2` – tom Apr 19 '13 at 20:57
  • Ok - I overlooked s/2 - F[m] term – James Waldby - jwpat7 Apr 19 '13 at 22:39
  • @Tom: thanks for providing a counterexample eliciting a crucial flaw in the reasoning for my max-flow algorithm (which, in a nutshell, was lacking a sufficient number of constraints to properly limit the occurrences of individual objects in pairings). it seems that this deficiency cannot be remedied (easily). therefore i have retracted my answer. sorry to everybody who has been led onto the garden path. – collapsar Apr 21 '13 at 12:45
1

Here's one suggestion. I'm not sure if it succeeds for every possible case, though, or if it's the most efficient possible algorithm.

Let n be the total number of indices. Construct a highest-value priority queue of the numbers of object types, where each object type is its index i. In other words, make a priority queue where the sorting values in the queue are the values of F. Associate each node with the list of all objects of that type. This will take O(n log(n)) time.

For each object type, starting with the type that has the most duplicates and proceeding to the type with the fewest, pair one object from that "class" of objects with one object for each of the other classes that still have objects remaining in them, and remove that object from that node in the queue. Since every queue item except the top one will have one fewer object in it, most of the queue will still be in priority-queue order; the top node, however, will have n-1 fewer items (or it will be empty), so heapify down to maintain the queue. Also, remove nodes with no objects left. Repeat this process with the new highest queue value until all nodes are paired. This will take O(n log(n) + k) time, where k is the total number of items. Assuming that k is significantly larger than n, the total time complexity is O(k).

Again, I'm not quite sure that this will always find a solution if one is possible. My intuition is that if you re-heapify (if necessary) after every pairing, you'll ensure that if a full pairing is possible it will be found, but (1) this would be much less efficient, and (2) I'm not sure what cases it would succeed at that the original algorithm wouldn't, and (3) I'm not entirely certain even that would work every time.

As for what values of F have no solution, obviously if there exists one class of objects that has more elements than all the other classes combined, no pairing is possible. Beyond that...I'm not really sure. It would be interesting to investigate whether my "improved" algorithm correctly evaluates every case or not.

Kyle Strand
  • 15,941
  • 8
  • 72
  • 167