2

Given an array L containing N elements, from an alphabet of M distinct elements, is there an effective way to calculate the minimum number of swaps of two adjacent elements such that we group all the equal elements together, knowing that for any index i in [1, n], there is less than 9 distinct elements e from M such that there exists j > i and k < i for which L[j] = L[k] = e (so for any position in the array, there is less than 9 groups that have at least an element before, and one after)

For example, given an array L = {1, 2, 3, 1, 2, 3, 2, 2, 1}, we want to arrive at {1,1,1,2,2,2,2,3,3} or {2,2,2,2,3,3,1,1,1} etc...

This problem is also equivalent to finding an optimal order for the elements of M. If we can find such an order, the problem can be solved by simply finding the minimum number of swaps to sort the array given that order, which can be done in O(NlogN) (see here).

rmlb
  • 21
  • 1
  • What have you tried so far, or where does this problem come from? Using the technique from your link gives an `O(M!)` algorithm; presumably that's too slow for your constraints? – kcsquared Mar 04 '22 at 07:46
  • The problem is from a french algorithmic contest https://prologin.org/train/2021/semifinal/trier_une_file with constraints M <= 100 and N<=10000. I couldn't find a more efficient way than the naive O(M!) one – rmlb Mar 04 '22 at 14:51
  • I think a greedy algorithm might work, but I haven't proved it. Basically, for every pair of distinct elements `x, y`, calculate `cost(x,y)` = number of swaps to put all occurrences of `x` before all occurrences of `y`, ignoring the presence of other elements. You also calculate `cost(y, x)`. Then, for any starting ordering of the alphabet `M`, I *think* that you can reach a minimum cost ordering by repeatedly greedily swapping any adjacent pair `x, y` to `y, x` if cost(y,x) < `cost(x,y)`. – kcsquared Mar 04 '22 at 15:12
  • 1
    @kcsquared doesn't that algorithm diverge on, e.g., `(0, 1, 1, 2, 0, 0, 1)`? – David Eisenstat Mar 05 '22 at 15:41
  • @DavidEisenstat Not from what I can tell; for all 6 permutations of the alphabet `(0, 1, 2)`, any locally-good swaps *within these 6 permutations* will give an alphabet-ordering of cost 7. However your example is a case where there are no locally-good swaps *in the original sequence*. To be fair, my description was too short to fully and clearly describe an algorithm. I'll try to post a full answer and description if it works, but as the question has no alternate code or ways of generating valid test input, it will take some time. – kcsquared Mar 05 '22 at 16:01
  • @kcsquared I tried to test this with all possible inputs for n=10 and m=3 and with a solver to check against that tries all 6 possible permutations and I found a counterexample : for L=(0,0,0,2,2,1,0,0,2) and starting at (1,0,2), we have a total cost of 11 permutations, and no adjacent (or non adjacent with 1<->2) swap that gives a better cost. But there is (2,1,0) that has a cost of 10 – rmlb Mar 06 '22 at 16:51
  • @rmlb How are those costs 11 and 10? I get [9 and 15](https://tio.run/##VY3BCsIwEETv@Yq9mUAOWi9S6Mlr/0A8aEgx0GSX7abg18e0RbE7lxmGfUNveWE6X4hLGRgjOBxH7yRgmiBEQha4Yk7iWa19qE4Qx1/rMD5DeqwfFshzzLIlpfpOH@2ipuq0OaOUE4bui9X/AN1baIxRAzIQhLTj6cmL7o1pFdQjDkk0WTg4nKQ9WJhy1BV9my3kOyyIbGFeKLsJWidMKR8) (manually, too). – Pychopath Mar 06 '22 at 18:57
  • My bad! Forgot a zero, L=(0,0,0,2,2,1,0,0,0,2) – rmlb Mar 06 '22 at 19:27
  • Then I agree with the 11, but get for (2,1,0) I get 16 instead of 10. Maybe we use different notations? I get 10 for (0,2,1). I don't see how that relates to your (2,1,0), though. – Pychopath Mar 06 '22 at 20:08
  • Again, my mistake, I printed the wrong variable and didn't check...11 for (1,0,2) and 10 for (0,2,1), and there is no swap that gets to the latter from the first – rmlb Mar 06 '22 at 20:23
  • The greedy strategy as I proposed it is definitely wrong, as that example shows, since there are cases when adjacent swaps don't change the cost, but must be done anyways to reach a minimum permutation. Since the question is really asking for the [minimum weighted feedback arc set](https://en.wikipedia.org/wiki/Feedback_arc_set) on a slightly restricted class of graph, exploring some of the exact solutions to that problem (especially the divide and conquer one) is likely to be more helpful than a local/greedy strategy. – kcsquared Mar 07 '22 at 11:57

0 Answers0