2

This is a problem from spoj. AIBOHP

The characters can be added anywhere in the string.
The simplest solution that I read stated as follows:
Find the Longest Common Subsequence of the string and it's reverse. The answer then is (length of the string - LCS of both strings). This kind of seems intuitive but I'm having a hard time proving it. I'm sorry if the question seems stupid but can someone clearly explain why this approach would always work.

Note: Questions related to this problem have been asked sometimes on Stack Overflow, but none of those seem to answer my question. This question is not duplicate, In this question an answer states this approach but nowhere is the intuition or the proof given and that's exactly what I am asking. I know how to solve it, I'm asking why this method works.

Community
  • 1
  • 1
Akshay Arora
  • 729
  • 1
  • 8
  • 20
  • Can you clarify whether LCS refers to "longest common substring" or "longest common subsequence" and whether "added to" means at the end only, at the beginning or the end, or anywhere in the string? I understand this is probably stated at the linked destination but not everybody wants to follow links. – Patrick87 Dec 14 '15 at 19:56
  • Prove by induction. Start with `length(s)-length(lcs(s,reverse(s)))==0`. Then see what happens when you add a character according to the optimal algorithm. How does `length(s)` change? How does `length(lcs(s,reverse(s)))` change? – n. m. could be an AI Dec 14 '15 at 20:43
  • If by _longest common subsequence_ you mean a subsequence of _contiguous_ characters, your solution is wrong. E.g.: `aaaaaxaabaaa` – Walter Tross Dec 14 '15 at 21:01
  • Possible duplicate of [How can I compute the number of characters required to turn a string into a palindrome?](http://stackoverflow.com/questions/2237021/how-can-i-compute-the-number-of-characters-required-to-turn-a-string-into-a-pali) – Sled Dec 14 '15 at 21:30

2 Answers2

2

Let's realize that your LCS is nothing else than the longest sub-palindrome of your original input (doesn't have to be continuous). If x is the number of letters not contained in the longest sub-palindrome, you can easily add 2x to that sub-palindrome to make it contain your original input and stay a palindrome.

Consider the word berries written as bERRiEs, capital letters marking the longest sub-palindrome, ERRE. Go from the middle, adding 2 letters for each small-case letter in original word: ERRE -> EiRRiE -> bEiRRiEb -> sbEiRRiEbs and there you go.

On the other hand, this cannot be improved. Consider you have added y characters to your input to make it a palindrome. Remove 2y characters (those added and their mirrors) to get a sub-palindrome of your input sequence - this clearly cannot be larger than the longest sub-palindrome, so you have added at least the amount of missing characters in the longest sub-palindrome.

Akshay Arora
  • 729
  • 1
  • 8
  • 20
kajacx
  • 12,361
  • 5
  • 43
  • 70
1

You can use the following algorithm to change a string into a palindrome using n insertions, where n is the difference between the string length and the length of its LCS with its reverse:

  1. Find the LCS of the string with its reverse. Put the characters of the LCS into a stack lcs
  2. Set palindrome to the original string
  3. Set s to the empty string. Set i to 0.
  4. Walk along reverse string. For each character c in the reverse string:
    1. If top of lcs matches c:
      1. Pop lcs
      2. Insert s into palindrome after the ith character
      3. Set s to the empty string
      4. Set i to the location in the original string of c.
    2. Else, append c to s

Since a character in the reverse string is only added to s, and subsequently into the final palindrome, if it is not in the LCS, the total number of inserted characters is the length of the reverse string minus the length of the LCS.

Here is an example. Given the string thisisatest, the reverse is the string tsetasisiht. The LCS is t-isi-t.

      t h  isi sates t
^ tse t as isi h     t
      _    ___       _
=     t    isi       t

The "missing" subsequences of the reverse string are tse, as, and h:

  tse t as isi h t
-     t    isi   t
  ___   __     _
= tse   as     h

Note that the total number of characters in these missing subsequences is the length of the original string minus the length of the LCS, demonstrating the claim you're interested in.

Inserting tse after the beginning of the string, as after t and h after isi, like so:

      t   h  isi  sates t
+ tse   as       h      
  ___ _ ___  ___ ______ _
= tse t ash  isi hsates t

We obtain the palindrome tsetashisihsatest.

Asad Saeeduddin
  • 46,193
  • 6
  • 90
  • 139