2

Hey i have a question about optimization palindromes count algorithm

Task: Find count of palindromes in string.

in my func i use "in the forehead" method its like O(n^2) can you guys help make it in O(n) or O(nlogn)

func isPalindrome(string: String) -> Bool {
    let str = (string.lowercased())
    let strWithoutSpace = str.components(separatedBy: .whitespaces).joined(separator: "")
    let strArray = Array(strWithoutSpace.characters)
    var i = 0
    var j = strArray.count-1
    while i <= j {
        if strArray[i] != strArray[j] { return false }
        i+=1
        j-=1
    }
    return true
}
func palindromsInString(string: String) -> Int {
    var arrayOfChars = Array(string.characters)
    var count = 0
    for i in 0..<arrayOfChars.count-1 {
        for x in i+1..<arrayOfChars.count {
            if isPalindrome(string: String(arrayOfChars[i...x])) {
                count+=1
            }
        }
    }
    return count
}

and yes in my instance one letter can't be a palindrome

DmitrievichR
  • 119
  • 1
  • 11

3 Answers3

2

You can use Manacher's algorithm to solve it in linear time. This algorithm is usually used for finding the longest palindrome, but it computes the maximum length of the palindrome that has a center in a specific position for each position in the string.

You can find the description and the implementation of this algorithm in this question.

Community
  • 1
  • 1
kraskevich
  • 18,368
  • 4
  • 33
  • 45
  • hey thanks for answer i try solve this problem use Manacher's algorithm before . And i've many problems with it. Maybe there is some algorithm is easier to understand for beginning thanks once again ;) – DmitrievichR Oct 20 '16 at 16:42
  • 1
    @DmitrievichR I'm not aware of any really simple algorithm that is faster than `O(n ^ 2)` for this problem. – kraskevich Oct 20 '16 at 16:43
  • @kraskevich and original author, see my answer for what I believe is a O(n) solution, or approaching O(n). – Tim Fuqua Oct 21 '16 at 05:26
2

I'm not familiar with Manacher's algorithm, but I've always enjoyed figuring out efficient algorithms, so I thought I'd have a stab at this.

Your algorithm for determining if a string is a palindrome looks like the kinds I've come up with, so I decided to just use your isPalindrome function, though I changed it to be a function in an extension of String instead, and I removed the whitespace removal logic, as I felt that needed to be in the invoking call rather than in the palindrome determining function itself.

extension String {
    func isPalindrome() -> Bool {
        if length < 2 { return false }
        let str = lowercased()
        let strArray = Array(str.characters)
        var i = 0
        var j = strArray.count-1
        while i <= j {
            if strArray[i] != strArray[j] { return false }
            i+=1
            j-=1
        }
        return true
    }
}

After looking at your palindromsInString solution, it looks like a standard brute force, but simple and readable solution.

My first thought for a different algorithm was also pretty brute force, but was a totally different approach, so I'm calling it the Naive solution.

The idea of the Naive solution is to create arrays of substrings of the original string and check if each substring is a palindrome or not. The way I determine the substrings is to start with the biggest substring possible (the original string) and then get the 2 substrings of length string.length-1, and then string.length-2 and so on until lastly I get all the substrings of length 2 (I'm ignoring substrings of length 1 because you said a string of length 1 can't be a palindrome).

ie: substrings of "test" greater than length 1 would be:

["test"] ["tes", "est"] ["te", "es", "st"]

So you just loop through each of those arrays and check if any are palindromes, and increasing the count if they are:

Naive Solution:

extension String {
    var length: Int { return characters.count }

    func substringsOfLength(length: Int) -> [String] {
        if self.length == 0 || length > self.length { return [] }

        let differenceInLengths = self.length - length

        var firstIndex = startIndex
        var lastIndex = index(startIndex, offsetBy: length)
        var substrings: [String] = []

        for _ in 0..<differenceInLengths {
            substrings.append(substring(with: Range(uncheckedBounds: (firstIndex, lastIndex))))
            firstIndex = index(after: firstIndex)
            lastIndex = index(after: lastIndex)
        }
        substrings.append(substring(with: Range(uncheckedBounds: (firstIndex, lastIndex))))

        return substrings
    }
}

extension String {
    func containsAPalindromeNaive(ignoringWhitespace: Bool = true) -> Int {
        let numChars = length

        if numChars < 2 { return 0 }

        let stringToCheck = (ignoringWhitespace ? self.components(separatedBy: .whitespaces).joined(separator: "") : self).lowercased()
        var outerLoop = numChars
        var count: Int = 0

        while outerLoop > 0 {
            let substrings = stringToCheck.substringsOfLength(length: outerLoop)
            for substring in substrings {
                if substring.isPalindrome() {
                    count += 1
                }
            }
            outerLoop -= 1
        }

        return count
    }
}

I knew full well that this algorithm would be slow, but I wanted to implement it as a second baseline for my real solution.

I call this solution the Smart solution. It is a multipass solution that takes advantage of the number of, and positions of, the characters within a string.

In the first pass, I generate what I call a character mapping. The character mapping is a dictionary that maps a Character to an array of indices. So you go over the string and add each character's index to an array stored under it's character value as the key.

The idea is that palindromes can only exist within a string that is bookended by the same letter. Therefore, you only need to check the substrings within a string at the indices of a particular letter. In the word "tattoo", you have 3 distinct letters: "t", "a", "o". The character mappings would look like this:

t: [0,2,3]
a: [1]
o: [4,5]

I now know that palindromes can only exist in this word between (0,2), (2,3), and (4,5). So I only need to check 3 substrings(0,2), (0,3), (2,3), and (4,5). So I only need to check 4 substrings. That's the idea. And once you've checked out all possible substrings bookended by a particular letter, you can ignore any other substrings you come across that start with that letter, because you've already checked them.

In the second pass, I go through each character in the string and if I haven't already checked that letter, I go through the pairs of permutation indices generated by generateOrderedPairIndexPermutations for the indices in the character mapping and check the substrings to see if they are a palindrome. I then do 2 optimizations here. First, if the distance between the starting character index and the ending character index is less 3, it must be a palindrome (a distance of 1 means they are sequential, and a distance of 2 means there's a single letter in between them, thus also guaranteed to be a palindrome). Second, because I already know that the first and last characters are the same, I don't need to check the whole substring, just from the second letter up until the second to last letter. So if the substring would be "test", and I'm always guaranteeing myself that the substring is bookended by the same letter, I don't actually need to check "test", and I can instead just check "es". It's a smaller optimization, but a nice one nonetheless.

Smart Solution:

extension Collection {
    func generateOrderedPairIndexPermutations() -> [(Index,Index)] {
        if count < 2 {
            return []
        }

        var perms: [(Index,Index)] = []

        var firstIndex = startIndex

        while firstIndex != endIndex {
            var secondIndex = index(firstIndex, offsetBy: 1)
            while secondIndex != endIndex {
                perms.append((firstIndex,secondIndex))
                secondIndex = index(secondIndex, offsetBy: 1)
            }
            firstIndex = index(firstIndex, offsetBy: 1)
        }

        return perms
    }
}

extension String {
    func generateCharacterMapping() -> [Character : [Int]] {
        var characterMapping: [Character : [Int]] = [:]

        for (index, char) in characters.enumerated() {
            if let indicesOfChar = characterMapping[char] {
                characterMapping[char] = indicesOfChar + [index]
            } else {
                characterMapping[char] = [index]
            }
        }

        return characterMapping
    }

    func containsAPalindromeSmart(ignoringWhitespace: Bool = true) -> Int {
        let numChars = length

        if numChars < 2 { return 0 }

        let stringToCheck = (ignoringWhitespace ? self.components(separatedBy: .whitespaces).joined(separator: "") : self).lowercased()
        let characterMapping = stringToCheck.generateCharacterMapping()

        var count: Int = 0
        var checkedChars: Set<Character> = Set()

        for char in stringToCheck.characters {
            if checkedChars.contains(char) == false {
                if let characterIndices = characterMapping[char], characterIndices.count > 1 {
                    let perms = characterIndices.generateOrderedPairIndexPermutations()
                    for (i,j) in perms {
                        let startCharIndex = characterIndices[i]
                        let endCharIndex = characterIndices[j]

                        if endCharIndex - startCharIndex < 3 {
                            count += 1
                        } else {
                            let substring = stringToCheck.substring(with: Range(uncheckedBounds: (stringToCheck.index(stringToCheck.startIndex, offsetBy: startCharIndex+1), stringToCheck.index(stringToCheck.startIndex, offsetBy: endCharIndex))))
                            if substring.isPalindrome() {
                                count += 1
                            }
                        }
                    }
                    checkedChars.insert(char)
                }
            }
        }

        return count
    }
}

I felt pretty good about this solution. But I had no clue just how fast it really was.It was really fast

Using XCTest to measure performance, I ran each algorithm through some performance tests. Using this string as the base source: "There are multiple palindromes in here""Was it a car or a cat I saw", updated based on suggestions to use a more rigorous input string, which is 3319 characters long when whitespace is removed and it is lowercased ("therearemultiplepalindromesinhere""wasitacaroracatisaw"), I also created copies that were this string times 2 ("therearemultiplepalindromesinheretherearemultiplepalindromesinhere"wasitacaroracatisawwasitacaroracatisaw), times 4, times 8, and times 10. Since we're trying to determine the O() of the algorithm, scaling the number of letters up and measuring the scale factor is the way to go.

In order to get more accurate data, I ran each test through 10 iterations (I would've liked more iterations, but the original solution and my Naive solution both don't finish in any timely manner on the tests above times 4). Here's the timings data I collected (screenshot of the spreadsheet was easier than entering it again here):

UPDATED Timings Table

UPDATED Green is Author; Red is Naive Solution; Orange is Smart Solution Timings Graph

As you can see, both your original solution and my Naive Solution operate in quadratic time (your solution has a quadratic correlation coefficient of r=0.9997, and my Naive Solution has a coefficient of r=0.9999; so, very clearly quadratic!). My Naive solution seems to be under your solution, but they are both increasing quadratically, and are therefore O(n^2), as we already knew.

The interesting part about my Smart Solution is it seems linear! My small, 5-point data set through a regression calculator, and it has a correlation coefficient of r=0.9917! So if it isn't linear, it's so close that I wouldn't care.

All of the solutions operate in quadratic time now. Le sigh. But at least the bug was discovered, addressed, and science prevailed this day. Bummer that my "Smart Solution" didn't actually end up linear. However, I will note that if the input string isn't already a giant palindrome (like the one I ended up changing it to be), then the "Smart Solution"'s optimizations make it perform much faster, albeit still in quadratic time.

I don't know if my algorithm is any easier to understand than the Manacher's algorithm, but I hope I explained it well. And the results are pretty promising, so I hope you find a good use out of it. This is actually still true. I think it's a promising algorithm. Perhaps my code for generateOrderedPairIndexPermutations isn't the best.

Updated to address the bug found by kraskevich

Tim Fuqua
  • 1,635
  • 1
  • 16
  • 25
  • Your smart solution seems to implement pretty nice optimizations through the look up table and could really be a viable one once it takes care of white spaces too but i still don't think it can be O(n) since the amount of work that it will do will definitely change depending on the input string. For instance it it has a long palindrome like "Was it a car or a cat I saw" among a long sentence it has a lot more to do i suppose. It might be a linear time complexity like O(xn) but i suppose not O(n). Still very clever though. – Redu Oct 21 '16 at 05:57
  • What does this solution return for a string consisting of multiple repetitions of the same letter? If I got it right, it would return 2 for the string "aaa", but it should be 3. – kraskevich Oct 21 '16 at 07:18
  • @kraskevich in task "and yes in my instance one letter can't be a palindrome" – DmitrievichR Oct 21 '16 at 09:16
  • 1
    I know that it's not. aa, aa, aaa are all palindromes. If this solution counts all distinct palindromes, it's not correct either (try, for example, "aabcaa" testcase). – kraskevich Oct 21 '16 at 11:45
  • @kraskevich you are absolutely correct. There is a bug. Rather than only examining *successive* pairs of indices, I should check all pairs. I'll adjust it some time today, and update the answer. I can't imagine it slows down too much though. – Tim Fuqua Oct 21 '16 at 14:12
0

Here's a "functional programming" solution that is much less impacted by the exponential nature of the process than the accepted answer. (also a lot less code)

It is 80% times faster on short (19) strings and 90 times faster on the longer ones (190). I haven't formally proved it but it seems linear O(n)?.

public func countPalindromes(in text:String) -> Int
{
   let words  = text.lowercased()
                    .components(separatedBy:CharacterSet.letters.inverted)
                    .filter{!$0.isEmpty}
                    .joined(separator:"") 

   let sdrow  = String(words.characters.reversed())

   let splits = zip( sdrow.characters.indices.dropFirst().reversed(),
                     words.characters.indices.dropFirst() 
                   )
                .map{ (sdrow.substring(from:$0),words.substring(from:$1), words[$1...$1] ) }

   let count  = splits.map{$0.1.commonPrefix(with:$0.0)}  // even
                      .filter{ !$0.isEmpty }
                      .reduce(0){$0 + $1.characters.count}
              + splits.map{ $1.commonPrefix(with:$2 + $0)} // odd
                      .filter{$0.characters.count > 1 }
                      .reduce(0){$0 + $1.characters.count - 1}
   return count
}

// how it works ...

// words   contains the stripped down text (with only letters)
//
// sdrow   is a reversed version of words
//
// splits  creates split pairs for each character in the string.
//         Each tuple in the array contains a reversed left part, a right part
//         and the splitting character
//         The right part includes the splitting character 
//         but the left part does not.
//
//         [(String,String,String)] for [(left, right, splitChar)]
//
//         The sdrow string produces the left part in reversed letter order .
//         This "mirrored" left part will have a common prefix with the
//         right part if the split character's position is in the middle (+1)
//         of a palindrome that has an even number of characters
//
//         For palindromes with an odd number of characters, 
//         the reversed left part needs to add the splitting character
//         to match its common prefix with the right part.
//
// count   computes the total of odd and even palindromes from the
//         size of common prefixes. Each of the common prefix can produce
//         as many palindromes as its length (minus one for the odd ones)

[EDIT] I also made a procedural version for comparison purposes, knowing that the compiler can better optimize procedural code than its declarative counterpart.

It is an extension of the Array type (so it can count palindromes of anything comparable).

extension Array where Element:Comparable
{
   public func countPalindromes() -> Int
   {
      if count < 2 { return 0 }

      var result = 0

      for splitIndex in (1..<count)
      {
         var leftIndex      = splitIndex - 1
         var rightIndex     = splitIndex
         var oddPalindrome  = true
         var evenPalindrome = true
         while leftIndex >= 0 && rightIndex < count
         {
            if evenPalindrome  
            && self[leftIndex] == self[rightIndex]
            { result += 1 }
            else
            { evenPalindrome = false }

            if oddPalindrome   
            && rightIndex < count - 1
            && self[leftIndex] == self[rightIndex+1]
            { result += 1 }
            else
            { oddPalindrome = false }

            guard oddPalindrome || evenPalindrome
            else { break }

            leftIndex  -= 1
            rightIndex += 1
         }
      }      
      return result
   }

} 

public func countPalindromesFromArray(in text:String) -> Int
{
   let words  = text.lowercased()
                    .components(separatedBy:CharacterSet.letters.inverted)
                    .filter{!$0.isEmpty}
                    .joined(separator:"") 
   return Array(words.characters).countPalindromes()
}

It performs 5 to 13 times faster than the declarative one and up to 1200 times faster than the accepted answer.

The increasing performance difference with the declarative solution tells me that it was not O(n). The procedural version may be O(n) because its time will vary with the number of palindromes but will be proportional to the size of the array.

Alain T.
  • 40,517
  • 4
  • 31
  • 51