0

I'm trying to make a python script, that gets me the longest repeated character in a given matrix (horizontally and vertically).

Example:

I have this matrix:

afaaf
rbaca
rlaff

Giving this matrix for input, it should result: a 3

You can see that that the 3rd column of the matrix, is full of a's and also, it's the most repeated character in the matrix.

What I have:

#!/bin/python2.7

#Longest string in matrix

#Given a matrix filled with letters. Find the longest string, containing only the same letter, which can be obtained by starting
#with any position and then moving horizontally and vertically (each cell can be visited no more than 1 time).


# Settings here
# -------------
string_matrix = """
afaaf
rbaca
rlaff
"""

pos = (0,0)
# -------------

import pdb
import time
import collections
from collections import defaultdict
import re


rows = 0
columns = 0
matrix = []
matrix2 = []
counter = 0
res_l = []
i = 0
c = ''

# if matrix2 is full of 1's, stop
def stop():
    for i in range(0, rows):
        for j in range(0, columns):
            if matrix2[i][j] == 0:
                return False
    return True

# checks the points, and returns the most repeated char and length
def check_points(points1, points2):
    r = []

    r.append(-1)
    r.append('')

    # create strings from matrix
    s1 = ''
    s2 = ''


    for point in points1:
        s1 += matrix[point[0]][point[1]]

    for point in points2:
        s2 += matrix[point[0]][point[1]]

    rr = {}

    for c in s1:
        rr[c] = 0

    for c in s2:
        rr[c] = 0

    for i in range(0, len(s1)):
        k = 1
        for j in range(i+1, len(s1)):
            if s1[i] == s1[j]:
                k += 1
            else:
                break
            if k > rr[s1[i]]:
                rr[s1[i]] = k


    for i in range(0, len(s2)):
        k = 1
        for j in range(i+1, len(s2)):
            if s2[i] == s2[j]:
                k += 1
            else:
                break
            if k > rr[s2[i]]:
                rr[s2[i]] = k


    m = -1
    c = ''
    for key,value in rr.iteritems():
        if value > m:
            m = value
            c = key

    return m, c


# Depth-first search, recursive
def search(pos):
    global res_l
    global matrix2
    global c

    counter = 0

    x = pos[0]
    y = pos[1]

    c = matrix[x][y]

    # base clause
    # when matrix2 is all checked
    if stop():
        return counter, c



    points1 = []
    points2 = []
    allpoints = []

    for i in range(0, columns):
            if matrix2[x][i] != 1:
                points1.append([x, i])
                allpoints.append([x, i])


    for i in range(0, rows):
            if matrix2[i][x] != 1:
                points2.append([i, x])
                allpoints.append([i, x])





    r = check_points(points1, points2)


    if r[0] > counter:
        counter = r[0]
        c = r[1]

    matrix2[x][y] = 1


    for point in allpoints:
        rr = search(point)
        if rr[0] > counter:
            counter = int(rr[0])
            c = rr[1]
            #print 'c: ' + str(c) + ' - k: ' + str(counter)



    return counter, c

def main():
    # create the matrix from string
    string_matrix_l = string_matrix.strip()
    splited = string_matrix_l.split('\n')


    global rows
    global columns
    global matrix
    global matrix2

    rows = len(splited)
    columns = len(splited[1])



    # initialize matrixes with 0
    matrix = [[0 for x in range(columns)] for x in range(rows)]
    matrix2 = [[0 for x in range(columns)] for x in range(rows)]


    # string to matrix
    i = 0
    for s in splited:
        s = s.strip()
        if s == '':
            continue

        j = 0

        for c in s:
            try:## Heading ##
                matrix[i][j] = c
                #print 'ok: ' + str(i) + ' ' + str(j) + ' ' +  c
            except:
                print 'fail: index out of range matrix[' + str(i) + '][' + str(j)+'] ' + c
            j = j + 1

        i = i + 1


    # print some info
    print 'Given matrix: ' + str(matrix) + '\n'
    print 'Start position: ' + str(pos)
    print 'Start character: ' + str(matrix[pos[0]][pos[1]])

    # get the result
    res = search(pos)
    print '-------------------------------------'
    print '\nChar: ' + str(res[1]) + '\nLength: ' + str(res[0])


if __name__ == "__main__":
    main()

This is my source code. The example given above, is also used in the source code. The result given is: r 2 which is wrong ... again, should be a 3

It has 4 functions: main, search, stop and check_points.

  • main is initializing things up,
  • search is my recursive function that takes one parameter (the start point), and should recursively check for the longest string. I have another matrix, same length as original, which is just 1 and 0. 1 means the position was visited, 0, not. The search function is setting 1 on the right position after a certain position was processed by the search function.
  • stop is checking if matrix2 is full of 1's, in this case, the matrix was all parsed
  • check_points takes 2 parameters, 2 list of points, and returns the most repeated character and it's length for those points

What doesn't work:

Most of the time is giving me the wrong character as result, even thought the count might be right sometimes. Sometimes it's working on horizontally, sometimes it doesn't. I am sure that I'm doing something wrong, but ... it's over 1 week now since I'm trying to figure out how to do this. Asked another question here on stackoverflow, got bit further but ... still stuck.

Any suggestion is appreciated.

user1812076
  • 269
  • 5
  • 21
  • The description seems to allow us to turn while following a chain of letters. Is this allowed? If so, there's a chain of 4 a's. – user2357112 Apr 08 '15 at 08:42
  • no, I don't need it that complex. Just vertically and horizontally, it can't go both ways at the same time. – user1812076 Apr 08 '15 at 08:45
  • Hmm, your approach is far from correct, so I suggest you try to dry-run your code with reasonable-size example, to see that what's gone wrong and what should be expected. For this problem, simplest approach is brute-force all rows and columns, which takes O(n) per row/col, which is not bad at all. – Pham Trung Apr 08 '15 at 08:45
  • I'm trying to get this one done for over a week ... I tried every approach possible I knew of. – user1812076 Apr 08 '15 at 08:51
  • Some hints: you need to take care of **direction** problem, so your search now just goes into any direction, but this is not correct. Second thing is, you should limit the direction into going right and going down only. Third thing is, you can have a for loop to check every `unchecked` position, don't try to pack them into one recursive call in main, this will complicated your life. And If you still cannot solve it, go back to the simple version of this problem, *if you have a single line String, how can you solve this problem*. – Pham Trung Apr 08 '15 at 08:57
  • well, like in the check_points function, I would use a dictionary, add each character from the string to it (key), and depending on how many times it occures, increase it's value. – user1812076 Apr 08 '15 at 09:04
  • Even that's is wrong too, that is not how you check longest repeated path. What happen if the string is `ababa` ? result is one, but your approach give 3 – Pham Trung Apr 08 '15 at 09:06
  • how should I check it then ? – user1812076 Apr 08 '15 at 09:07
  • If you cannot solve that simple version, how can you solve this problem? Think more my friend, why you need to solve this problem? An assignment or you want to be better? – Pham Trung Apr 08 '15 at 09:08
  • @PhamTrung if you would have checked the check_string function, you would have known how I did it. I forgot to mention that the counting will reset for all the keys, if the char (key) is not the same as the previous one. And yes, it's an assignment, but what's wrong with it ? I ain't trying to cheat, just figure out what to do in order to solve the problem. – user1812076 Apr 08 '15 at 09:13
  • If you can do it with one line, why can't you just do it with all separated row and col? Hmm, I think it is the exactly the same thing Kolmar suggested to you. No, I don't criticize you, but just want you to figure out how to do it yourself, you are very near actually after all our effort. But well,... – Pham Trung Apr 08 '15 at 09:21
  • Yep, I have an idea of how to do it, but, the thing is that I forgot to mention I need it to be recursively as well, I don't know if Kolmar's answer is recursive really, but it's definetely the answer to my question. – user1812076 Apr 08 '15 at 09:24

2 Answers2

2

You can use itertools.groupby to quickly find the count of repetitions of some character, and izip_longest(*matrix) to transpose the matrix (swap its rows and columns).

from itertools import groupby, izip_longest

matrix_string = """
afaaf
rbaca
rlaff
"""

def longest_repetition(row): 
    return max((sum(1 for item in group), letter) 
               for letter, group in groupby(row) 
               if letter is not None)

def main():
    matrix = [[letter for letter in row.strip()] 
              for row in matrix_string.strip().split('\n')]

    count, letter = max(
        max(longest_repetition(row) for row in matrix),
        max(longest_repetition(col) for col in izip_longest(*matrix))
    )

    print letter, count

if __name__ == '__main__':
    main()

Since you've updated the requirement here is a recursive version of the code with some explanations. If it were not an assignment and this task came up in some real life problem, you should really have used the first version.

matrix_string = """
afaaf
rbaca
rlaff
"""


def find_longest_repetition(matrix):
    rows = len(matrix)
    cols = len(matrix[0])

    # row, col - row and column of the current character.
    # direction - 'h' if we are searching for repetitions in horizontal direction (i.e., in a row).
    #             'v' if we are searching in vertical direction.
    # result - (count, letter) of the longest repetition we have seen by now.
    #          This order allows to compare results directly and use `max` to get the better one
    # current - (count, letter) of the repetition we have seen just before the current character.
    def recurse(row, col, direction, result, current=(0, None)):
        # Check if we need to start a new row, new column, 
        #   new direction, or finish the recursion.
        if direction == 'h':    # If we are moving horizontally
            if row >= rows:     # ...  and finished all rows
                return recurse( # restart from the (0, 0) position in vertical direction.
                    0, 0,
                    'v',
                    result
                )
            if col >= cols:     # ... and finished all columns in the current row
                return recurse( # start the next row.
                    row + 1, 0,
                    direction,
                    result
                )
        else:                   # If we are moving vertically
            if col >= cols:     # ... and finished all columns
                return result   # then we have analysed all possible repetitions. 
            if row >= rows:     # ... and finished all rows in the current column
                return recurse( # start the next column.
                    0, col + 1,
                    direction,
                    result
                )

        # Figure out where to go next in the current direction
        d_row, d_col = (0, 1) if direction == 'h' else (1, 0)

        # Try to add current character to the current repetition
        count, letter = current
        if matrix[row][col] == letter:
            updated_current = count + 1, letter
        else:
            updated_current = 1, matrix[row][col]

        # Go on with the next character in the current direction
        return recurse( 
            row + d_row, 
            col + d_col,
            direction,
            max(updated_current, result), # Update the result, if necessary
            updated_current
        )

    return recurse(0, 0, 'h', (0, None))


def main():
    matrix = [[letter for letter in row.strip()] 
              for row in matrix_string.strip().split('\n')]

    count, letter = find_longest_repetition(matrix)

    print letter, count


if __name__ == '__main__':
    main()
Kolmar
  • 14,086
  • 1
  • 22
  • 25
  • wow, it works ! but except from this, can you explain a bit more what's going on ? – user1812076 Apr 08 '15 at 09:07
  • @user1812076 Sure, I'll do that in a few hours. (I have an appointment starting now) In the meantime you could play with `groupby` and see how it works yourself. – Kolmar Apr 08 '15 at 09:09
  • @user1812076 I've added the recursive version of the code. – Kolmar Apr 08 '15 at 12:43
1

You can also try the collections.Counter(string).most_common() to get the most repetitions of a character.

from collections import Counter

string_matrix = """
afaaf
rbaca
rlaff
"""

def GetMostRepetitions(pos):
    mc = []

    for ii in range(pos[0],len(working_mat)):
        mc.extend(Counter(working_mat[ii]).most_common(1))  

        for jj in range(pos[1],len(working_mat[0])):
            column = []           

            for kk in range(ii,len(working_mat)):
                column.append(working_mat[kk][jj])

            mc.extend(Counter(column).most_common(1))          

    m = 0           
    for item in mc: 
        if item[1] > m:
            m = item[1]
            k = item[0]


    print(k, m)                 

working_mat = string_matrix.strip().split('\n')

for ii in range(len(working_mat)):
    for jj in range(len(working_mat[0])):
        pos = (ii,jj)
        GetMostRepetitions(pos)

As Kolmar said, you can also use a better way to transpose the matrix.

Community
  • 1
  • 1
Repiklis
  • 399
  • 6
  • 20
  • thank you for the solution, this works as well! Any way I could change it to work with a function recursively ? – user1812076 Apr 08 '15 at 09:25
  • @user1812076 Updated to work over the range of positions – Repiklis Apr 08 '15 at 09:35
  • thank you, I see it's using a function now, but ... I don't think it's recursively, is it ? I'm really appreciating the answer, and it's really what I wanted! But, if I'm not asking too much, I will need it recursively, and I think the function should call itself. – user1812076 Apr 08 '15 at 09:36
  • @user1812076 No. It's not a recursive function. Why do you want to do it this way? – Repiklis Apr 08 '15 at 09:46
  • Well, that's what I'm trying to achieve basically, but forgot to mention in the original question. Like I said, it's cool in this way as well, but if you know the answer, and also explain just in few lines, I would really appreciate it. – user1812076 Apr 08 '15 at 10:34
  • @user1812076 You are still not explaining why you need a recursive function. I can't help you if you don't give more details. – Repiklis Apr 08 '15 at 10:57
  • well, the reason is that I have to make it with recursion, it's an assignment, but, just so you don't get it wrong, I'm not trying to cheat or copy solutions, I'm trying to solve the problem and also understand what's going on. – user1812076 Apr 08 '15 at 11:10