I can't solve this exercise, I have an NxN matrix like the following:

[Image: NxN matrix]

I have to write a function that receives an array like the following as a parameter:

$dna = array("ATGCGA", "CAGTGC", "TTATGT", "AGAAGG", "CCCCTA", "TCACTG");

I have to traverse it and find every run of 4 equal, consecutive letters, as in the image of the matrix posted above.

So far I have only managed to write the function below. It checks each row for four equal letters in a row (C in this case) and prints "equal" when it finds them, so it only gives me a horizontal search. How can I detect oblique and vertical sequences in the matrix?

Here is my attempt:

<?php
$dna = array(
    array("A", "T", "G", "C", "G", "A"),
    array("C", "A", "G", "T", "G", "C"),
    array("T", "T", "A", "T", "G", "T"),
    array("A", "G", "A", "A", "G", "G"),
    array("C", "C", "C", "C", "T", "A"),
    array("T", "C", "A", "C", "T", "G"),
);
function check($dna)
{
    $fu = 0;
    foreach ($dna as $x) {
        for ($i = 0, $j = 1; $i < count($x); $i++, $j++) {
            if ($j < count($x)) {
                if ($x[$i] == $x[$j])
                    $fu++;
            }
        }

        if ($fu >= 4) {
            echo "equal<br>";
            $fu = 0;
        } else
            echo "different<br>";
    }
}
check($dna);

Please can someone help me!!

Rodrigo Ruiz
  • Is the matrix always 6x6 or can it be bigger, like NxN? – yuko Jul 29 '21 at 21:19
  • Is it possible that two or more sequences of at least 4 equal consecutive letters can be found in one line (horizontal, vertical, oblique)? For example: "ATTTTCGGGGGT". That could happen if the matrix is bigger than 6x6. – yuko Jul 29 '21 at 21:22
  • @yuko Thanks for answering. **For this case I would like to solve the 6x6 matrix, taking the example from the image I posted in the question.** On the other hand, it would be good to handle the case you describe; I had not thought about it that way. **But the function should be able to adapt to any matrix, traversing it and finding every sequence of 4 equal, consecutive letters.** – Rodrigo Ruiz Jul 30 '21 at 01:37
  • Thanks for the reply. What happens if there is a sequence of 5 or more, for example 6 equal and consecutive letters "CCCCCC" in a line? Is this counted as 1 sequence or 3 sequences (the first 4 C's, the middle 4 C's, the last 4 C's)? I'm working on a solution and I want to be sure the answer meets your needs. – yuko Jul 30 '21 at 07:14
  • @Yuko **In this case the array will only contain strings, so the function will only receive strings; numbers are not considered.** Thank you for your point of view and your interest in helping me. – Rodrigo Ruiz Jul 30 '21 at 11:25

1 Answer

This solution works for NxN matrices of any size. It returns the number of sequences made up of at least $consecutive_length equal, consecutive letters. Note that the oblique check only follows the diagonals running from top-left to bottom-right.
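To make the counting rule concrete, here is a minimal sketch of the same idea on a single line of letters. The helper name `count_runs_in_line` is hypothetical and not part of the code below; it assumes `$consecutive_length` is at least 2. A run longer than the required length, such as "CCCCCC", is counted once, and two separate runs in one line are counted separately.

```php
<?php
// Hypothetical helper (not part of the answer's code): counts runs of at
// least $consecutive_length equal letters in a single line of letters.
// Assumes $consecutive_length >= 2.
function count_runs_in_line(array $letters, int $consecutive_length): int {
    $runs = 0;
    $matches = 0; // successive "this letter equals the next one" comparisons

    for ($i = 0; $i < count($letters) - 1; $i++) {
        if ($letters[$i] === $letters[$i + 1]) {
            $matches++;
        } else {
            $matches = 0; // the run is broken, start counting again
        }

        // True exactly once per run: at the moment it reaches the required length
        if ($matches === $consecutive_length - 1) {
            $runs++;
        }
    }

    return $runs;
}

echo count_runs_in_line(str_split("CCCCCC"), 4), "\n";       // prints 1
echo count_runs_in_line(str_split("ATTTTCGGGGGT"), 4), "\n"; // prints 2
```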

It consists of 3 parts: 1) get_count_sequences_horizontal 2) get_count_sequences_vertical and 3) get_count_sequences_oblique.

The three results are added together in get_count_sequences.

Here's the code. Explanation is in the comments.

<?php

    function get_count_sequences_horizontal($input_dna, $consecutive_length) {
        
        $count_sequences_horizontal = 0;
        
        //ITERATES THROUGH EACH ROW
        for($y = 0; $y < count($input_dna); $y++) {
            
            $count_consecutive_letter = 0;
            
            //ITERATES THROUGH EACH COLUMN
            for($x = 0; $x < count($input_dna[0]) - 1; $x++) {
                
                //COMPARES THE CURRENT VALUE WITH THE VALUE OF THE NEXT HORIZONTAL POSITION
                if($input_dna[$y][$x] == $input_dna[$y][$x + 1]) {
                    $count_consecutive_letter++;
                } else {
                    //IF THE COMPARISON FAILS, THE COUNTER IS RESET, BECAUSE IN A LARGE MATRIX THERE MAY BE MULTIPLE SEQUENCES OF AT LEAST 4 CONSECUTIVE LETTERS
                    $count_consecutive_letter = 0;
                }
                
                //IF THE COUNT_CONSECUTIVE_LETTER HAS REACHED X, THAT MEANS THAT (X + 1) LETTERS ARE CONSECUTIVE
                //WHY? CCCC IS 4 LETTER SEQUENCE. C1 == C2 is 1 count, C2 == C3 is 2nd count, C3 == C4 is 3rd count. 3 counts of comparison of 4 C's
                if($count_consecutive_letter == $consecutive_length - 1) {
                    $count_sequences_horizontal++;
                }
                
            }
            
        }
        
        return($count_sequences_horizontal);

    }

    function get_count_sequences_vertical($input_dna, $consecutive_length) {
        
        $count_sequences_vertical = 0;
        
        //ITERATES THROUGH EACH COLUMN
        for($x = 0; $x < count($input_dna[0]); $x++) {
            
            $count_consecutive_letter = 0;
            
            //ITERATES THROUGH EACH ROW
            for($y = 0; $y < count($input_dna) - 1; $y++) {
                
                //COMPARES THE CURRENT VALUE WITH THE VALUE OF THE NEXT VERTICAL POSITION
                if($input_dna[$y][$x] == $input_dna[$y + 1][$x]) {
                    $count_consecutive_letter++;
                } else {
                    //IF THE COMPARISON FAILS, THE COUNTER IS RESET, BECAUSE IN A LARGE MATRIX THERE MAY BE MULTIPLE SEQUENCES OF AT LEAST 4 CONSECUTIVE LETTERS
                    $count_consecutive_letter = 0;
                }
                
                //IF THE COUNT_CONSECUTIVE_LETTER HAS REACHED X, THAT MEANS THAT (X + 1) LETTERS ARE CONSECUTIVE
                //WHY? CCCC IS 4 LETTER SEQUENCE. C1 == C2 is 1 count, C2 == C3 is 2nd count, C3 == C4 is 3rd count. 3 counts of comparison of 4 C's
                if($count_consecutive_letter == $consecutive_length - 1) {
                    $count_sequences_vertical++;
                }
                
            }
            
        }
        
        return($count_sequences_vertical);
        
    }

    function get_count_sequences_oblique($input_dna, $consecutive_length) {
        
        $count_sequences_oblique = 0;
        
        $count_consecutive_letter = 0;
        
        //ITERATES THROUGH THE MIDDLE OBLIQUE LINE
        //EXAMPLE MATRIX SQUARE:    
        //  1234
        //  5678
        //  9ABC
        //  DEFG
        
        //THIS LOOPS THROUGH [1,6,B,G]
        for($x = 0; $x < count($input_dna[0]) - 1; $x++) {
            
            $y = $x;
            
            //CHECKS $y IS WITHIN BOUNDARIES AND PERFORMS BELOW ACTIONS IF IT IS
            if($y < count($input_dna) - 1) {
                
                //COMPARES THE CURRENT VALUE WITH THE VALUE OF THE NEXT OBLIQUE POSITION
                if($input_dna[$y][$x] == $input_dna[$y + 1][$x + 1]) {
                    $count_consecutive_letter++;
                } else {
                    //IF THE COMPARISON FAILS, THE COUNTER IS RESET, BECAUSE IN A LARGE MATRIX THERE MAY BE MULTIPLE SEQUENCES OF AT LEAST 4 CONSECUTIVE LETTERS
                    $count_consecutive_letter = 0;
                }
                
                //IF THE COUNT_CONSECUTIVE_LETTER HAS REACHED X, THAT MEANS THAT (X + 1) LETTERS ARE CONSECUTIVE
                //WHY? CCCC IS 4 LETTER SEQUENCE. C1 == C2 is 1 count, C2 == C3 is 2nd count, C3 == C4 is 3rd count. 3 counts of comparison of 4 C's
                if($count_consecutive_letter == $consecutive_length - 1) {
                    $count_sequences_oblique++;
                }
                
            }
            
        }
        
        //ITERATES THROUGH THE RIGHT HALF OF THE MATRIX SQUARE
        //EXAMPLE MATRIX SQUARE:    
        //  1234
        //  5678
        //  9ABC
        //  DEFG
        
        //THIS LOOPS THROUGH [2,7,C], [3,8] AND [4]
        for($offset_x = 1; $offset_x < count($input_dna[0]); $offset_x++) {
            
            $count_consecutive_letter = 0;
            
            
            for($x = $offset_x; $x < count($input_dna[0]) - 1; $x++) {
                
                $y = $x - $offset_x;
                
                //CHECKS $y IS WITHIN BOUNDARIES AND PERFORMS BELOW ACTIONS IF IT IS
                if($y < count($input_dna) - 1) {
                
                    //COMPARES THE CURRENT VALUE WITH THE VALUE OF THE NEXT OBLIQUE POSITION
                    if($input_dna[$y][$x] == $input_dna[$y + 1][$x + 1]) {
                        $count_consecutive_letter++;
                    } else {
                        //IF THE COMPARISON FAILS, THE COUNTER IS RESET, BECAUSE IN A LARGE MATRIX THERE MAY BE MULTIPLE SEQUENCES OF AT LEAST 4 CONSECUTIVE LETTERS
                        $count_consecutive_letter = 0;
                    }
                    
                    if($count_consecutive_letter == $consecutive_length - 1) {
                        $count_sequences_oblique++;
                    }
                
                }
                
            }
            
        }
        
        //ITERATES THROUGH THE LEFT HALF OF THE MATRIX SQUARE
        //EXAMPLE MATRIX SQUARE:    
        //  1234
        //  5678
        //  9ABC
        //  DEFG
        
        //THIS LOOPS THROUGH [5,A,F], [9,E] AND [D]
        for($offset_y = 1; $offset_y < count($input_dna); $offset_y++) {
            
            $count_consecutive_letter = 0;
            
            for($y = $offset_y; $y < count($input_dna) - 1; $y++) {
                
                $x = $y - $offset_y;
                
                //CHECKS $x IS WITHIN BOUNDARIES AND PERFORMS BELOW ACTIONS IF IT IS
                if($x < count($input_dna[0]) - 1) {
                
                    //COMPARES THE CURRENT VALUE WITH THE VALUE OF THE NEXT OBLIQUE POSITION
                    if($input_dna[$y][$x] == $input_dna[$y + 1][$x + 1]) {
                        $count_consecutive_letter++;
                    } else {
                        //IF THE COMPARISON FAILS, THE COUNTER IS RESET, BECAUSE IN A LARGE MATRIX THERE MAY BE MULTIPLE SEQUENCES OF AT LEAST 4 CONSECUTIVE LETTERS
                        $count_consecutive_letter = 0;
                    }

                    if($count_consecutive_letter == $consecutive_length - 1) {
                        $count_sequences_oblique++;
                    }
                
                }
                
            }
            
        }
        
        return($count_sequences_oblique);
        
    }

    function get_count_sequences($input_dna, $consecutive_length) {
        
        $count_sequences = get_count_sequences_horizontal($input_dna, $consecutive_length);
        $count_sequences += get_count_sequences_vertical($input_dna, $consecutive_length);
        $count_sequences += get_count_sequences_oblique($input_dna, $consecutive_length);
        
        return($count_sequences);
        
    }

    $dna = array(
        array("A", "T", "G", "C", "G", "A"),
        array("C", "A", "G", "T", "G", "C"),
        array("T", "T", "A", "T", "G", "T"),
        array("A", "G", "A", "A", "G", "G"),
        array("C", "C", "C", "C", "T", "A"),
        array("T", "C", "A", "C", "T", "G"),
    );
    
    echo(get_count_sequences($dna, 4));
    //echoes "3", because there are 3 sequences of 4 equal consecutive letters
?>
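The oblique function above compares each cell only with its lower-right neighbour, so diagonals running from top-right to bottom-left are not visited. If those should count as well, a sketch along the following lines could be added. The function name `get_count_sequences_anti_oblique` is hypothetical, and it assumes a rectangular matrix and a `$consecutive_length` of at least 2.

```php
<?php
// Hypothetical addition (not part of the code above): counts runs on the
// diagonals that go from the top-right towards the bottom-left.
function get_count_sequences_anti_oblique(array $input_dna, int $consecutive_length): int {
    $rows = count($input_dna);
    $cols = count($input_dna[0]);
    $count = 0;

    // Every cell on one of these diagonals has the same $y + $x, called $sum here
    for ($sum = 0; $sum <= $rows + $cols - 2; $sum++) {
        $matches = 0;

        // First and last row that this diagonal touches inside the matrix
        $y_start = max(0, $sum - ($cols - 1));
        $y_end = min($rows - 1, $sum);

        for ($y = $y_start; $y < $y_end; $y++) {
            $x = $sum - $y;

            // Compare the current cell with its lower-left neighbour
            if ($input_dna[$y][$x] === $input_dna[$y + 1][$x - 1]) {
                $matches++;
            } else {
                $matches = 0;
            }

            if ($matches === $consecutive_length - 1) {
                $count++;
            }
        }
    }

    return $count;
}

// Small example with one run of four C's from top-right to bottom-left
$example = array(
    array("A", "T", "G", "C"),
    array("T", "A", "C", "G"),
    array("G", "C", "A", "T"),
    array("C", "G", "T", "A"),
);

echo get_count_sequences_anti_oblique($example, 4), "\n"; // prints 1
```

Its result could be added to the total in get_count_sequences in the same way as the other three. For the 6x6 matrix from the question this direction contains no run of four, so the total would stay at 3.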
yuko
  • Good point, making the function work for any NxN matrix. Traversing the matrix to find oblique sequences is what I was struggling with. Thanks for your help! – Rodrigo Ruiz Jul 30 '21 at 11:30
  • The example works with this type of array: `$dna = array(array("A", "T", "G", "C", "G", "A"),array("C", "A", "G", "T", "G", "C"));` but when I change the format of the array I receive, the method stops working. If I get an array like `$dna = array("ATGCGATTBG","CAGTGCTCAG","TTATGTAAGG")`, what should I change? Could you please help with this question? – Rodrigo Ruiz Aug 03 '21 at 11:43
  • With an array like `$dna = array("ATGCGATTBG","CAGTGCTCAG","TTATGTAAGG")`, I think I need to traverse it in another way to separate the letters? – Rodrigo Ruiz Aug 03 '21 at 11:46
  • @Rodrigo Ruiz, take a look at this: https://stackoverflow.com/questions/2170320/php-split-string-into-array-like-explode-with-no-delimiter – yuko Aug 04 '21 at 08:54
  • @Rodrigo Ruiz, use the `str_split` function to split up the letters into an array of letters. You can read about it here also: https://www.w3schools.com/php/func_string_str_split.asp – yuko Aug 04 '21 at 08:56
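As a sketch of the `str_split` suggestion, the array of strings from the comments can be turned into the array-of-arrays shape the answer expects before it is passed on. The variable name `$dna_matrix` is only illustrative.

```php
<?php
// Array of strings, as in the comment above
$dna = array("ATGCGATTBG", "CAGTGCTCAG", "TTATGTAAGG");

// str_split() with its default length of 1 splits a string into single
// characters; array_map() applies it to every row.
$dna_matrix = array_map('str_split', $dna);

// Each row is now an array of letters, e.g. the first row starts A, T, G, C, ...
print_r($dna_matrix[0]);
```

The converted `$dna_matrix` can then be passed to get_count_sequences in place of the hand-written nested array.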