0

My getCount() method should be returning 2 in this case but it is returning 7. I think the reason why it's counting incorrectly is because it's looping through the for 7 times because that's the length of the string. However, I simply just want to scan the string for the pattern and increment patternCount by 1 each time the pattern occurs in the string I'm scanning. Here's my code:

package a2;

public class DNAStrandAdept {

    private String strand;
    private String pattern;
    private String passedStrand;
    private int ACount;
    private int CCount;
    private int GCount;
    private int TCount;
    private int patternCount = 0;

    public static void main(String[] args) {
        DNAStrandAdept test = new DNAStrandAdept("AGGTTGG");
        System.out.println("A count: " + test.getACount());
        System.out.println("C count: " + test.getCCount());
        System.out.println("G count: " + test.getGCount());
        System.out.println("T count: " + test.getTCount());
        System.out.println("Strand: " + test.getStrandString());
        System.out.println("Strand length: " + test.getLength());
        System.out.println("Pattern Count: " + test.getCount("GG"));

    }

    public DNAStrandAdept(String strand) {
        passedStrand = strand;
        if (passedStrand.contains("a") || passedStrand.contains("c")
                || passedStrand.contains("g") || passedStrand.contains("t")) {
            throw new RuntimeException("Illegal DNA strand");
        } else if (passedStrand.contains("1") || passedStrand.contains("2")
                || passedStrand.contains("3") || passedStrand.contains("4")
                || passedStrand.contains("5") || passedStrand.contains("6")
                || passedStrand.contains("7") || passedStrand.contains("8")
                || passedStrand.contains("9") || passedStrand.contains("0")) {
            throw new RuntimeException("Illegal DNA Strand");
        } else if (passedStrand.contains(",") || passedStrand.contains(".")
                || passedStrand.contains("?") || passedStrand.contains("/")
                || passedStrand.contains("<") || passedStrand.contains(">")) {
            throw new RuntimeException("Illegal DNA Strand");
        }
    }

    public int getACount() {
        for (int i = 0; i < passedStrand.length(); i++) {
            if (passedStrand.charAt(i) == 'A') {
                ACount++;
            }

        }
        return ACount;
    }

    public int getCCount() {
        for (int i = 0; i < passedStrand.length(); i++) {
            if (passedStrand.charAt(i) == 'C') {
                CCount++;
            }

        }
        return CCount;
    }

    public int getGCount() {
        for (int i = 0; i < passedStrand.length(); i++) {
            if (passedStrand.charAt(i) == 'G') {
                GCount++;
            }

        }
        return GCount;
    }

    public int getTCount() {
        for (int i = 0; i < passedStrand.length(); i++) {
            if (passedStrand.charAt(i) == 'T') {
                TCount++;
            }
        }
        return TCount;
    }

    public String getStrandString() {
        return passedStrand;
    }

    public int getLength() {
        return passedStrand.length();
    }

    public int getCount(String pattern) {

        for (int i = 0; i < passedStrand.length(); i++) {
            if (passedStrand.contains(pattern)) {
                patternCount++;
            }
        }

        return patternCount;
    }

    public int findPattern(String pattern, int startIndex) {
        return 0;
    }
}

Here is my output:

A count: 1
C count: 0
G count: 4
T count: 2
Strand: AGGTTGG
Strand length: 7
Pattern Count: 7
  • [KMP algorithm](http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm) – Rohit Jain Jan 27 '14 at 17:01
  • @RohitJain I'm sure there is a much simpler way of doing this than that. – user122947 Jan 27 '14 at 17:02
  • Of course there is. Just shared with you an algorithm specifically created for this task. – Rohit Jain Jan 27 '14 at 17:03
  • @RohitJain Can you help me find an easier method of solving the problem that I'm experiencing. – user122947 Jan 27 '14 at 17:04
  • http://stackoverflow.com/questions/767759/occurrences-of-substring-in-a-string – GriffeyDog Jan 27 '14 at 17:07
  • Please use local variables in your functions. A function like `getACount()` should be defined as "return the number of A's in the pattern". It should do that, and not have any other effects. But when you use an outside variable like `ACount`, it has a side effect that could needlessly affect other functions of the class. Plus `ACount` isn't initialized in your function, so if you call it twice it won't give the right result the second time, since it will start with a leftover value. (Of course, you shouldn't have four functions to count the four different letters.) – ajb Jan 27 '14 at 17:24

1 Answers1

1

Notice your for loop:

for (int i = 0; i < passedStrand.length(); i++) {
        if (passedStrand.contains(pattern)) {
            patternCount++;
        }
    }

If the pattern is there in passedStrand, it will always be true. It doesn't really depend upon any part of the loop. Since the loop is running passedStrand.length() number of times, that condition will be checked that many times. And every time, since it is true, the patternCount is incremented. And hence the final value of patternCount will be passedStrand.length();.

What you rather want to do is, starting at every index, check if next pattern.length() number of characters, make up a string equal to pattern. If yes, then increment patternCount. So, you would need to make use of substring method here:

int patternLen = pattern.length();

for (int i = 0; i < passedStrand.length() - patternLen + 1; i++) {
        if (passedStrand.substring(i, i + patternLen).equals(pattern)) {
            patternCount++;
        }
    }

Also notice that, the loop will not really run till the end of passedStrand string. You just need to run till the index, from where there is a possibility of complete occurrence of pattern string.


This method creates extra String objects inside the for loop, due to substring invocation. You can avoid this by using String#indexOf method. You just keep on finding the next index of the pattern in the passedStrand, till you get the index as -1, where it ends.

int startIndex = passedStrand.indexOf(pattern);

while (startIndex != -1) {
    patternCount++;
    startIndex = passedStrand.indexOf(pattern, startIndex + pattern.length());
}

If efficiency is not a big concern, then regex is really sweet. See how:

public int getCount(String pattern) {
    int patternCount = 0;

    Matcher matcher = Pattern.compile(pattern).matcher(passedStrand);

    while (matcher.find()) {
        patternCount++;
    }
    return patternCount;
}
Rohit Jain
  • 209,639
  • 45
  • 409
  • 525
  • This method works so thank you for helping me out but the logic is so complicated. The way you came up with the code is very impressive. I'm sitting here trying to understand it all. – user122947 Jan 27 '14 at 17:18
  • @user122947 updated the last approach. There was a bit of issue. But now it's fine. You can go with that, it's a better approach. – Rohit Jain Jan 27 '14 at 17:23
  • @user122947 Added another regex based approach. – Rohit Jain Jan 27 '14 at 17:27