You have some right ideas here, but the algorithm has bugs.
I want to teach you to think in terms of a loop invariant: a statement that is always true for each loop iteration and also true when the loop exits. If you choose the invariant well, the exit condition is also the answer or perhaps just a trivial step away from the answer.
In this manner you get a loop that you're confident will work before the code runs. If something doesn't work as expected, you have a rigorous way to think about what's going wrong. Posting to SO is not this!
E.g. a good clue that your code is broken is to give it a string of length 1. Your code outputs nothing. If you were thinking about loop invariants, you'd be unlikely to end with this problem.
Here is a proposal for a loop invariant for your code:
Assume: a[0] exists
For the substring that starts at 0 and ends at k,
runLen is the length of the final run ending at k
Any runs prior to final have been output.
runChar is the character of the run
How did I come up with this? The desired outputs are run length and run character. This makes these quantities great candidates for variables in your program. A prefix of the input is another idea that appears frequently in a good invariant. When you guess wrong about quantities to put in the invariant, it will show up as an inability to make it true either initially or after each loop iteration or excessively complex reasoning to make it true. The gap between what you can make true and what the invariant says will usually point to something to add or change. The only way to get better at choosing invariants is to practice. Since a good invariant is equivalent to writing a good loop, this is a great place to focus your studies of programming technique.
Back to the invariant, When k
has reached a.length - 1
with the invariant still true, all we need to do is output the last run, and we have the answer.
Now think about how to establish the invariant with k
at 0
. In this case, he current run is just the first character:
runLen = 1 // the single character a[0]
runChar = a[0]
Now think about how to re-establish the invariant after k
has been increased by 1. Well, there are two cases. Either the new a[k]
matches runChar
or it doesn't. We need to re-establish the invariant for each case. So for the loop we end up with:
for k = 1 to a.length - 1
// Here the k'th character has violated the invariant
if a[k] == runChar
// new k'th character matches the run character
++runLen // run has grown by one.
// run char has not changed
else
// new k'th character starts a new run
Output runLen, runChar // re-establish invariant that previous runs are output
runLen = 1 // re-establish invariant about run length
runChar = a[k] // re-establish invariant about the run character
// Here the invariant is correct again!
As noted above, after the loop exits, the only work left is to get from the invariant with k == a.length - 1
to the result you require. All that's missing is to output the final run:
Output runLen, runChar
Coding this in Java is pretty easy.
public class RLE {
public static void main(String [] args) {
String a = "aaabbcccd";
StringBuilder sb = new StringBuilder();
// Establish invariant for k == 0.
int runLen = 1;
char runChar = a.charAt(0);
// Advance k, re-establishing invariant for each step.
for (int k = 1; k < a.length(); k++) {
if (a.charAt(k) == runChar) {
++runLen;
} else {
sb.append(runChar).append(runLen);
runLen = 1;
runChar = a.charAt(k);
}
}
// Output the final run to get from invariant to required answer.
sb.append(runChar).append(runLen);
System.out.println(sb.toString());
}
}