Compress a string "aaabbccccd" to "a3b2c4d"

Question

I am trying to compress a String. For example, if the user input is "aaabbcccd" - the output should count the letters and if count is above 1, print the letter, then the number: a3b2c3d.

This is what I've come up with but my output in the console is still incorrect.

public class Compress { 
public static void main(String[] args) {

    String a = IO.readString();
    int aLength = aLength.length();
    int repeatedStart = 0;
    int repeatedLength = 1;
    int currentStart = 0;
    int currentLength = 1;

    for(int i = 1; i < aLength; i++) {
        if(a.charAt(i) == a.charAt(i-1)) {
            currentLength+=1;
            if(currentLength > repeatedLength) {
                repeatedStart = currentStart;
                repeatedLength = currentLength;
                IO.outputStringAnswer(repeatedLength+""+a.charAt(repeatedStart));
            }
        } else { 
            currentStart = i;
            currentLength = 1;
            IO.outputStringAnswer(currentLength+""+a.charAt(currentStart));
        }
    }
}
}

My output in the console is:

--------MacBook-Pro:cs ----------$ java Compress
aaaabbcc
RESULT: "2a"
RESULT: "3a"
RESULT: "4a"
RESULT: "1b"
RESULT: "1c"

I know that my outputStringAnswer is definitely in the wrong place. Any help would be greatly appreciated.

Thank you

Isn't your terminal flare (forgot the name) overly long? :) I mean, doesn't the line break a little too often? :p As for something constructive: Find out what your program is doing with some debug statements (print). — keyser, Apr 05 '14 at 19:05

kiruwka · Answer 1 · 2014-04-05T19:32:36.377

Simplified version of your code that actually works :

public static void main(String... args) {
    String a = "aaabbccccd";
    int currentLength = 1;

    StringBuilder result = new StringBuilder(); // empty string
    for (int i = 1; i < aLength; i++) {
        if (a.charAt(i) == a.charAt(i - 1)) {
            currentLength++;
        } else {
            System.out.println(currentLength + "" + a.charAt(i - 1));
            result.append(a.charAt(i - 1))
                .append(currentLength > 1 ? currentLength : "");
            currentLength = 1; // reset the length
        }
    }

    // print last one
    System.out.println(currentLength + "" + a.charAt(a.length() - 1));
    result.append(a.charAt(a.length() - 1))
        .append(currentLength > 1 ? currentLength : "");
    System.out.println("result = " + result.toString());
}

outputs :

3a
2b
4c
1d

result = a3b2c4d

What if I have string like String a = "aaabbcccdaa"; and want the output as 5a2b3c2a — Veswanth, Oct 09 '19 at 10:59

score 1 · Answer 2 · answered Apr 05 '14 at 19:26

Your if-else statement needs little modification. Try to run following code.

public static void main(String[] args) {
    String a="aaaaaaaabbbbbbcccccccccccccd";
    char first=a.charAt(0);
    int recur=0;

    StringBuilder res=new StringBuilder();
    for (int i = 1; i <a.length(); i++) {
        if(first==a.charAt(i)){
           recur++;
        }
        else{
         if (recur>0)
         res.append(first).append(recur);
         recur=0;
         first=a.charAt(i);
        }
    }
    if (recur>0)
        res.append(first).append(recur);
    else
        res.append(first);
    System.out.println(res);
}

Gene · Answer 3 · 2014-04-06T18:37:07.497

You have some right ideas here, but the algorithm has bugs.

I want to teach you to think in terms of a loop invariant: a statement that is always true for each loop iteration and also true when the loop exits. If you choose the invariant well, the exit condition is also the answer or perhaps just a trivial step away from the answer.

In this manner you get a loop that you're confident will work before the code runs. If something doesn't work as expected, you have a rigorous way to think about what's going wrong. Posting to SO is not this!

E.g. a good clue that your code is broken is to give it a string of length 1. Your code outputs nothing. If you were thinking about loop invariants, you'd be unlikely to end with this problem.

Here is a proposal for a loop invariant for your code:

Assume: a[0] exists
For the substring that starts at 0 and ends at k,
  runLen is the length of the final run ending at k
  Any runs prior to final have been output.
  runChar is the character of the run

How did I come up with this? The desired outputs are run length and run character. This makes these quantities great candidates for variables in your program. A prefix of the input is another idea that appears frequently in a good invariant. When you guess wrong about quantities to put in the invariant, it will show up as an inability to make it true either initially or after each loop iteration or excessively complex reasoning to make it true. The gap between what you can make true and what the invariant says will usually point to something to add or change. The only way to get better at choosing invariants is to practice. Since a good invariant is equivalent to writing a good loop, this is a great place to focus your studies of programming technique.

Back to the invariant, When k has reached a.length - 1 with the invariant still true, all we need to do is output the last run, and we have the answer.

Now think about how to establish the invariant with k at 0. In this case, he current run is just the first character:

runLen = 1   // the single character a[0]
runChar = a[0]

Now think about how to re-establish the invariant after k has been increased by 1. Well, there are two cases. Either the new a[k] matches runChar or it doesn't. We need to re-establish the invariant for each case. So for the loop we end up with:

for k = 1 to a.length - 1

  // Here the k'th character has violated the invariant

  if a[k] == runChar
    // new k'th character matches the run character
    ++runLen // run has grown by one.
    // run char has not changed
  else
    // new k'th character starts a new run
    Output runLen, runChar // re-establish invariant that previous runs are output
    runLen = 1 // re-establish invariant about run length
    runChar = a[k] // re-establish invariant about the run character

  // Here the invariant is correct again!

As noted above, after the loop exits, the only work left is to get from the invariant with k == a.length - 1 to the result you require. All that's missing is to output the final run:

Output runLen, runChar

Coding this in Java is pretty easy.

public class RLE {

     public static void main(String [] args) {
         String a = "aaabbcccd";
         StringBuilder sb = new StringBuilder();

         // Establish invariant for k == 0.
         int runLen = 1;
         char runChar = a.charAt(0);

         // Advance k, re-establishing invariant for each step.
         for (int k = 1; k < a.length(); k++) {
             if (a.charAt(k) == runChar) {
                 ++runLen;
             } else {
                 sb.append(runChar).append(runLen); 
                 runLen = 1;
                 runChar = a.charAt(k);
             }
         }
         // Output the final run to get from invariant to required answer.
         sb.append(runChar).append(runLen);     
         System.out.println(sb.toString());
     }
}

score 0 · Answer 4 · answered Apr 05 '14 at 19:14

Your algorithm is not quite right. You need two nested for loops. The basic approach is:

For each character in the string, calculate how many characters immediately to its right are the same character, and break out once you find a difference

//basic pseudo-code, untested
for(int i = 0; i < someString.length() - 1; i++) { 
    char next = someString.charAt(i+1);
    int counter = 1;
    while(next == c) { 
        //increment counter, etc
    }
}

There is no need for an if/else to special case length == 1

Compress a string "aaabbccccd" to "a3b2c4d"

4 Answers4