2

In one of these HackerRank Java challenges, there is a problem which is defined as:

The problem

We define the following terms:

  • Lexicographical Order, also known as alphabetic or dictionary order, orders characters as follows: A < B < ...< Y < Z < a < b ... < y < z

  • A substring of a string is a contiguous block of characters in the string. For example, the substrings of abc are a, b, c, ab, bc, and abc.

Given a string, s, and an integer, k, complete the function so that it finds the lexicographically smallest and largest substrings of length k.

Here is my (not fully working) solution:

My code

import java.util.*;

public class stringCompare {

    public static String getSmallestAndLargest(String s, int k) {
        String smallest, largest, temp;

        /* Initially, define the smallest and largest substrings as the first k chars */
        smallest = s.substring(0, k);
        largest = s.substring(0, k);

        for (int i = 0; i <= s.length() - k; i++) {
            temp = s.substring(i, i + k);
            for (int j = 0; j < k; j++) {

                /* Check if the first char of the next substring is greater than the largest ones' */
                if (temp.charAt(j) > largest.charAt(j)) {
                    largest = s.substring(i, i + k);
                    break;      
                }

                /* Check if the first char of the next substring is less than the smallest ones' */
                else if (temp.charAt(j) < smallest.charAt(j)) {
                    smallest = s.substring(i, i + k);
                    break;
                } 

                /* Check if the first char of the next substring is either equal to smallest or largest substrings' */
                else if (temp.charAt(j) == smallest.charAt(j)
                        || temp.charAt(j) == largest.charAt(j)) {
                    // If so, move to the next char till it becomes different
                } 

                /* If the first of char of the next substring is neither of these (between smallest and largest ones')
                    skip that substring */ 
                else {
                    break;
                }
            }
        }

        return smallest + "\n" + largest;
    }

    public static void main(String[] args) {
        String s;
        int k;
        try (Scanner scan = new Scanner(System.in)) {
            s = scan.next();
            k = scan.nextInt();
        }

        System.out.println(getSmallestAndLargest(s, k));
    }
}

According to the HackerRank, this code fails for 2 out of 6 cases. One is as follows:

ASDFHDSFHsdlfhsdlfLDFHSDLFHsdlfhsdlhkfsdlfLHDFLSDKFHsdfhsdlkfhsdlfhsLFDLSFHSDLFHsdkfhsdkfhsdkfhsdfhsdfjeaDFHSDLFHDFlajfsdlfhsdlfhDSLFHSDLFHdlfhs
30

The expected output is:

ASDFHDSFHsdlfhsdlfLDFHSDLFHsdl
sdlkfhsdlfhsLFDLSFHSDLFHsdkfhs

But mine becomes:

DFHSDLFHDFlajfsdlfhsdlfhDSLFHS
sdlkfhsdlfhsLFDLSFHSDLFHsdkfhs

At debug mode, I found that the smallest substring was correct until the 67th iteration (i). I don't know why it changes to a wrong one at that step but it does.

Can anyone help me on that, please?

Thanks!

Said
  • 187
  • 1
  • 2
  • 11
  • 1
    I understand that you're doing this as an exercise, but as a tip, `String` implements `compareTo` which you can use to get the smallest or largest string. It is also a `@HotSpotIntrinsicCandidate` so it should generally preform better too (I know these online challenges sometimes rank by performance). – Jorn Vernee Feb 16 '18 at 12:38
  • I didn't know that method until now, actually. Thank you for the suggestion. Besides, I don't care about those rules anymore. Some of the users here tells me that those tasks on HackerRank require obsolete ways to solve problems, like you do, too :) I'm trying to learn as much as I can. – Said Feb 16 '18 at 12:57
  • Your mistake is that at position i + j when the current string starting at s is still equal to the smallest or largest so far and you encounter a character that is less than the largest or greater than the smallest you reset these with the current string at s. – laune Feb 16 '18 at 13:09

3 Answers3

3

The problem is that you are trying your comparison for both largest and smallest in a single loop. More specifically, this line is problematic:

else if (temp.charAt(j) == smallest.charAt(j)
      || temp.charAt(j) == largest.charAt(j)) {
    // If so, move to the next char till it becomes different
}

You may want to continue the loop on j to detect the smallest substring, but break out of the loop on j to detect the largest substring. That's why the two checks should be done independently of each other.

A few minor points to consider:

  • You do not need to write largest = s.substring(i, i + k), because it is the same as largest = temp; same goes for smallest.
  • You do not need the nested loop at all compareTo performs lexicographic comparison for you.

Essentially, your program could be reduced to this:

largest = smallest = s.substring(0, k);
for (int i = 1 ; i <= s.length() - k; i++) {
    temp = s.substring(i, i + k);
    if (temp.compareTo(largest) > 0) {
        largestt = temp;
    } else if (temp.compareTo(smallest) < 0) {
        smalles = temp;
    }
}

Note that the loop can start from i = 1 because you used s.substring(0, k) to initialize both largest and smallest.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • If you set largest, you don't need to check for smaller => else++ – laune Feb 16 '18 at 12:21
  • 1
    And the loop should start with i = 1 – laune Feb 16 '18 at 12:22
  • @laune Thank you for the comment! – Sergey Kalinichenko Feb 16 '18 at 12:24
  • YAW. Should we start thinking about optimization now? – laune Feb 16 '18 at 12:25
  • Honestly, I wasn't aware of `compareTo` method. Seems like my code is just a waste when we have `compareTo` method. Thanks for this! But I still struggle to grasp the mistake on making a comparison at the same time. Could you explain it a little clear please? :) – Said Feb 16 '18 at 12:52
  • 1
    @SaidBuyukarslan When you are comparing to two strings at a time, there may be a situation when you wish to make a decision on one of the strings, and keep going on the other one. Henry gave you a pretty good example: when your `temp` is `"bc"`, the largest is `"bx"`, and the smallest is `"ad"`, you want to stop considering `"bc"` for a potential smallest string after the first iteration, because `a` is ahead of `b`. However, since `largest` starts in `b`, the iteration continues to the second index, at which point the algorithm incorrectly concludes that "`bc`" is less than `"ad"`. – Sergey Kalinichenko Feb 16 '18 at 13:40
2

You cannot compare a certain substring to both the smallest so far and the largest so far at the same time. Especially the condition

temp.charAt(j) == smallest.charAt(j)
                    || temp.charAt(j) == largest.charAt(j)

is problematic. Take for example

smallest   ad
largest    bx
temp       bc

In this example your code will conclude that bc is smaller than ad

Henry
  • 42,982
  • 7
  • 68
  • 84
  • Ah, now I see why my code breaks down. When it catches same starting letters, it deviates in following characters. But I still wonder, is double comparison at the same time would be wrong for the cases like that, or is there any other reason? – Said Feb 16 '18 at 12:50
  • 1
    the two comparisons may need to go to different length. So it does not make sense to combine them in one loop. (You could still do that by using some flags indicating when a comparison is finished but this makes your code much less obvious). – Henry Feb 16 '18 at 12:55
  • Hmm, boundary conditions, I guess and yeah, I thought the same flag-check method at first hand but as you've said, it would make the code much complex and less readable for nothing. Thanks for your time! – Said Feb 16 '18 at 13:02
1

I propose a simple optimisation: a quick peek at the first characters.

largest = smallest = s.substring(0, k);
for (int i = 1; i <= s.length() - k; i++) {
    if (s.charAt(i) > largest.charAt(0) ){
      largest = s.substring(i, i + k);
      continue;
    }
    if (s.charAt(i) < smallest.charAt(0) ){
      smallest = s.substring(i, i + k);
      continue;
    }

    if (s.charAt(i) == largest.charAt(0) ){
        String temp = s.substring(i, i + k);
        if( temp.compareTo(largest) > 0) {
            largest = temp;
            continue;
        }
    }
    if (s.charAt(i) == smallest.charAt(0) ){
        String temp = s.substring(i, i + k);
        if( temp.compareTo(smallest) < 0) {
            smallest = temp;
        }
    }
}

For the example, comparisons drop from 222 to 14.

laune
  • 31,114
  • 3
  • 29
  • 42
  • I'm sorry but what's the point on `s.charAt(i) >= largest.charAt(0)` check? This condition is already being tested by `temp.compareTo(largest) > 0` expression, isn't it? – Said Feb 16 '18 at 13:14
  • 1
    It's a quick (!) check whether the compareTo has a chance. I don't know whether the test cases contain a string with a zillion characters; if all are fairly short or they don't have a time limit you may not need an optimization. – laune Feb 16 '18 at 14:33
  • @SaidBuyukarslan Did the program need to be faster for passing the online test? If so: what was the length of the longest string? – laune Feb 16 '18 at 14:44
  • So your point is `compareTo` takes the whole substring first, then examines it but checking only it's first character before that process makes this search way faster. I think it'll be better for me to try to understand deeply the `compareTo`'s check method. – Said Feb 16 '18 at 14:44
  • 1
    @laune I strongly doubt that this optimization would give you much help: it lets you skip `compareTo` in situations when it would exit immediately anyway, because the algorithm stops at the first difference. Moreover, you've already "paid" for `substring`, which is by far the most expensive operation in your loop, because it's the only one requiring memory allocation. – Sergey Kalinichenko Feb 16 '18 at 14:45
  • @laune This one wasn't an online test, seems to be considered as an easy level exercise. But yeah, it had some long test cases like these two: [case1](https://hr-testcases-us-east-1.s3.amazonaws.com/8234/input05.txt?AWSAccessKeyId=AKIAJ4WZFDFQTZRGO3QA&Expires=1518799532&Signature=9GhJCJZuib4fN4UixOlYULcyEss%3D&response-content-type=text%2Fplain) [case2](https://hr-testcases-us-east-1.s3.amazonaws.com/8234/input04.txt?AWSAccessKeyId=AKIAJ4WZFDFQTZRGO3QA&Expires=1518799548&Signature=IXOrjRmJQhVhU6RudGOGsibnDgE%3D&response-content-type=text%2Fplain)Sorry for the links but the strings were too big – Said Feb 16 '18 at 14:58
  • @dasblinkenlight I already did have this better (though longer) version. - But substr doesn't need to allocate memory. It creates a String object, using the same char[], just setting start and end offsets, see the private constructor String( int, int, char[] ) - Or was this changed in some later Java version? It's difficult to keep abreast of these changes. – laune Feb 16 '18 at 15:26
  • @laune Although `substring` does not allocate memory for `String`'s content, JVM must still allocate space for the `String` object that represents substring. – Sergey Kalinichenko Feb 16 '18 at 15:28
  • @dasblinkenlight That's something that will be extremely quick, just taking it from the "Eden" part of the "Young" section. – laune Feb 16 '18 at 15:32
  • @laune My point is that avoiding `compareTo` is a premature micro-optimization in comparison to a constructor call that always allocates memory. – Sergey Kalinichenko Feb 16 '18 at 15:34
  • 1
    @dasblinkenlight And how do you propose to call compareTo without reallocating a new object at every character position of s? - Note that the improved version doesn't allocate an object until compareTo has to be called. – laune Feb 16 '18 at 15:37
  • @laune The modified version is free from the problem I am talking about (allocating before comparing the initial character). The only way to avoid both the call of `compareTo` and the allocation is to roll your own `compareTo` that takes string offsets and lengths. – Sergey Kalinichenko Feb 16 '18 at 15:40