6

Currently I am using two nested for loop to generate all the substrings of a string. I heard about Suffix Tree but AFAIK Suffix Tree generates suffix not the substrings. Following is the code which currently i am using-

        String s = "abacbccca";
        int l = s.length();
        for (short c = 0; c < l; c++) {
            for (short r = 0; r < l - c; r++){
                Sting ss=s.substring(c, c + r + 1);                                        
                if(!t.contains(ss));
                    t.add(ss);
            }
        }

I want a way which can generate all the substrings in less than O(n^2). Although by seeing my code, anyone can suggest me that it's impossible, as i am adding every substring to a list. But my objective is not to store all the substrings, my objective is to find a string which is lexicographically ith smallest string.

ravi
  • 6,140
  • 18
  • 77
  • 154
  • If you are interested in the lexicographically smallest string only, then I am afraid Niklas B. below is right. But if you are interested in a O(n)-size data structure that allows you to access the i-th smallest string for any given i, as your question seems to suggest, then perhaps my answer to [this question](http://stackoverflow.com/q/9389681/777186) helps. – jogojapan Apr 11 '12 at 13:55
  • @jogojapan: Yeah.... that what i want... Thank you so so so much... – ravi Apr 11 '12 at 14:28

1 Answers1

14

There are O(n^2) different substrings, so no algorithm can enumerate them all with a complexity better than O(n^2)!

The problem of finding the lexicographically smallest substring is a totally different one, though. It's always the empty string, so that's an O(1) operation (and a very pointless one, too).

Niklas B.
  • 92,950
  • 18
  • 194
  • 224
  • Okay.... Can you please elaborate more about finding `lexicographically` smallest substring? – ravi Apr 11 '12 at 13:14
  • @Ravi: Typically, when lexicographically comparing strings, [if one of the strings is a prefix of the other, it's considered smaller](http://en.wikipedia.org/wiki/Lexicographical_order#Ordering_of_sequences_of_various_lengths). So the smallest substring is always the empty string: `String smallestSubstring(String x) { return ""; }` By the way, I wrote that in my answer as well. – Niklas B. Apr 11 '12 at 13:20
  • If `null` is not allowed, then??? .. Let say we have string abacba, it has these unique substrings `{a,ab,aba,abac,abacb,abacba,b,ba,bac,bacb,bacba,ac,acb,acba,c,cb,cba}` . Now lexicographically smallest string is the above set is `a`. – ravi Apr 11 '12 at 13:23
  • @Ravi: No, it also has the substring `""` (empty string, has nothing to do with `null`!), which is the smallest. If you exclude the empty string (which wouldn't be consequent), then the smallest substring is always the smallest substring of length 1. – Niklas B. Apr 11 '12 at 13:23
  • But i am saying about the `lexicographically smallest substring`. In the above example, we have three smallest substrings of length 1, which are `a`,`b` and `c`. But lexicographically smallest substring is `a` not `b` or `c`. – ravi Apr 11 '12 at 13:27
  • @Ravi: I can only repeat myself: "the smallest substring is always the smallest substring of length 1". In this case, `{a,b,c}` are the substrings of length 1 and `a` is the smallest of them. Sorry, but you seem to be a bit confused about this. What are you trying to achieve? I don't see any real world usage of this. – Niklas B. Apr 11 '12 at 13:28
  • @NiklasB. Sorry, Niklas but I found your answer quite funny! Thanks for the chuckle. – HiChews123 Oct 18 '16 at 06:13