Suppose I have a string:

  interpreter, interprete, interpret

What I want is to get the smallest matching string from the strings above, which should be:

  interpret

Is this possible in Java? If so, can somebody help me work through this problem? Thanks.

  • Are you looking for the word stem specifically, or are you just trying to get the smallest common set of starting characters? Stemming in itself is an art (in Java), so the distinction is important. – radimpe Mar 11 '14 at 08:50
  • Do you want to handle a scenario such as inter, inteper, inteq as well? There the answer would be inte, which is not one of the elements. So should the answer be a common substring of the elements rather than an actual element? – Kick Mar 11 '14 at 08:57

4 Answers

If I understand you correctly, you want the shortest word in an input string s that contains a target string t = "interpret".

So first, split the string into words, e.g., using s.split("\\s*,\\s*"), then call w.contains(t) on each word w to check whether it contains the target. Choose the shortest word for which contains returns true.
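
The split-and-filter steps above could be sketched as follows. This is only a minimal illustration: the class name ShortestContaining and the helper shortestContaining are hypothetical, and returning an Optional for the "no word matches" case is an assumption that the answer does not specify.

```java
import java.util.Optional;

class ShortestContaining {

    // Hypothetical helper: returns the shortest word in the comma-separated
    // input that contains the target, or an empty Optional if none does.
    static Optional<String> shortestContaining(String input, String target) {
        String shortest = null;
        // Split on commas, ignoring any whitespace around them.
        for (String word : input.trim().split("\\s*,\\s*")) {
            if (word.contains(target)
                    && (shortest == null || word.length() < shortest.length())) {
                shortest = word;
            }
        }
        return Optional.ofNullable(shortest);
    }

    public static void main(String[] args) {
        String s = "interpreter, interprete, interpret";
        System.out.println(shortestContaining(s, "interpret").orElse("(no match)"));
    }
}
```

Note that this approach requires knowing the target t in advance; it does not discover "interpret" from the input on its own.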

gexicide

Try this:

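import java.util.LinkedList;
import java.util.List;

public class SmallestMatch {
    public static void main(String[] args) {
        List<String> all = new LinkedList<String>();
        all.add("interpreter");
        all.add("interprete");
        all.add("interpret");
        // Start with the first word and shrink to any word it contains.
        String small = all.get(0);
        for (String string : all) {
            if (small.contains(string)) {
                small = string;
            }
        }
        System.out.println(small);
    }
}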

Let me know whether this satisfies your requirement.

//-----------------Edited One--------------------------

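import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

public class SmallestMatches {
    public static void main(String[] args) {
        List<String> all = new LinkedList<String>();
        // A LinkedHashSet drops duplicate results and keeps insertion order.
        Set<String> result = new LinkedHashSet<String>();
        all.add("interpreter");
        all.add("interprete");
        all.add("interpret");
        all.add("developed");
        all.add("develops");

        for (int i = 0; i < all.size(); i++) {
            // Start from word i and shrink to any later word it contains.
            String small = all.get(i);
            for (int j = i; j < all.size(); j++) {
                if (small.contains(all.get(j))) {
                    small = all.get(j);
                }
            }
            result.add(small);
        }
        for (String string : result) {
            System.out.println(string);
        }
    }
}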
user2314868

You need to compare the characters of all the strings one by one, maintaining an array of boolean flags for every pair of strings. Then check up to what length the boolean arrays agree, and take a substring of that length from any of the strings. I hope this helps.
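
One way to sketch the character-by-character comparison is below. It is a simplification, not the exact bookkeeping described above: instead of keeping a boolean array per pair of strings, it keeps a running common prefix and shortens it against each word. The class name CommonPrefix and the helper longestCommonPrefix are hypothetical, and the sketch assumes that "matching" means a shared prefix.

```java
class CommonPrefix {

    // Hypothetical helper: returns the longest prefix shared by every word,
    // or the empty string if there are no words or no shared prefix.
    static String longestCommonPrefix(String[] words) {
        if (words.length == 0) {
            return "";
        }
        String prefix = words[0];
        for (String word : words) {
            // Count how many leading characters the prefix and this word share.
            int i = 0;
            int limit = Math.min(prefix.length(), word.length());
            while (i < limit && prefix.charAt(i) == word.charAt(i)) {
                i++;
            }
            prefix = prefix.substring(0, i);
        }
        return prefix;
    }

    public static void main(String[] args) {
        String[] words = "interpreter, interprete, interpret".trim().split("\\s*,\\s*");
        System.out.println(longestCommonPrefix(words)); // prints "interpret"
        // The scenario from the comments: the result is not one of the elements.
        System.out.println(longestCommonPrefix(new String[] {"inter", "inteper", "inteq"})); // prints "inte"
    }
}
```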

What you are looking for is called a lemmatizer/stemmer for Java.

There are a few of them (I have not used any), but you may want to search for and try some:

Snowball

Lemmatization

You should test each of them, because some (Snowball, for example) will do:

 Community     
 Communities  --> Communiti // this is obviously wrong
Eugene