3

I have an arraylist<string> of words. I sort it using Collections.sort(wordsList);

I'm using this array for an auto-suggest drop down box, so that when the user is typing in a letter, they are given a list of suggestions similar to what they are typing in.

How do I go about searching this array for a prefix of string, say the user types in "mount" and the array contains the word "mountain", how can I search this array and return similar values.

Here's my code so far:

public List<Interface> returnSuggestedList(String prefix) {
    String tempPrefix = prefix;
    suggestedPhrases.clear();
    //suggestedPhrases = new ArrayList<Interface>();
    //Vector<String> list = new Vector<String>();
    //List<Interface> interfaceList = new ArrayList<Interface>();
    Collections.sort(wordsList);
    System.out.println("Sorted Vector contains : " + wordsList);
    int i = 0;
    while (i != wordsList.size()) {
        int index = Collections.binarySearch(wordsList, prefix);
        String tempArrayString = wordsList.get(index).toString();
        if (tempArrayString.toLowerCase().startsWith(prefix.toLowerCase())) {
            ItemInterface itemInt = new Item(tempArrayString);
            suggestedPhrases.add(itemInt);
            System.out.println(suggestedPhrases.get(i).toString());
            System.out.println("Element found at : " + index);
        }
        i++;
    }
    return suggestedPhrases;
}
Community
  • 1
  • 1
EI756
  • 557
  • 1
  • 8
  • 19

6 Answers6

3

The most basic approach would be

List<String> result = new ArrayList<String>();
for(String str: words){
  if(str.contains(keyword){
    result.add(str);
  }
}

You can improve this version, if you only concern with startWith instead of contains then you can distribute words in a HashMap and you will have narrowed search

jmj
  • 237,923
  • 42
  • 401
  • 438
2

For this task, there are better data structures than a sorted array of strings. You might look e.g. at DAWG (Directed acyclic word graph).

Jiri Kriz
  • 9,192
  • 3
  • 29
  • 36
1

If wordList is fixed (does not change from one method call to the other) you should sort it somewhere else, because sort is costly, and store it in lowercase.

In the rest of the method you would do something like:

List<String> selected = new ArrayList<String>();

for(String w:wordList){
    if(w.startsWith(prefix.toLower())) // or .contains(), depending on 
        selected.add(w);     // what you want exactly 
}

return selected;
Matyas
  • 13,473
  • 3
  • 60
  • 73
1

As @Jiri says you can use a DAWG, but if you don't want to go that far you can do some simple and useful things.

Make use of the sorting

  • If you want to sort the array of words do it previously. don't sort it each time
  • As it's sorted you can find the first and the last word in the list that are matches. The use list.subList(from, to) to return sublist. It's a little more optimal that adding each one.

Use a pre-sorted structure

  • Use a TreeSet<String> for storing the strings (the will be sorted internally).
  • Then use treeSet.subSet(from, true, to, false);

Where from is the prefix and to is the "prefix plus one char". By example if you're looking for abc, to must be abd. If you don't want to make that char transformation anyway you can ask for treeSet.headSet(from) and iterate over it until there are no more prefixes.

This is specially useful if you read more than you write. Maybe ordering strings is a little expensive but once ordered you can find them very fast (O(log n)).

Case insensitive comparing

You can provide a Comparator<String> to the tree set in order to indicate how it must order the strings. You cam implement it or maybe there are a prebuild case-insensitive comparator over there.

Anyway its code should be:

int compare(String a, String b) {
   return a.toLowerCase().compareTo(b.toLowerCase());
}
helios
  • 13,574
  • 2
  • 45
  • 55
1

Also see the trie data structure. This question has useful info. I should think its getPrefixedBy() will be more efficient than anything you can roll by hand quickly.

Of course, this will work for prefix searches only. Contains search is a different beast altogether.

Community
  • 1
  • 1
Miserable Variable
  • 28,432
  • 15
  • 72
  • 133
0

Here is a similar example:

-> http://samuelsjoberg.com/archive/2009/10/autocompletion-in-swing

Tobias
  • 9,170
  • 3
  • 24
  • 30