1

I have the following code:

String[] names = {"aa", ..........., "bb"};
for (int i = 0; i < names.length; i++) {
    if (names[i].toLowerCase().startsWith(query.toLowerCase()))
        c.addRow(new Object[]{i, names[i]});
}

Since the array names can be long, I'm wondering what is the best way to write this code from performance point of view. In this way the loop is O(N). Is there any java data structure to do the same thing in a quicker way?

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
greywolf82
  • 21,813
  • 18
  • 54
  • 108
  • Is the array "names" lexicographically ordered? – pokeRex110 Nov 10 '15 at 17:40
  • 1
    If `names` is sorted, you can do a binary search for the first correct value and then loop forwards and backwards in the array to find the rest of the matches, and that should be about as efficient as possible – phflack Nov 10 '15 at 17:46

2 Answers2

1

How about a Prefix tree?

refer to:

How to create a simple prefix index in Java?

http://www.toptal.com/java/the-trie-a-neglected-data-structure

Community
  • 1
  • 1
柯鴻儀
  • 613
  • 1
  • 10
  • 25
1

You can order the names, use binary search with a case-insensitive comparator to find the insertion spot for the potential prefix, and walk the array to catch all other words with the same prefix:

// At preparation time
Arrays.sort(names, String.CASE_INSENSITIVE_ORDER);
...
// At query time
int pos = Arrays.binarySearch(names, query, String.CASE_INSENSITIVE_ORDER);
if (pos < 0) {
    pos = -(pos+1);
}
while (pos < names.length) {
    if (names[pos].toLowerCase().startsWith(query.toLowerCase())) {
        c.addRow(new Object[]{pos, names[pos]});
        pos++;
    } else {
        break;
    }
}

Arrays.binarySearch finds an insertion point. If the name matches, pos would be non-negative; otherwise, you need to convert it to a valid index with this expression: -(pos+1) If query is a proper prefix, its insertion point would be in front of the first name with a matching prefix. Since names is sorted, all entries with the same prefix would be next to each other. That's why you can walk the list linearly until first mismatch, and stop at that point.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523