0

I have a array that contains versions of an article. I want to implement a binary search function that return the first version that contains a given string. for example the will return 4 for the next array :

 array[0] = my name is foo and I'm a programmer.
 array[1] = my name is bar and I'm a programmer.
 array[2] = my name is foo and I'm a programmer.
 array[3] = my name is and I'm a programmer.
 array[4] = my name is foo and I'm a programmer.
 array[5] = my name is foo.

Here what I have done so far :

private static Revision binarySearch(Revision[] array, int left, int right,
    String value) throws IOException {

    if (left > right)

       //I don't know what to put here;

    int middle = (left + right) / 2;
    Revision rv = array[middle];
    String text = rv.getText();
    if (!containsTemplate(text, value))
        return binarySearch(array, left, middle - 1, template);
    else
        return binarySearch(array, middle + 1, right, template);

    }
Hunsu
  • 3,281
  • 7
  • 29
  • 64
  • 4
    Binary search doesn't do that, unless there's a bunch of things you're not telling us. – hobbs Aug 30 '14 at 21:19
  • http://stackoverflow.com/questions/3183582/what-is-the-fastest-substring-search-algorithm – Alan Stokes Aug 30 '14 at 21:21
  • I want a Java function that to search for a given string in the Revisions of a wikipedia article. An existing one is http://wikipedia.ramselehof.de/wikiblame.php?user_lang=en&lang=en&project=wikipedia&article=&needle=&skipversions=0&ignorefirst=0&limit=500&offtag=30&offmon=8&offjahr=2014&searchmethod=int&order=desc&user= – Hunsu Aug 30 '14 at 21:23
  • @AlanStokes That question isn't related to mine. – Hunsu Aug 30 '14 at 21:25
  • 1
    Binary search requires a sorted input. For example `TreeSet` or `TreeMap`. – Elliott Frisch Aug 30 '14 at 21:32
  • And even if you sort it, binary search still won't help you, since the property you're interested in has nothing to do with the ordering relation. – user2357112 Aug 30 '14 at 21:35
  • @ElliottFrisch Why I can't use the binary search for my input? I search the value in the middle, if it contains the string so I search the half right otherwise I search the left half. Perhaps I shouldn't call it binary search but it split the array to search in two at every step. – Hunsu Aug 30 '14 at 21:39
  • @user230137 And how does that find the first occurrence? – Elliott Frisch Aug 30 '14 at 21:41
  • That's my question. I have a linear algorithm that run through 0 to array.length. When a value at an index i doesn't have the string it return i -1. Actually I use that algorithm when the array have only 10 elements. – Hunsu Aug 30 '14 at 21:45
  • @user230137 In your example, what string are you searching for? foo? – Paul Boddington Aug 30 '14 at 23:08
  • No I'm searching for programmer. – Hunsu Aug 30 '14 at 23:18
  • @user230137 So why is the first appearance 4 and not 0? Are the strings in your array in reverse-chronological order? – Paul Boddington Aug 30 '14 at 23:44
  • @pbabcdefp That's right! – Hunsu Aug 31 '14 at 07:59

1 Answers1

-1

Assuming this is sorted data so binary search is appropriate:

1) Binary search for an instance of the value you're looking for.

2) Then iterate back toward the start to determine whether this is the first instance, or to find that first instance if it isn't.

(In other words, the same way you'd solve it by hand.)

keshlam
  • 7,931
  • 2
  • 19
  • 33