0

I have a piece of code which contains

    Collection<String> tok=Arrays.asList(tokens);
    HashSet<String> lookup=new HashSet<String>();  

       while(!lookup.containsAll(tok)&&max<N)
         {
         }

Using toString() I found that even though the HashSet contains a collection still containsAll method returns false.I am using remove method in the code but it is never being called.The complete code is here on pastebin which would be more readable.

The purpose is to take a input string and another k strings and search for minimum sub sequence in input string which contains all k strings

1)Start with index 0 in input string and add first k strings to the HashSet because this is the minimum sequence that can contain k different tokens

2)After that take the range min=0 to max=k and keep adding string at position max and incrementing max until the set contains all tokens

3)When all tokens are found remove string a position min(initially 0) and increment min.If after removal all tokens are not present in HashSet. Set found to false so that step 2 is repeated in next iteration for interval starting with this value of min

4)If max-min is smaller than previous difference the new minimum subsequence is min-max

For input as

 This is a test. This is a programming test. This is a programming test in any language.
 k=4
 this
 a
test
programming

The output is

 tokens are  [this, a, test, programming]
Increasing Max [is, test, a, this]  found =false
Increasing Max [is, test, a, this]  found =false
Increasing Max [is, test, a, this]  found =false
Increasing Max [is, programming, test, a, this]  found =false
Increasing Max [is, programming, test, a, this]  found =false
 Increasing Max [is, programming, test, a, this]  found =false
 Increasing Max [is, programming, test, a, this]  found =false
Increasing Max [is, programming, test, a, this]  found =false
Increasing Max [is, programming, test, a, this]  found =false
Increasing Max [is, programming, test, a, this]  found =false
Increasing Max [is, programming, test, a, in, this]  found =false
Increasing Max [is, programming, test, any, a, in, this]  found =false
Increasing Max [is, programming, test, any, a, language, in, this]  found =false

No subsegment found

The output shows that remove was never called still containsAll() kept returning false even when it contained all strings present in the collection.

Why does it keep returning false even though remove is never called?

Perhaps a HashSet wouldn't work even if the above two issues were resolved.For an input like

 This is a this test.
 2
 this
 test

Since the this at index 3 won't be added to the set.The produced min interval would be [0-4] instead of [3-4] So is there a collection which may contain duplicate values and has a containsAll method or Will i have to use a HashMap with indexes of strings as keys?

arne.b
  • 4,212
  • 2
  • 25
  • 44
rakesh99
  • 1,234
  • 19
  • 34
  • 3
    Clean-up the indentation and line breakings of the code. It's hard to read now. – msell Nov 12 '12 at 05:33
  • I have tried to properly indent it now but since i can't get the full view of the code at once i couldn't see if all lines were properly indented.Rechecking now – rakesh99 Nov 12 '12 at 05:56
  • Too many questions. Please ask one question per post, see [Ask] and [FAQ]. – Jim Garrison Nov 12 '12 at 06:28
  • Yes there are 4 questions but 1,2 and 4 are very related and all of them are about the consistency of contains method of a hash based structure.Should i move them to different posts? – rakesh99 Nov 12 '12 at 07:11
  • @JimGarrison I posted questions other than specific to this code here http://stackoverflow.com/questions/13339973/consistency-of-contains-method-on-a-hashset-and-hashmap.I will remove them from this post – rakesh99 Nov 12 '12 at 07:54

1 Answers1

1

Looking at the code on pasteBin, it seems in the loop that contains System.out.println(" Increasing Max "+lookup.toString()+" found ="+found);, you never call lookup.containsAll(tok), i.e. the fact that it outputs false in every loop iteration is due to found being false before.

Some other points:

  • Do not call System.exit in your code. (Well, unless you have caught a really serious exception or error you cannot recover from, which will not happen with the problem you are currently trying to solve).
  • Use a for loop if you know the number of iterations in advance. If you do not, and especially if loop termination depends on more than one variable, a while loop will be much more readable.
  • For a shorter solution to your problem (that may even fit on one screen), have a look at the sublist method.
arne.b
  • 4,212
  • 2
  • 25
  • 44
  • I managed to get it working using HashMap http://pastebin.com/chtJMeW3 But still i used System.exit() to stop processing when subsegment doesn't exist in input string instead of using another flag variable.I searched on it and found that access to it can lead to DOS attacks in JEE applications but why is it a bad practice in general? – rakesh99 Nov 12 '12 at 15:01