1

I have to manipulate the strings stored in a large HashSet, and I would like to know whether it is possible to modify the existing HashSet in place, without creating a new HashSet object.

Below is my current logic; I want to avoid creating the second HashSet (the set2 object).

HashSet<String> set1 = new HashSet<String>();
set1.add("AB.12");
set1.add("CD.34");
set1.add("EF.56");
set1.add("GH.78");

HashSet<String> set2 = new HashSet<String>();
for(String data : set1) {
    set2.add(data.substring(0,2));
}

P.S.: The reason for using a Set is to avoid duplicate entries (the data comes from different sources).

Vasu
  • 2
    What manipulation do you intend to do? – Stelium Sep 25 '15 at 14:33
  • "AB.12" and "AB" are two different objects. Do you want to keep the original object? – YoungHobbit Sep 25 '15 at 14:35
  • I want to keep "AB" in my set. – Vasu Sep 25 '15 at 14:39
  • One approach is to store not a plain `String`, but a _data structure_ like `{AB.12, int fromIndex, int toIndex}` in your original `HashSet`. – Victor Sorokin Sep 25 '15 at 14:39
  • Is there a specific reason to avoid the creation of the second hashset? If it is just a memory issue then the answer by @KishoreReddy will address that concern. Otherwise with non-list collections there is no way to both insert and remove elements in the same process. – markbernard Sep 25 '15 at 15:14

4 Answers

3

To maintain the integrity of the set, elements in a HashSet should not be altered while they are stored in it. You can think of them as immutable elements.

And since the iteration order of a HashSet is not defined or constant over time, it isn't possible to add and remove elements in the same set while iterating over it without potentially running into the newly added elements.
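To illustrate the obstacle, here is a minimal sketch (the class and method names are made up for this example) that tries to add the shortened strings to the same set while iterating over it. A HashSet iterator is fail-fast, so the structural change is normally detected on the next call to `next()` and a `ConcurrentModificationException` is thrown. The Javadoc describes this check as best-effort, so treat the sketch as a demonstration rather than a guarantee.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;

final class InPlaceAttempt {
    // Hypothetical helper: returns true if modifying the set during iteration was rejected.
    static boolean addWhileIterating() {
        Set<String> set1 = new HashSet<>();
        set1.add("AB.12");
        set1.add("CD.34");
        set1.add("EF.56");
        set1.add("GH.78");
        try {
            for (String data : set1) {
                // Structural change in the middle of the iteration.
                set1.add(data.substring(0, 2));
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + addWhileIterating());
    }
}
```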

However, if you're trying to conserve memory, you can simultaneously remove from set1 and add to set2.

Iterator<String> iter = set1.iterator();

while (iter.hasNext()) {
    set2.add(iter.next().substring(0, 2));
    iter.remove(); // drop the original from set1 once its prefix is stored
}
elizzmc
  • Will the memory be re-claimed immediately after removing the elements? or will he have to wait for GC to kick in? – James Wierzba Sep 25 '15 at 14:59
  • 2
    *"Elements in a HashSet are immutable."* No, not necessarily. – Tom Sep 25 '15 at 15:00
  • You'll have to wait for GC to kick in. It'll be reclaimed ASAP, but not immediately. – elizzmc Sep 25 '15 at 15:00
  • 1
    @Tom *primitive types* in a HashSet are immutable. My apologies. – elizzmc Sep 25 '15 at 15:01
  • That is also wrong, because you can't place primitive types into a HashSet :P. I guess you mean that `String` is immutable and/or that `HashSet` items should be immutable to maintain a correct hash value. This would be true :). – Tom Sep 25 '15 at 15:03
  • 1
    @Tom, you are so much better with words than I am! I'll edit my answer to state that. – elizzmc Sep 25 '15 at 15:04
1

A Java Set does not provide indexed access, so you cannot step through it position by position and replace each element.

With a List, you could do:

for (int i = 0; i < list.size(); i++) {
    list.set(i, list.get(i).substring(0, 2));
}
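As a self-contained variant of the List approach, the following sketch uses `List.replaceAll` (available since Java 8) to overwrite each element in place. The class and method names are illustrative only, and it assumes every string has at least two characters. Keep in mind that a List retains duplicates, so any de-duplication would have to happen separately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListPrefixes {
    // Hypothetical helper: replaces every element with its first two characters, in place.
    static void shortenInPlace(List<String> list) {
        list.replaceAll(s -> s.substring(0, 2));
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("AB.12", "CD.34", "EF.56", "GH.78"));
        shortenInPlace(list);
        System.out.println(list); // [AB, CD, EF, GH]
    }
}
```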

Note that building a new Set object is not a problem, unless you have a massive data-set.


Another option is to use a Stack:

final Stack<String> toProcess = new Stack<>();

// Copy the original strings out of the set.
for (final String i : set1) {
    toProcess.push(i);
}

set1.clear();

// Refill the same set with the shortened strings.
while (!toProcess.isEmpty()) {
    set1.add(toProcess.pop().substring(0, 2));
}
sdgfsdh
  • I know this already. As I mentioned in my question, I have a huge amount of data. – Vasu Sep 25 '15 at 14:42
  • Added another option; but I still think that `List` will be easier. Sometimes you have to relax your design for efficiency reasons. – sdgfsdh Sep 25 '15 at 14:51
1

If you are really concerned about wasting memory, you can do it like this:

HashSet<String> set2 = new HashSet<String>();

Iterator<String> ite = set1.iterator();
while (ite.hasNext()) {
    String data = ite.next();
    ite.remove(); // note: remove the original from set1 as you go
    set2.add(data.substring(0, 2));
}
Kishore Reddy
1

There isn't a perfect solution to your problem. The issue is that a Set, by definition, does not allow indexing. Coupled with the fact that String is immutable, this means modifying the data in place is not feasible.

I think you should consider changing two things:

  1. Use a data structure that allows you to replace elements (such as a List).

  2. Use a mutable data type, possibly creating your own class as the element type (as Victor suggested in the comments).

Note that option 2 will break the internals of the HashSet: the bucket index is computed by hashing the element, so if the element changes while it is in the set, lookups may no longer find it.
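The problem with mutating an element that is already in a HashSet can be reproduced with a small sketch. `MutableKey` and `MutableKeyDemo` are hypothetical classes written only for this illustration; the wrapper simply delegates `equals` and `hashCode` to its String field. After the field is reassigned, the element still sits in the bucket chosen for its old hash code, so `contains` no longer finds it.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mutable wrapper, used only to illustrate the problem.
final class MutableKey {
    String value;

    MutableKey(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && Objects.equals(value, ((MutableKey) o).value);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(value);
    }
}

final class MutableKeyDemo {
    // Returns true if the set still finds the element after it was mutated in place.
    static boolean foundAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey("AB.12");
        set.add(key);
        // The hash code changes, but the element stays in its old bucket.
        key.value = key.value.substring(0, 2);
        return set.contains(key);
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
    }
}
```

The element is still physically in the set (its size is unchanged), but it can no longer be looked up or removed by value.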

James Wierzba
  • I would advise against #2. What would the `equals` and `hashCode` implementations be for the mutable class? – sdgfsdh Sep 25 '15 at 14:53
  • He could create a wrapper class for String and use the java implementations. Modifying the object would be as simple as re-assigning the String field. Or, he could roll his own string class of char[] (which is better since it appears he is concerned about memory usage) – James Wierzba Sep 25 '15 at 14:57
  • See http://stackoverflow.com/questions/13177124/what-happens-to-the-lookup-in-a-hashmap-or-hashset-when-the-objects-hashcode-cha – sdgfsdh Sep 25 '15 at 14:59
  • 1
    You are right. I forgot about indexing based on the hash. – James Wierzba Sep 25 '15 at 15:00
  • 1
    *"This will break the internals of the HashSet"* You are right: http://stackoverflow.com/questions/4718009/mutable-objects-and-hashcode – Tom Sep 25 '15 at 15:09