-2

I have to sets of strings

set 1:

    "hello"
    "world"
    "stackoverflow"

set 2:
    "world"
    "hello"
    "stackoverflow"

Before I tried to compare the content, I know exactly that these two sets contain only unique values. So I am not thinking about java Set for unique test.

So in Java, what should be the cheapest way to compare these two sets? By cheapest, I mean memory like.

I know I can do ArrayList.contains() forLoop, is there a better way?

And I was told Java HashSet consumes 5 times more resources than ArrayList when containing same length of contents. Is that true?

UPDATE

I don't have sample for you, since this is just an idea came to my mind.

By two sets of strings, I meant literally set, this set can also be stored in Java ArrayList.

what I want

is to compare these two sets of string to know if they are containing the same contents. of course I know before the actions that they both containing unique contents.

UPDATE

Sorry, this is not a practical problem I ran across with. This is just an idea I am wondering about.

cinqS
  • 1,175
  • 4
  • 12
  • 32
  • what are you trying to accomplish, are you trying to combined 2 sets to make a unique set? can you provide sample java of what you have so far – Jpsh Apr 01 '17 at 03:24
  • 2
    I'm a bit confused. You said you have two *sets* but you are not thinking about using *set*? Maybe I'm misreading that. – domsson Apr 01 '17 at 03:27
  • What do you mean by "compare"? Are you asking for `set1.equals(set2)`? – 4castle Apr 01 '17 at 03:28
  • 1
    Unless you are going to be storing sets containing hundreds of thousands of strings, the amount of memory is irrelevant. Use the simplest code possible, which will be easier to read and debug. – Jim Garrison Apr 01 '17 at 03:29
  • Are you actually experiencing memory issues? Do not pre-optimize. – 4castle Apr 01 '17 at 03:29
  • @4castle, Hi, what do you mean by *pre-optimize* please? – cinqS Apr 01 '17 at 03:34
  • 1
    I'm referring to [Premature Optimization](http://wiki.c2.com/?PrematureOptimization). (Sorry, got the terminology slightly off) – 4castle Apr 01 '17 at 03:35
  • 1
    Don't optimize before measuring. (Note that it does not mean you should pick worst possible option to start with, but rather start with reasonable approach, measure and optimize what is not meeting your goals) – Alexei Levenkov Apr 01 '17 at 03:35
  • @AlexeiLevenkov, by *Don't optimize before measuring*, do you mean: I should write the most common codes, and only optimize these code when I have problem with them and I have to measure them before optimizing? – cinqS Apr 01 '17 at 03:43
  • Go program something, quit asking hypotheticals. The best way to find a solution to a problem is to have a real problem to begin with. – Nick Ziebert Apr 01 '17 at 04:02
  • cinqS yes, write reasonable code, set goals, measure and optimize as needed. As @NickZiebert pointing out real working code is much more valuable than hypothetical one. – Alexei Levenkov Apr 01 '17 at 04:46

2 Answers2

1

I have absolutely no idea about the performance of this, but just to add a possible solution...

String[] a = {"hello","world","stackoverflow"};
String[] b = {"world","hello","stackoverflow"};
Arrays.sort(a);
Arrays.sort(b);
System.out.println(Arrays.equals(a,b) ? "same" : "different");

Result:

same
domsson
  • 4,553
  • 2
  • 22
  • 40
  • Two objects can have the same hashCode but still be unequal. Using hashCodes to test equality is a very bad idea. – VGR Apr 01 '17 at 03:50
  • You are right, @VGR. Also, why use `hashCode` when there is `equals`? Editing right away, thanks. – domsson Apr 01 '17 at 03:51
  • Sorting is usually done with qsort which requires stack and in turn needs memory to complete (O(n) on average - http://stackoverflow.com/questions/12573330/why-does-quicksort-use-ologn-extra-space). But indeed this approach can be used with in-place sort algorithms that would not require extra space. – Alexei Levenkov Apr 01 '17 at 05:02
0

Since your only concern is memory footprint than basic nested loop on arrays (or whatever collections these strings are stored in at the moment) would work perfectly fine and only use fixed amount of extra space (for loop counters).

You can shorten code by using contains too - that function should require fixed amount of memory, but you'd waste some on call stack frame.

Note that this will trade run-time performance of comparison for memory usage. The penalty for time performance is significant, and likely will be easily noticeable for sets of size more 10-20 items. More balanced approach of comparing the sets is covered for example in What is the fastest way to compare two sets in Java? - memory cost of appropriate data structures is generally less of a concern than run-time cost.

Community
  • 1
  • 1
Alexei Levenkov
  • 98,904
  • 14
  • 127
  • 179
  • hmmm...fair enough, I like *memory cost of appropriate data structures is generally less of the concern than run-time cost*. – cinqS Apr 01 '17 at 03:41