I have a huge string (about 1 GB) of space-delimited tokens, which I will split into an array. The string contains a lot of duplicates. I need to sort the tokens and remove the duplicates. I've come up with two procedures and can't decide between them.
Procedure 1
I assume that sorting is the costly step, so I want to remove the duplicates with a HashSet first and then sort only the unique tokens that remain.
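To make Procedure 1 concrete, here is a minimal sketch of what I have in mind. It assumes the tokens are separated by plain spaces and that everything fits in memory; the class and method names (DedupThenSort, uniqueSorted) are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class DedupThenSort {
    // Hypothetical helper: drop duplicates with a HashSet, then sort the survivors.
    static String[] uniqueSorted(String input) {
        // Splitting on a single space is an assumption about the input format.
        Set<String> unique = new HashSet<>(Arrays.asList(input.split(" ")));
        // Leading or consecutive spaces produce empty tokens; discard them.
        unique.remove("");
        String[] result = unique.toArray(new String[0]);
        // Only the unique tokens are sorted, not the full input.
        Arrays.sort(result);
        return result;
    }
}
```

With n tokens of which u are unique, this does roughly n hash insertions plus a sort of u elements, so it should pay off when u is much smaller than n. The price is the extra memory for the HashSet on top of the array.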
Procedure 2
I sort the array first and then remove the duplicates the usual way: walk through the sorted array, compare each element with the previous one, and drop it if the two are equal.
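A minimal sketch of Procedure 2, under the same assumptions (plain-space delimiter, data fits in memory). The names SortThenDedup and sortedUnique are again made up for illustration.

```java
import java.util.Arrays;

class SortThenDedup {
    // Hypothetical helper: sort everything, then compact out adjacent duplicates.
    static String[] sortedUnique(String input) {
        // Splitting on a single space is an assumption about the input format.
        String[] tokens = input.split(" ");
        // The whole array is sorted, duplicates included.
        Arrays.sort(tokens);
        int n = 0; // number of unique tokens kept so far
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].isEmpty()) {
                continue; // skip empty tokens caused by extra spaces
            }
            // After sorting, equal tokens are adjacent, so comparing with
            // the last kept token is enough to detect a duplicate.
            if (n == 0 || !tokens[i].equals(tokens[n - 1])) {
                tokens[n++] = tokens[i];
            }
        }
        return Arrays.copyOf(tokens, n);
    }
}
```

This version needs no HashSet, but it sorts all n tokens, so the sort costs on the order of n log n comparisons no matter how many duplicates there are.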
From my point of view, the first procedure seems better, but I don't know whether I will run into any problems with it. Which one is the better choice?