Goal:
I would like to implement a function which has a sequence of inputs "X1,...,Xn" and outputs an ordered list "Xp,..,Xq" where all the elements are distinct but ordered.
Requirement:
- For every Xi in the sequence "X1,...,Xn", it is a 256 bits long string.
- The sequence of inputs "X1,...,Xn" may have the same elements, which means that there may exist two elements Xi and Xj to satisfy Xi=Xj.
- For the same elements in sequence of "X1,...,Xn", we only output one element in the ordered list.
- The speed of function should be as fast as possible. And it does not matter that how much storage volume is used in function.
- The size of sequence "X1,...,Xn" is n, and n is a number that no more than 10,000.
My idea:
I use an Array to store the sequence which is initially empty.
When inputting Xi, first I search the Hashtable to judge if Xi is already in the above array. If yes, just dropping it. And if not, add Xi to Hashtable and Array.
If having inputting all the element of the sequence "X1,...,Xn", I sort the array and output it.
Question:
- And with Merkle Patricia Tree (Ethereum) and Hashtable, which one should I choose?
- For Merkle Patricia Tree (Ethereum) and Hashtable, which one has faster search speed?
- Or is there a better data structure to satisfy this function?