I have a class that stores values and detects whether that set of values is distinct.
using System.Collections.Generic;
using System.Linq;

public class TextRecords
{
    public TextRecords()
    {
        Count = 0;
        TextInstanceDictionary = new Dictionary<string, int>();
    }

    public int Count { get; set; }

    // Maps each value to the number of times it has been added.
    public Dictionary<string, int> TextInstanceDictionary { get; set; }

    public void AddOrUpdateTextInstanceDictionary(string theText)
    {
        if (!TextInstanceDictionary.ContainsKey(theText))
        {
            TextInstanceDictionary.Add(theText, 1);
        }
        else
        {
            TextInstanceDictionary[theText] += 1;
        }
    }

    // True when no value has been added more than once.
    public bool AllValuesAreDistinct
    {
        get
        {
            return !TextInstanceDictionary.Any(kv => kv.Value > 1);
        }
    }
}
This works fine for small sets of values but does not scale in terms of memory usage or performance.
Is there a way to detect whether a set of values is unique without storing them all in memory, as I do in the approach above?
I am looking for a reasonably small memory footprint while retaining a good level of speed.
I am aware of Bloom filters and have read this answer. Are there any other methods to solve this very specific problem?
(Note: I have also reviewed this answer, but it addresses a different problem. I am streaming in the values one by one, so I just need to know whether I have seen each value before. The other answer covers the case where you are given a fully populated collection and asked whether its values are distinct.)
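To make the streaming requirement concrete, here is a minimal sketch of the per-value check I have in mind. The class name `StreamingDistinctChecker` is hypothetical and not part of my code above. It relies on `HashSet<string>.Add` returning `false` when the value is already present. Note that it still keeps every distinct value in memory, so it only removes the per-value counts and the full scan in `AllValuesAreDistinct`; it does not solve the footprint problem I am asking about.

```csharp
using System.Collections.Generic;

// Hypothetical helper (an illustration, not the class from the question):
// answers "have I seen this value before?" for values arriving one by one.
public class StreamingDistinctChecker
{
    // Still holds every distinct value seen so far.
    private readonly HashSet<string> _seen = new HashSet<string>();

    // Stays true until the first duplicate arrives.
    public bool AllValuesAreDistinct { get; private set; } = true;

    // Returns true if the value is new, false if it was already seen.
    public bool Add(string theText)
    {
        // HashSet<T>.Add returns false when the element is already in the set.
        bool isNew = _seen.Add(theText);
        if (!isNew)
        {
            AllValuesAreDistinct = false;
        }
        return isNew;
    }
}
```

With this shape, the caller can stop reading the stream as soon as `Add` returns `false`, since the answer can no longer change.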