I'm writing a program that does quite a few things. One part of it is a section called "text tools", which loads a text file (via one button) and then has additional buttons that perform other functions on it: removing whitespace and empty lines, removing duplicate lines, and removing lines that match a certain pattern, e.g. 123 or abc.
I'm able to import the file and print the list using a foreach loop, and I believe I'm on the right lines, but I still need to remove the duplicates. I've decided to use HashSet thanks to this thread, which says it's the simplest and fastest method (my file will contain millions of lines).
The problem is that I can't figure out what I'm doing wrong. I've got the event handler for the button click, which creates a list of strings in memory, loops through each line in the file (adding it to the list), and then creates a second collection by building a HashSet from that list. (Sorry if that's convoluted; there's presumably a reason it doesn't work.)
I've looked at every Stack Overflow question similar to this one but can't find a solution. I've also read up on HashSet in general, to no avail.
Here's my code so far:
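    // Requires: using System.Collections.Generic; using System.IO; using System.Text;
    private void btnClearDuplicates_Copy_Click(object sender, RoutedEventArgs e)
    {
        // Read every line of the file into memory.
        List<string> list = new List<string>();
        foreach (string line in File.ReadLines(FilePath, Encoding.UTF8))
        {
            list.Add(line);
        }

        // Build a set from the list; duplicate lines are dropped here.
        // Nothing reads this set afterwards, so the file itself is unchanged.
        var DuplicatesRemoved = new HashSet<string>(list);
    }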
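
For reference, here is a minimal standalone sketch of the behaviour I'm after, outside the click handler. It assumes the deduplicated lines should end up in a separate output file; the `TextTools` class, both method names, and the `inputPath`/`outputPath` parameters are hypothetical and not part of my program. It relies on `HashSet<T>.Add` returning `false` for a value that is already in the set, so each line is kept only the first time it appears.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class TextTools
{
    // Hypothetical helper: yields each distinct line once, in first-seen order.
    public static IEnumerable<string> RemoveDuplicateLines(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (string line in lines)
        {
            // Add returns false when the line is already in the set.
            if (seen.Add(line))
            {
                yield return line;
            }
        }
    }

    // Hypothetical helper: streams inputPath and writes the distinct lines
    // to outputPath. The two paths must differ, because the input file is
    // still being read lazily while the output is written.
    public static void DeduplicateFile(string inputPath, string outputPath)
    {
        IEnumerable<string> distinct =
            RemoveDuplicateLines(File.ReadLines(inputPath, Encoding.UTF8));
        File.WriteAllLines(outputPath, distinct, Encoding.UTF8);
    }
}
```

The handler would then only need something like `TextTools.DeduplicateFile(FilePath, someOutputPath)`, where `someOutputPath` is again just a placeholder. Streaming through `File.ReadLines` avoids holding a separate `List<string>` of every line, though the set still holds one copy of each distinct line.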