I am reading 5000 rows of data from a stream, top to bottom, in the form shown below, and storing them in a new CSV file.
ProductCode | Name | Type | Price
ABC | Shoe | Trainers | 3.99
ABC | Shoe | Trainers | 4.99
ABC | Shoe | Trainers | 5.99
ABC | Shoe | Heels | 3.99
ABC | Shoe | Heels | 4.99
ABC | Shoe | Heels | 5.99
...
Instead of duplicate entries, I want the CSV to have one row per combination of ProductCode, Name and Type, with the Price summed:
ProductCode | Name | Type | Price
ABC | Shoe | Trainers | 14.97
ABC | Shoe | Heels | 14.97
I store each row as a Product:
public class Product
{
    public string ProductCode { get; set; }
    public string Name { get; set; }
    public string Type { get; set; }
    public string Price { get; set; }
}
After reading the data from the stream I end up with an IEnumerable<Product>.
My code is then:
string fileName = Path.Combine(directory, string.Format("{0}.csv", name));
var results = Parse(stream).ToList(); // Parse returns IEnumerable<Product>
if (results.Any())
{
    using (var streamWriter = File.CreateText(fileName))
    {
        // write the header line
        streamWriter.WriteLine("{0},{1}", header, name);
        // write one output line per product
        results.ForEach(p => streamWriter.WriteLine(_parser.ConvertToOutputFormat(p)));
        // no explicit Flush/Close: disposing the writer at the end of the using block does both
    }
    Optional<string> newFileName = Optional.Of(SharpZipWrapper.ZipFile(fileName, RepositoryDirectory));
    // clean up the uncompressed file
    File.Delete(fileName);
    return newFileName;
}
I don't want to go through the 5000 rows a second time to remove the duplicates; instead, I would like to check whether an entry already exists before adding it to the CSV file.
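To make the idea concrete, here is a rough one-pass sketch of the kind of check I have in mind, not my actual code. It keeps a running total per (ProductCode, Name, Type) key in a dictionary, so each incoming row is either added as a new entry or folded into an existing one. ProductAggregator and SumDuplicates are hypothetical names, and the sketch assumes C# 7 value tuples are available, that the fields are already trimmed, and that Price always parses as an invariant-culture decimal.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Same shape as the Product class above, repeated so the sketch is self-contained.
public class Product
{
    public string ProductCode { get; set; }
    public string Name { get; set; }
    public string Type { get; set; }
    public string Price { get; set; }
}

// Hypothetical helper, not part of the existing code base.
public static class ProductAggregator
{
    // Merges rows that share ProductCode, Name and Type, summing Price,
    // in a single pass over the input.
    public static List<Product> SumDuplicates(IEnumerable<Product> rows)
    {
        var totals = new Dictionary<(string Code, string Name, string Type), decimal>();
        // Remembers the order in which keys were first seen.
        var order = new List<(string Code, string Name, string Type)>();

        foreach (var p in rows)
        {
            var key = (p.ProductCode, p.Name, p.Type);
            // Assumption: Price is a plain decimal such as "3.99".
            var price = decimal.Parse(p.Price, CultureInfo.InvariantCulture);

            if (totals.TryGetValue(key, out var sum))
            {
                totals[key] = sum + price; // entry already exists: add to its total
            }
            else
            {
                totals[key] = price;       // first time this combination is seen
                order.Add(key);
            }
        }

        var result = new List<Product>(order.Count);
        foreach (var key in order)
        {
            result.Add(new Product
            {
                ProductCode = key.Code,
                Name = key.Name,
                Type = key.Type,
                Price = totals[key].ToString(CultureInfo.InvariantCulture)
            });
        }
        return result;
    }
}
```

With this, results would become ProductAggregator.SumDuplicates(Parse(stream)) before the write loop. Note that the totals are only final once the whole stream has been read, so the rows cannot be written to the CSV as they arrive.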
What is the most efficient way to do this?