I've been playing with @Cinchoo's fantastic ETL system for C#. I need to compare two CSV files, where one CSV file is defined as a dynamically growing master table and the other is a feeder "detail" table.
The detail table may have differences in terms of NEW records, CHANGED records, or a record no longer (DELETED) existing in the master CSV file.
The output should be a 3rd table that replaces or updates the master table - so it's a growing CSV file.
Both tables have unique ID columns and a header row.
MASTER CSV
ID,name
1,Danny
2,Fred
3,Sam
DETAIL
ID,name
1,Danny
<-- record no longer exists
3,Pamela <-- name change
4,Fernando <-- new record
So far I've been referring to this fiddle, and the code below:
using System;
using ChoETL;
using System.Linq;
public class Program
{
public static void Main()
{
var input1 = ChoCSVReader.LoadText(csv1).WithFirstLineHeader().ToArray();
var input2 = ChoCSVReader.LoadText(csv2).WithFirstLineHeader().ToArray();
Console.WriteLine("NEW records\n");
using (var output = new ChoCSVWriter(Console.Out).WithFirstLineHeader())
{
output.Write(input2.OfType<ChoDynamicObject>().Except(input1.OfType<ChoDynamicObject>(),
new ChoDynamicObjectEqualityComparer(new string[] { "id" })));
}
Console.WriteLine("\n\nDELETED records\n");
using (var output = new ChoCSVWriter(Console.Out).WithFirstLineHeader())
{
output.Write(input1.OfType<ChoDynamicObject>().Except(input2.OfType<ChoDynamicObject>(),
new ChoDynamicObjectEqualityComparer(new string[] { "id" })));
}
Console.WriteLine("\n\nCHANGED records\n");
using (var output = new ChoCSVWriter(Console.Out).WithFirstLineHeader())
{
output.Write(input1.OfType<ChoDynamicObject>().Except(input2.OfType<ChoDynamicObject>(),
new ChoDynamicObjectEqualityComparer(new string[] { "id", "name" })));
}
}
static string csv1 = @"
ID,name
1,Danny
2,Fred
3,Sam";
static string csv2 = @"
ID,name
1,Danny
3,Pamela
4,Fernando";
}
OUTPUT
NEW records
ID,name
4,Fernando
DELETED records
ID,name
2,Fred
CHANGED records
ID,name
2,Fred
3,Sam
The CHANGED records is not working. As an added extra, I need a status so I want it to look like this:
CHANGED records
ID,name,status
1,Danny,NOCHANGE
2,Fred,DELETED
3,Pamela,CHANGED
4,Fernando,NEW
Thanks