I am using Entity Framework / Azure Functions in an attempt to keep a database up to date based on this GitHub repo of csvs.
The Process
- Timer function runs and checks a if a new commit is available in master
- If there's a new commit, upload each csv into Azure Blob Storage
- A Queue Trigger will be fired on each blob upload. Read the CSV using CsvHelper and upload it to the Database (see code below)
Objective
I want to improve step 4. Instead of completely dropping the table then recreating everything, I want to do some sort of a diff and then upsert (or delete) from there.
Code:
[FunctionName("CsvQueueProcessor")]
public async void Run([QueueTrigger("csvqueue", Connection = "AzureWebJobsStorage")]string myQueueItem, ILogger log)
{
var context = new ChadwickDbContext(Environment.GetEnvironmentVariable("DefaultConnectionString"));
var csvEntity = JsonConvert.DeserializeObject<CsvProcessorEntity>(myQueueItem);
if (csvEntity.FileName.ToLower() != "allstarfull.csv") return;
log.LogInformation($"About to get the file from blob storage: {csvEntity.FileName}");
var container = CloudBlobClient.GetContainerReference("csvs");
await container.CreateIfNotExistsAsync();
var blob = container.GetBlobReference(csvEntity.FileName);
var stream = await blob.OpenReadAsync();
var csv = new CsvReader(new StreamReader(stream));
while(csv.Read())
{
var record = csv.GetRecord<AllStarFull>();
context.AllStarFull.Add(record);
}
await context.SaveChangesAsync();
}
Everything works great! But, how would I go about diffing? Are there any entity framework features that allow some kind of comparison or 'no action' if a record is already present? I ran across this thread, but I don't think it's quite what I want.
public class AllStarFull
{
[Ignore]
public int Id { get; set; }
[Name("playerID")]
public string PlayerId {get;set;}
[Name("yearID")]
public int YearId {get;set;}
[Name("gameNum")]
public int? GameNumber {get;set;}
[Name("teamID")]
public string TeamId {get;set;}
[Name("lgID")]
public string LeagueId {get;set;}
[Name("GP")]
public int? PlayedInGame { get; set; }
[Name("startingPos")]
public int? StartingPosition { get; set; }
}