0

I've currently built a Visual Studio C# project that saves API data into a database through Entity Framework. Every time I run the project the table in the database is wiped then data is re-added again. This is to stop duplication. However, I'm wondering if there is an alternate route where I don't have to wipe the data but I can just add new data that isn't already there?

Here is the code in my project. Starting from the method in my main class that uses RestSharp to obtain the API data, deserializes to a JSON format, then saves to my DB.

public static void getAllRequestData()
{
    var client = new RestClient("[My API URL]");
    var request = new RestRequest();
    var response = client.Execute(request);

    if (response.StatusCode == System.Net.HttpStatusCode.OK)
    {
        string rawResponse = response.Content;
        AllRequests.Rootobject result = JsonConvert.DeserializeObject<AllRequests.Rootobject>(rawResponse);

        using (var db = new TransitionContext())
        {
            db.RequestDetails.RemoveRange(db.RequestDetails); //Wipes Data
            db.RequestDetails.AddRange(result.Operation.Details); //Adds Data
            db.SaveChanges(); 
        } //Utilising EF to save data to the DB
    }
} //Method that calls and stores API data

Here is the Entity Framework class below, as you can see it just supports one table (dataset).

public class TransitionContext : DbContext
{
    private const string connectionString = @"[My Server]";

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseSqlServer(connectionString);
    }

    public DbSet<AllRequests.Detail> RequestDetails { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<AllRequests.Detail>().HasKey(r => r.Id);
    }
}

And here is the DTO class, this holds the the data temrporarily and is used to be structured into a class fit for the API data.

public class AllRequests
    {
        public class Rootobject
        {
            public Operation Operation { get; set; }
        }

        public class Operation
        {
            public Result Result { get; set; }
            public Detail[] Details { get; set; }
        }

        public class Result
        {
            public string Message { get; set; }
            public string Status { get; set; }
        }

        public class Detail
        {
            [Key]
            public int Id { get; set; }
            public string Requester { get; set; }
            public string WorkOrderId { get; set; }
            public string AccountName { get; set; }
            public string CreatedBy { get; set; }
            public string Subject { get; set; }
            public string Technician { get; set; }
            public string IsOverDue { get; set; }
            public long DueByTime { get; set; }
            public string Priority { get; set; }
            public long CreatedTime { get; set; }
            public string IgnoreRequest { get; set; }
            public string Status { get; set; }
        }
    }

Here is the table that is produced (irrelevant data blocked out). enter image description here

In order to get Entity Framework working, I had to create an ID. This ID does not hold any API data, it simply starts from 1 to however many rows there are. However, WorkOrderId is a unique ID for each row. How would I be able to make this project scan for WorkOrderId's in the table and only add new data where a WorkOrderId wasn't already there?

Because ideally I want to be running this project every 5-10 minutes to keep the table constantly updated, however, at the moment I feel that wiping the table isn't the ideal way to go and is a long process. I would prefer it if I could implement this procedure instead. Any help or pointers would be greatly appreciated.

KyleAT
  • 71
  • 1
  • 8
  • And what should be done with records that were inserted in DB but don't exist in a new records batch? – Tomas Chabada Jan 25 '21 at 09:47
  • That will never happen in this scenario, all the records that are being pulled will remain there no matter what. – KyleAT Jan 25 '21 at 11:04

1 Answers1

1

Short answer - use unique constraints and configure EF.

Long answer - you should get more familiar with databases and Entity Framework. Database should ensure you that there are no duplicates.

Primary key (your Id here) is much more important than you think.


A table can have only one primary key, which may consist of single or multiple fields. When multiple fields are used as a primary key, they are called a composite key.

If a table has a primary key defined on any field(s), then you cannot have two records having the same value of that field(s).

(https://www.tutorialspoint.com/sql/sql-primary-key.htm)

What's more in some databases (SQL Server for example) you can have primary key built on several columns.

The other think is to properly configure Entity Framework. Basically you do this in OnModelCreating.

You can say here that a column should be unique, for example:

protected override void OnModelCreating(ModelBuilder builder)
{
    builder.Entity<User>()
        .HasIndex(u => u.Email)
        .IsUnique();
}

You can read more about indices here: https://en.wikipedia.org/wiki/Database_index and here: How does database indexing work?

It seems that you should make field WorkOrderId your primary key and delete Id field. I think you should also read more about foreign keys: https://www.w3schools.com/sql/sql_foreignkey.asp

Adam Jachocki
  • 1,897
  • 1
  • 12
  • 28
  • Thank you for the response, I do know how PK's and FK's work although I am quite new to Entity Framework. I've took your advice and I've got rid of the Id key and instead I've made WorkOrderId my PK. I've re-structured it all in Entity Framework and populated the table. The only issue now is that when I run it again it says "cannot insert duplicate primary key", so I can't update the table anymore with this API? – KyleAT Jan 25 '21 at 11:31
  • Because ideally, I want to be able to run this program every 10 minutes, so it adds new data, I found out how to do that by wiping the data first then re adding it again over and over. But I found that wiping the data is not the best way to go about it. I'll read about database indexing as part of your suggestion, again thank you for the assistance Adam I appreciate it. – KyleAT Jan 25 '21 at 11:33
  • It looks as if you were trying to add the same records all the time. Do not do that. – Adam Jachocki Jan 25 '21 at 15:41
  • I'm aware it's doing that but I want to find a solution for how to only add records that aren't already there if that makes sense? Basically I just want to update my table each time I start the application. – KyleAT Jan 25 '21 at 16:02
  • I don't understand your problem. Where do you take records to put into db from? – Adam Jachocki Jan 25 '21 at 19:05
  • I take them from an API that I have there in RestSharp, it is then stored in that DTO file that I showed, exactly in that format. The Request Details, I then scaffold that as a db. That whole process works, it's just that I can't update it, I can only push data to it once and that's it, if I want to add new data too it I have to wipe the data first then store it again. – KyleAT Jan 25 '21 at 21:20
  • Look at my void getAllRequestData() method, and you'll see the client where it gets API data (I haven't put the actual API link for obvious reasons), and I'm storing the API data into an SQL table in my database. – KyleAT Jan 25 '21 at 21:21
  • That's your problem. You should avoid getting ALL the data from any source. You should only get the one you need. So you should take for example last record from db, then get data fro rest after that record (id, datetime, whatever) and then just add new data. – Adam Jachocki Jan 26 '21 at 09:16
  • I thought it would be a lot easier to scan the Db first and check which data isn't already there and update. Because I'm not sure how to GET selective API data that isn't already on the Db then store it into a Db. – KyleAT Jan 26 '21 at 09:45
  • It may be easier, but worse. You have to get ALL data from api and all data from db and compare it. Straight way to hell. – Adam Jachocki Jan 26 '21 at 10:00
  • Ah okay, it sounds complicated. I'm just puzzled because I can't find any information how to rectify the problem that I have online, so I must be doing something wrong, I thought someone would have a similar problem. You see my database has to be updated twice a day, because new data comes through the API. – KyleAT Jan 26 '21 at 11:54
  • The easiest way would be if your API could give you only the new data. Then you just have to do simple insert. – Adam Jachocki Jan 26 '21 at 18:35
  • Okay thank you Adam, I'll keep that in mind going forward – KyleAT Jan 27 '21 at 08:00