0

I have the below code, and I am wondering if there is a more efficient and faster way of doing something like this for scaling purposes. This is a monitoring program, that checks for files and if it doesn't find the record already in the database, then it inserts it. But as we add on more and more directories to monitor I'm wondering about performance issues. Is there a faster way to check then using the an if not exists followed by an insert statement all in one sql command? Thank you for any suggestions.

SqlCommand cmd = new SqlCommand("if not exists (select File_Name from [table].[dbo].[tablename] " +
    "where Workflow_Step_Name=@iWorkflow_Step_Name and File_Name=@latestfile) INSERT INTO " +
    "[table].[dbo].[tablename] (Workflow_Name,Workflow_Step_Name,Creation_Date,File_Name,File_Path,File_Size,Insert_TimeStamp,Workflow_Id,Batch_Id) " +
    "Values(@iWorkflow_Name,@iWorkflow_Step_Name,@lastUpdated,@latestfile,@iFolder_Path,@fileSize,@timestamp,@iworkflow_Id,@batch_id)");

cmd.Connection = myConnection;
cmd.Parameters.AddWithValue("@iWorkflow_Name", i.Workflow_Name);
cmd.Parameters.AddWithValue("@iWorkflow_Step_Name", i.Workflow_Step_Name);
cmd.Parameters.AddWithValue("@lastUpdated", lastupdated);
cmd.Parameters.AddWithValue("@latestfile", latestfile);
cmd.Parameters.AddWithValue("@iFolder_Path", i.Folder_Path);
cmd.Parameters.AddWithValue("@fileSize", fileSize);
cmd.Parameters.AddWithValue("@timestamp", DateTime.UtcNow);
cmd.Parameters.AddWithValue("@iworkflow_Id", i.Workflow_Id);
cmd.Parameters.AddWithValue("@batch_id", batchId);

myConnection.Open();
cmd.ExecuteNonQuery();

myConnection.Close();
Uwe Keim
  • 39,551
  • 56
  • 175
  • 291
zsebacher
  • 11
  • 1

1 Answers1

1

This is a clear example of anti-pattern.

You shouldn't enforce uniqueness by searching for similar rows and then inserting, since it's always subject to multi-user/multi-thread issues. You cannot make it error proof and will always be unsafe. Another user may insert a row in the millisecond it takes between the check and the insert.

Enforce uniqueness by adding a database constraint, as in:

alter table [table].[dbo].[tablename] add constraint unique_row
  unique (col1, col2, col3, ... colN);

This way the database takes care of the check.

Your uniqueness query seems to be based on two columns: Workflow_Step_Name and File_Name. If this is the case, then the following constraint will do what you need:

alter table [table].[dbo].[tablename] add constraint unique_file_name
  unique (Workflow_Step_Name, File_Name);

Then you just insert without prechecking: the database will perform the check for you. If there's already another row with those exact same values, the INSERT will fail.

Now, if your application is a single user application and always will be, then I guess you could execute a SELECT to check there's no row with the exact same file, and then perform the INSERT. In any case, it's a lot safer to add the constraint at the database level, to protect the data in case your program has a bug.

The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • 1
    I was going to suggest MERGE INTO, I didn't realize they didn't need an UPDATE until I reread the question. – Marc Bernier Sep 06 '18 at 20:08
  • I'm searching for something specific (if this file doesn't exist already then insert it) not similar, so I'm a little confused there. No other user will be inserting anything, this is the only program that runs on this table. – zsebacher Sep 06 '18 at 20:18
  • Edited as requested. – The Impaler Sep 06 '18 at 20:29
  • I see where you're coming from now, but unfortunately this is not the entire case. We have some monitors that will have duplicates inserted (this is branched off by an if statement before this area of code). So only in instances where allowing duplicates is false do we check for this uniqueness. – zsebacher Sep 06 '18 at 20:34
  • Ah, I see. Then you need to perform the check programmatically. – The Impaler Sep 06 '18 at 20:35