I have written a piece of lab processing software that takes incoming data, breaks it up, and sends it to the proper fields in a database. The DB table is really big and there are a lot of fields. Before I insert the data, I do a dupe check (we get repeat data from multiple sources) before inserting it.
The system works by parsing the incoming data, filling a linq-to-sql object. The dupe check works by performing a .where operation on a list of linq-to-sql objects to selects objects not yet in the table.
For instance ...
input=list (of TableA) 'linq to sql objects
output=input.where(function (x as TableA) not myDb.TableA.any(function(l as table) l.name=x.name, l.dob=x.dob..etc for 10 fields..).tolist
The syntax is a little arcane, but this was the only way I could find to perform an inner join on linq to sql objects and database records. Before I used this method, I went row-by-row performing a dupe check, which was way slower.
As I understand it, Linq is converting this linq statement to a sql statement that is running on the server.
My question is: is there any way to get this to run any faster? is there any reason to expect that writing out a sql statement to do the deduping and running a traditional query would go any faster? This statement is SLOW but it works, is unit tested and blocks the dupes. I am looking for something that is faster and equivalently clean (not hard, I know) or cleaner...