
EntityFramework v6.0

IBM.Data.DB2.EntityFramework v10.5

.NET Framework v4 / C# v4.0 (yes, working on upgrading)

Example:

List<long> myIds = new List<long>{ 12345, 23456, 34567 };

IQueryable<myModel> myCollection = (from a in context.TBL_ONE
                                    join b in context.TBL_TWO on a.ENTY equals b.ENTY
                                    select new myModel()
                                    {
                                         LINE_ID = a.LINE_ID
                                    }).AsQueryable();

var results = myCollection.Where(x => myIds.Contains(x.LINE_ID)).ToList(); // here is the issue...

The "myCollection" will be very large, and so will the "myIds" collection.

So I would prefer it to compile to

    SELECT ... WHERE LINE_ID IN (12345, 23456, 34567);

however, it compiles to

    SELECT ... WHERE LINE_ID = 12345 OR LINE_ID = 23456 OR LINE_ID = 34567;

I have tried HashSet, List, and Array, both large and small; they all produce the same compiled style. Is this due to an outdated LINQ or .NET version?

Please let me know if I can provide more information or better formatting; I am still a beginner. Thank you.
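EDIT: For anyone hitting the same wall, one workaround that keeps LINQ (a sketch only; the helper name and batch size are mine, and I have not verified the exact translation against the DB2 provider) is to split `myIds` into fixed-size batches and run one query per batch, so no single statement exceeds the provider's parameter/IN-list limits:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BatchFilter
{
    // Splits a list of IDs into consecutive batches of at most batchSize
    // elements, so each Contains() translates to a bounded IN/OR list.
    // C# 4.0-compatible (no string interpolation, no Enumerable.Chunk).
    public static IEnumerable<List<long>> InBatches(List<long> ids, int batchSize)
    {
        for (int i = 0; i < ids.Count; i += batchSize)
            yield return ids.GetRange(i, Math.Min(batchSize, ids.Count - i));
    }
}

// Usage against the query above (myCollection/myModel as defined earlier):
//
//   var results = new List<myModel>();
//   foreach (var batch in BatchFilter.InBatches(myIds, 1000))
//       results.AddRange(myCollection.Where(x => batch.Contains(x.LINE_ID)).ToList());
```

Note this still issues one round trip per batch, so for 50k-100k IDs it only sidesteps the statement-size limit; it does not make the overall filtering cheap.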

Wayne W
  • Does DB2 support Table Valued Parameters or temp tables? – Charlieface Apr 20 '23 at 16:50
  • Add a Where like this: `from a in context.TBL_ONE.Where(x => IDs.Contains(x.LINE_ID))`. You can make a variable: `int[] IDs = {12345, 23456, 34567};` – jdweng Apr 20 '23 at 16:51
  • Have you tried feeding it a larger collection? Perhaps there are optimizations. Also, the `in` clause has its own limitations; for SQL Server the limit is around 2,100 parameters. – Neistow Apr 20 '23 at 16:52
  • @Charlieface I believe so; in stored procedures I can create "global temporary tables" to access during a session... – Wayne W Apr 20 '23 at 16:52
  • @jdweng Will try this and let you know – Wayne W Apr 20 '23 at 16:53
  • Sounds like you should upload the list into such a table and join it. That will almost certainly be the fastest. – Charlieface Apr 20 '23 at 16:54
  • @Neistow Yes; currently my goal is to use a "filter list" of ~50k-100k IDs, and I did try 50k as a quick test. Unfortunately it came up with a very ugly, massive WHERE/OR chain for all the IDs. EDIT: I just saw you say that the limit is 2,100... oops, that would prevent this strategy for me. – Wayne W Apr 20 '23 at 16:55
  • @Charlieface That would definitely be how I'd do it in stored procedures... I honestly didn't think that'd be possible through LINQ. Will try this too, thanks! – Wayne W Apr 20 '23 at 16:55
  • @WayneW If your goal is 50-100k, then a `where in` clause won't help you; your database will just return an error. – Neistow Apr 20 '23 at 16:58
  • @WayneW My suggestion is to leverage temp tables: insert the IDs into a temp table and then join with the table you want to query. `SqlBulkCopy` is a good way to populate your temp table. – Neistow Apr 20 '23 at 17:00
  • I'm thinking this might be the best answer I'm going to get; would you please submit an answer so I can accept it? @Neistow – Wayne W Apr 20 '23 at 17:12
  • `SqlBulkCopy` only works on SQL Server, not DB2. One solution I've seen is to have a permanent table that stores these IDs: you insert them along with a fixed `BatchID`, join based on that `BatchID`, and afterwards delete those rows. If you find a fast insert method it should be quick. – Charlieface Apr 20 '23 at 19:00
  • @Charlieface You are correct on the first part. I did find DB2BulkCopy, but I'm currently trying to figure out a way to map the DataTable to entities and perform the join on the server side, to avoid loading in all the entities (50k-100k records, as mentioned). I don't think your suggestion would work, because the table I'm filtering on can be over 3 million records and I'd like to do all filtering on the DB2 side. – Wayne W Apr 20 '23 at 19:27
  • Like I said, you can bulk-load into a permanent table and join on that server-side. – Charlieface Apr 20 '23 at 20:32

1 Answer


The `in` clause has limitations, so you won't be able to stuff 50k IDs into it; you'll just get an error.

You can solve your problem using temp tables in the following way:

  1. Create a temp table
  2. Bulk/batch insert your IDs into the temp table
  3. Join the temp table with your primary table
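On DB2, the three steps above could look like the following sketch (`SESSION.ID_FILTER` and its column are placeholder names; note that declared global temporary tables require a user temporary tablespace to exist, and the table lives only as long as the session/connection):

```sql
-- 1. Create a session-scoped temp table (DB2 declared global temporary table)
DECLARE GLOBAL TEMPORARY TABLE SESSION.ID_FILTER
    (LINE_ID BIGINT NOT NULL)
    ON COMMIT PRESERVE ROWS NOT LOGGED;

-- 2. Insert the IDs (in practice, batched or bulk-loaded from the client)
INSERT INTO SESSION.ID_FILTER (LINE_ID) VALUES (12345), (23456), (34567);

-- 3. Join the temp table against the primary table, filtering server-side
SELECT a.LINE_ID
FROM TBL_ONE a
JOIN SESSION.ID_FILTER f ON f.LINE_ID = a.LINE_ID;
```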

For inserting the data you can use SqlBulkCopy if you're on SQL Server, or find an equivalent bulk-copy solution for your DB engine.

Neistow
  • Accepting this as the answer since it eventually led to my resolution, but for future readers: I ended up using DB2BulkCopy, and resorted to raw SQL commands with a manually managed connection instead of LINQ-to-Entities/context so that I could hold onto the connection until I was ready to lose the temp table. The time savings were definitely worth leaving LINQ temporarily. – Wayne W Apr 25 '23 at 16:48
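For anyone wanting to reproduce the resolution described in the comment above, a rough sketch follows. It assumes IBM's DB2 .NET provider (`IBM.Data.DB2`) with `DB2Connection`, `DB2Command`, and `DB2BulkCopy`; verify the exact member names against your driver version, and note the connection string and table/column names are placeholders. It cannot be tested without a live DB2 server.

```csharp
using System.Collections.Generic;
using System.Data;
using IBM.Data.DB2;

class TempTableFilter
{
    static void Main()
    {
        var myIds = new List<long> { 12345, 23456, 34567 };

        // Hold one connection open for the whole operation: the session
        // temp table only exists for the lifetime of this connection.
        using (var conn = new DB2Connection("Server=...;Database=...;UID=...;PWD=..."))
        {
            conn.Open();

            // 1. Declare the session temp table
            new DB2Command(
                "DECLARE GLOBAL TEMPORARY TABLE SESSION.ID_FILTER " +
                "(LINE_ID BIGINT NOT NULL) ON COMMIT PRESERVE ROWS NOT LOGGED",
                conn).ExecuteNonQuery();

            // 2. Bulk-load the IDs from an in-memory DataTable
            var table = new DataTable();
            table.Columns.Add("LINE_ID", typeof(long));
            foreach (long id in myIds)
                table.Rows.Add(id);

            using (var bulk = new DB2BulkCopy(conn))
            {
                bulk.DestinationTableName = "SESSION.ID_FILTER";
                bulk.WriteToServer(table);
            }

            // 3. Filter server-side with a join, so only matching rows
            //    ever leave DB2
            var cmd = new DB2Command(
                "SELECT a.LINE_ID FROM TBL_ONE a " +
                "JOIN SESSION.ID_FILTER f ON f.LINE_ID = a.LINE_ID",
                conn);
            using (DB2DataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // map rows to your model here
                }
            }
        } // connection closes; the temp table disappears with the session
    }
}
```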