0

I have EF DataContext and would like to get data from SQL without getting full table from DB:

        List<(int width, int height)> dimensions = new List<(int width, int height)>();

        var res = context.DimensionData.Where(d => 
            dimensions.Any(dim => 
                dim.width == d.Width && dim.height == d.Height))
                .ToList();

Using SQL Profiler I can see it will get full table and do slow in memory search.

How can I write Query so processing is done in DB?

I assume answering problem with SQL code and going back to Linq could give an answer how to do it. It seems a temp table needs to be generated for inner join and I am not sure how or if Linq is capable of this.

UPDATE 1: For now I've manage to find a package that does in memory collections join in DB: https://www.nuget.org/packages/EntityFrameworkCore.MemoryJoin

Unfortunately project is still on .Net Core 2.2 and cant upgrade to .Net Core 3.x because have breaking changes from base packages I need to use.

UPDATE 2:

0.5.7 version of MemoryJoin package worked for me as @Ivan Stoev suggested

SELECT [d].[Id], [d].[Width], [d].[Height]
FROM [DimensionData] AS [d]
INNER JOIN (
    SELECT [x].[int1] AS [Width], [x].[int2] AS [Height]
    FROM (
        SELECT * FROM (VALUES (1, 55, 66), (2, 88, 99)) AS __gen_query_data__ (id, int1, int2)
    ) AS [x]
) AS [t] ON ([d].[Width] = [t].[Width]) AND ([d].[Height] = [t].[Height])
user007
  • 1,122
  • 1
  • 10
  • 30
  • You could send that dimensions as an XML to server side, expand there to a table and do your query with an inner join. You could either do it directly on context.ExecuteSql or create a function or stored procedure that you can call. – Cetin Basoz Mar 02 '22 at 23:20
  • I removed the linq-to-sql tag. That's not EF. Please add a tag for the EF version you use. – Gert Arnold Mar 03 '22 at 09:48
  • 2
    How about trying [0.5.7](https://www.nuget.org/packages/EntityFrameworkCore.MemoryJoin/0.5.7) version of MemoryJoin package (the last supporting EFC 2)? – Ivan Stoev Mar 03 '22 at 10:32
  • I think EF in .Net framework or core both can use the raw sql approach I presented. – Cetin Basoz Mar 03 '22 at 13:33
  • @CetinBasoz The problem with your solution is that it is not general - works only with SqlServer. How about other databases - Oracle, MySql, ProgreSQL etc. – Ivan Stoev Mar 03 '22 at 15:23
  • @IvanStoev, question was posted as specific to SQL server. Writing SP in the target database and calling it instead is a general solution then. You wouldn't want them all to be translated the same way (ie: likely in postgreSQL this would use geo types and could send an array). – Cetin Basoz Mar 03 '22 at 15:41
  • @CetinBasoz Well, the LINQ constructs which are supported by EF Core are general, and the specific translation is provided by the provider components. And do not require writing stored procedures. Also the question was mistakenly posted as `linq-to-sql` probably meaning `linq-to-entities`. I see the word SQL used, but that's not the same as SqlServer. So I'm not saying your solution is bad/not working, just it isn't the general solution of the problem. – Ivan Stoev Mar 03 '22 at 15:56
  • @IvanStoev, I didn't say the word SQL was used, it was clearly tagged as LinqToSQL. Anyway. – Cetin Basoz Mar 03 '22 at 16:06
  • AFAIK the `MemoryJoin` package is the only general attempt to solve the problem in question. Other than that, @Gert did pretty good analysis of the problem and possible solutions back in the days [here](https://stackoverflow.com/a/26201371/5202563). Even though it is for "classic" EF6, it applies to all versions of the "modern" EF Core includting the latest at this time EF Core 6.0. – Ivan Stoev Mar 03 '22 at 16:09
  • @IvanStoev, I don't know who Gert is, and I don't see myself in a position to what he may have done is the best. When I have this problem myself, instead of taking all those overheads of EF I would write my SP and call it. – Cetin Basoz Mar 03 '22 at 16:21

2 Answers2

0

I don't think LINQ can pass a compound condition like that to SQL, but you may be able to narrow the rows returned by SQL with several simpler conditions.

Try:

List<(int width, int height)> dimensions = new List<(int width, int height)>();
// (Insert values inserted into the above list)
List<int> widths = dimensions.Select(d => d.width).ToList();
List<int> heights = dimensions.Select(d => d.height).ToList();

var res = context.DimensionData
    .Where(d =>
         widths.Contains(d.Width)
         && heights.Contains(d.Height)
    )
    // (Might need .ToList() or something here to force evaluation)
    .Where(d => dimensions.Any(dim =>
         dim.width == d.Width
         && dim.height == d.Height
    ))
    .ToList();

The first Where should translate into SQL as something like:

SELECT *
FROM DimensionData d
WHERE d.Width IN (@p1, @p2, @p3, ...)
AND d.Height IN (@p4, @p5, @p6, ...)

(only much harder to read in the generated SQL)

The second .Where() will operate on the initial retrieved results to filter out the crossover cases like (width1,height2),(width3,height1).

T N
  • 4,322
  • 1
  • 5
  • 18
  • I don't think this is the same as the query that OP is doing. ie: dimensions might have {width:10, height:20} and {width:5, height:50}. OP query is looking for (10, 20) and (5,50) cases, where yours would target (10,20), (10,50), (5,20), (5,50). – Cetin Basoz Mar 02 '22 at 23:24
  • @CetinBasoz - That's why the second .Where() is still in place. My aim is to significantly reduce the amount of data initially returned from SQL server, and then let the C# side refine the results. – T N Mar 02 '22 at 23:28
  • There might be a way of dynamically building an expression tree of OR/AND/EQUALS nodes to handle an arbitrary number of dimensions items, but that would be much more involved. Someone else can take that on, as I'm a novice with that. – T N Mar 02 '22 at 23:36
  • There wouldn't be such a translation. I was in a hurry when I wrote my comment to you and to OP, now I am adding an answer if you would check why it wouldn't translate as such. – Cetin Basoz Mar 03 '22 at 09:48
  • **CORRECTION:** I just realized that my original answer mistakenly used an `.Any(...)` condition when it should have used `.Contains(...)`. I have updated the answer with the change. The latter should be recognized by LINQ to SQL and translated to the intended `IN (@p1, @p2, ...)` SQL equivalents. – T N Mar 03 '22 at 18:00
0

That block of dimensions.Any(...) is local and wouldn't be translated to a valid SQL neither in Linq To SQL nor EF. For your case, your actual code should be first doing something like .AsEnumerable(), .ToList() to make the data local so dimensions.Any() could be used on it. First making it local causes the whole data to be driven first, without any filtering.

Now to my proposed solution:

  • Create an XML representation of dimensions list.
  • Create an SQL command, which would get this XML as a parameter, expand it to a table, do the query on the original table. All this would be done on the server.

Next modeling the solution, for testing purposes let's say we have created DimensionData table in SQL server (Linq To SQL - so this is SQL server, right?) like this:

 CREATE TABLE DimensionData
  (
    id     INT IDENTITY NOT NULL PRIMARY KEY
  , Width  INT NOT NULL
  , Height INT NOT NULL
  );

WITH
  base AS (SELECT x FROM(VALUES(1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) AS t(x) )
INSERT INTO dbo.DimensionData(Width, Height)
SELECT t1.x, t2.x FROM base t1 CROSS JOIN base t2;

CREATE INDEX ix_DimWH ON dbo.DimensionData(Width, Height);

A simple test table with 100 rows just for sampling.

Now to the code with some dimensions sample added (assuming you run this in LinqPad, thus added .Dump() for simplicity):

void Main()
{
    List<(int width, int height)> dimensions = new List<(int width, int height)>();
    dimensions.Add((1, 1));
    dimensions.Add((2, 1));
    dimensions.Add((3, 1));

    string xml = new XDocument(
        new XElement("Data",
            from wh in dimensions
            select
                new XElement("row", 
                new XAttribute("Width", wh.width), 
                new XAttribute("Height", wh.height))
        )
      ).ToString();
      
    string sqlCommand = @"DECLARE @hDoc int;
EXECUTE sp_xml_preparedocument @hDoc OUTPUT, {0};
with source as
(
  SELECT width, height FROM OPENXML(@hDoc, '/Data/row', 1) with ( Width int, Height int )
)
Select * 
from DimensionData d
where exists (
   select * 
   from source s
   where d.Width = s.Width and d.Height = s.Height
); 
EXECUTE sp_xml_removedocument @hDoc;";

string cs = @"server=.\SQL2017;Database=TestDb;Trusted_Connection=yes";

// Linq To SQL
    var context = new DataContext(cs);
    var res = context.ExecuteQuery<DimensionData>(sqlCommand, xml);

// EF
//  var context = new MyContext(cs);
//  var res = context.Database.SqlQuery<DimensionData>(sqlCommand, xml);
    
    res.ToList().Dump();
}
public class DimensionData
{
    [Key]
    public int Id { get; set; }
    public int Width { get; set; }
    public int Height { get; set; }
}

public class MyContext : DbContext
{
    public MyContext(string connectionString)
       : base(connectionString)
    { }
    public DbSet<DimensionData> DimensionData { get; set; }
}

If you check generated SQL in profiler, both Linq To SQL and EF send the same SQL to server as this one:

exec sp_executesql N'DECLARE @hDoc int;
EXECUTE sp_xml_preparedocument @hDoc OUTPUT, @p0;
with source as
(
  SELECT width, height FROM OPENXML(@hDoc, ''/Data/row'', 1) with ( Width int, Height int )
)
Select * 
from DimensionData d
where exists (
   select * 
   from source s
   where d.Width = s.Width and d.Height = s.Height
); 
EXECUTE sp_xml_removedocument @hDoc;',N'@p0 nvarchar(111)',@p0=N'<Data>
  <row Width="1" Height="1" />
  <row Width="2" Height="1" />
  <row Width="3" Height="1" />
</Data>'
go

As per the XML parsing on SQL server side using sp_xml_*, those SP are available in SQL server since very old versions and work wonderfully well. You might replace that with the newer xml functions if you like (doesn't worth it IMHO).

Cetin Basoz
  • 22,495
  • 3
  • 31
  • 39