3

I am using Entity Framework version 4. I need to compare a large (~1 million record) SQL Server table to a longish (~2000) array of complex objects returned from a web service. Five different properties need to be compared to determine whether an instance of the complex object is already in the database.

I created a function that returns an expression for use in .Where and .Any methods. It looks like this (where A is the complex object, and tblA is the EF class):

function Expression<tblA, bool> GetSearchPredicate(A a)
{
    return ta => ta.Field1.Equals(a.Field1) 
        && ta.Field2.Equals(a.Field2)
        && ta.Field3.Equals(a.Field3)
        && ta.Field4.Equals(a.Field4)
        && ta.Field5.Equals(a.Field5);
}

This works. And I can compare all 2000 instances of A by doing this:

IEnumerable<A> objects = [web service call]; 
var result = objects.Select(a => !db.tblA.Any(GetSearchPredicate(a)));

That works, too. But it's slow. So I looked into building a utility method that could build an expression that could be transmitted down to the database directly through EF.

I used the code in this question as a basis for building that utility method. The example in that question shows comparing a single property to a series of constants, while my version would have to compare multiple properties to multiple constants. My best effort is below:

    public static IQueryable<TEntity> WhereIn<TEntity>
       (
        this ObjectQuery<TEntity> query,
        IEnumerable<Expression<Func<TEntity, bool>>> predicates
       )
    {
        if (predicates == null) throw new ArgumentNullException("predicates");

        IEnumerable<ParameterExpression> p = predicates.Select(pred => pred.Parameters.Single()).ToArray();

        IEnumerable<Expression> equals = predicates.Select(value =>
            (Expression)value.Body);

        Expression bigEqual = equals.Aggregate((accumulate, equal) =>
            Expression.Or(accumulate, equal));

        var result1 = Expression.Lambda<Func<TEntity, bool>>(bigEqual, p.First());
        var result = query.Where(result1);
        return result;
    }

This would be invoked like this:

IEnumerable<A> objects = [web service call]; 
var result = db.tblA.WhereIn(objects.Select(a => GetSearchPredicate(a)));

What I get is a message saying that "ta" (the placeholder for the TEntity object) is not bound. I thought this was because I had multiple expressions (the variable predicates) being combined, and maybe this message was being thrown because I was only passing the parameter from the first of the predicates IEnumerable. But this happens even if predicates is one expression long.

I am reasonably sure, based on the method I linked to, that I could build an expression comparing each of the five properties to a constant (the values of A.Field1 through A.Field5), rather than passing in the parameter predicates that already has them assembled into a series of expressions. But I would rather not, since that would require my method to know that it's working with types A and tblA, and that's the opposite of generic and general-purpose. (It'd also be complex and messy.)

I hope the examples I've shown explain what I want to do. Can it be done in a generic way?

Community
  • 1
  • 1
Ann L.
  • 13,760
  • 5
  • 35
  • 66
  • At what point do you get the message? – Greg Bair Dec 19 '12 at 14:20
  • @GregBair When I attempt to execute the result of the function. Sorry, I left that part out! – Ann L. Dec 19 '12 at 14:23
  • Not trying to be unhelpful, but this would be pretty trivial to do (and very fast) with a stored procedure which you could import into your EF model. Do you have a particular reason not to do it that way? – Steve Wilkes Dec 19 '12 at 14:43
  • @SteveWilkes Thanks for your response. Were you visualizing 2000 calls of a stored procedure that receives 5 parameters, or a stored procedure that took an XML parameter containing all 2000 instances? The former didn't strike me as much faster, and the latter ... well, I've been considering it. But I do want to know whether the kind of combining of expressions I want to do is possible anyway, just for my own knowledge. – Ann L. Dec 19 '12 at 14:47
  • I'd imagine the latter approach, yeah. Fair enough with regards to knowledge, I'd just want to be as lazy as possible and follow the path of least resistance :) – Steve Wilkes Dec 19 '12 at 14:53

1 Answers1

3

You will need to replace the parameter in the predicate bodies with a single parameter. Something like this should work:

public static Expression<Func<T, bool>> BuildOr<T>(
   IEnumerable<Expression<Func<T, bool>>> predicates)
{
    Expression body = null;
    ParameterExpression p = null;
    Expression<Func<T, bool>> first = null;

    foreach (Expression<Func<T, bool>> item in predicates)
    {
        if (first == null)
        {
            first = item;
        }
        else
        {
            if (body == null)
            {
                body = first.Body;
                p = first.Parameters[0];
            }

            var toReplace = item.Parameters[0];
            var itemBody = ReplacementVisitor.Transform(item, toReplace, p);
            body = Expression.OrElse(body, itemBody);
        }
    }

    if (first == null) 
    {
       throw new ArgumentException("Sequence contains no elements.", "predicates");
    }

    return (body == null) ? first : Expression.Lambda<Func<T, bool>>(body, p);
}

private sealed class ReplacementVisitor : ExpressionVisitor
{
    private IList<ParameterExpression> SourceParameters { get; set; }
    private Expression ToFind { get; set; }
    private Expression ReplaceWith { get; set; }

    public static Expression Transform(
       LambdaExpression source, 
       Expression toFind, 
       Expression replaceWith)
    {
        var visitor = new ReplacementVisitor
        {
            SourceParameters = source.Parameters,
            ToFind = toFind,
            ReplaceWith = replaceWith,
        };

        return visitor.Visit(source.Body);
    }

    private Expression ReplaceNode(Expression node)
    {
        return (node == ToFind) ? ReplaceWith : node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        return ReplaceNode(node);
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        var result = ReplaceNode(node);
        if (result == node) result = base.VisitBinary(node);
        return result;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        if (SourceParameters.Contains(node)) return ReplaceNode(node);
        return SourceParameters.FirstOrDefault(p => p.Name == node.Name) ?? node;
    }
}

Your WhereIn method then becomes:

public static IQueryable<TEntity> WhereIn<TEntity>(
   this ObjectQuery<TEntity> query, 
   IEnumerable<Expression<Func<TEntity, bool>>> predicates)
{
    if (predicates == null) throw new ArgumentNullException("predicates");

    var predicate = BuildOr(predicates);
    return query.Where(predicate);
}
Richard Deeming
  • 29,830
  • 10
  • 79
  • 151
  • WOW. Awesome! I will try this immediately! – Ann L. Dec 19 '12 at 15:02
  • It turns out that doing it the way I was trying results in many, many parameters to the RPC call, probably (I would guess) from all the instances of `A` whose property values I was passing. But this solved the problem I was trying to solve. Thank you very much! – Ann L. Dec 19 '12 at 16:02
  • It seems like we should be able to aggregate the expressions into a single expression using LINQ Aggregate. Any thoughts on that, Richard? – pomeroy Oct 06 '16 at 14:17
  • 2
    @pomeroy: Unfortunately not. You need the visitor to replace the parameter from the subsequent expressions with the parameter from the first expression. Otherwise, you'll get an `InvalidOperationException` when you try to use the expression. *(variable 'a' of type 'type' referenced from scope '', but it is not defined)* – Richard Deeming Oct 06 '16 at 15:04