8

Let's say I have following expressions on a collection:

var people = new List<Person>
{
     new Person {FullName = "Some Dude", Age = 45},
     new Person {FullName = "Another Dude", Age = 28},
     new Person {FullName = "Some Other Dude", Age = 36}
 };

var filtered = people.Where(person => person.Age > 28 && person.FullName.StartsWith("So"));
var narrowlyFiltered = people.Where(person => person.Age > 36 && person.FullName.StartsWith("Some"));

Is there a way to compare these two expressions and deduce that second expression is subset of first one on runtime? Without enumerating or anything else. I just have expressions and I am trying to find out if these expressions intersect or contains one another.

Emre Senturk
  • 345
  • 5
  • 13
  • You mean the result, or in all cases? – Willem Van Onsem Feb 17 '15 at 22:38
  • Yes, as long as you don't let Expression to turn into delegate. – Alexei Levenkov Feb 17 '15 at 22:39
  • No reasonably complex way to do that, no. However, it shouldn't be hard to detect if the results of one of them is a subset of the other. – Jonathan Wood Feb 17 '15 at 22:40
  • 2
    Yes, there is a way. But it requires some doing. Are you sure you need to do this? Isn't there a way to do it before it gets placed in the expression? – Travis J Feb 17 '15 at 22:40
  • @CommuSoft and Jonathan consider that I haven't enumerated it yet. I just have definitions for two spaces and I am trying to find out whether or not one space contains the other by definition. – Emre Senturk Feb 17 '15 at 22:43
  • @EmreSenturk: not in general. That problem is undecidable. – Willem Van Onsem Feb 17 '15 at 22:44
  • Why can't you just AND them together? – Travis J Feb 17 '15 at 22:46
  • @TravisJ what do you mean? – Emre Senturk Feb 17 '15 at 22:47
  • `var allFiltered = people.Where(person => person.Age > 28 && person.FullName.StartsWith("So")).Where(person => person.Age > 36 && person.FullName.StartsWith("Some"));` except done in a more general fashion of `var allFiltered = people.Where(filter).Where(secondFilter);` or even compact both expression trees for the filters `var allFiltered = people.Where(filter.And(secondFilter));` – Travis J Feb 17 '15 at 22:50
  • @TravisJ no that is not the point. The problem is to find out if the expression to build "filtered" contains the expression to build narrowlyFiltered. It is something like "calling .Where(item == animal) contains .Where(item == cat) over catA, catB, catC, dogA, dogB, plantA, plantB " – Emre Senturk Feb 17 '15 at 22:54

5 Answers5

5

You will have to decompose each Expression into all the possible inherited types (MethodCallExpression, ConditionalExpression, etc.), then walk each decomposition in parallel and check each possible parameters... It will be a little long to code... You can inspire yourself from ExpressionEqualityComparer

rducom
  • 7,072
  • 1
  • 25
  • 39
  • Thank you! What I was trying explain is quite like this one. – Emre Senturk Feb 17 '15 at 22:48
  • @Sharped: the algorithm you propose determines if two expressions are "*literally*" the same. If you however encapsulate one of the two expressions in a method, it will fail. – Willem Van Onsem Feb 17 '15 at 22:58
  • It's evident there's recursion in such a way to go.... For example, a MethodCallExpression have inside of it a list of Arguments, so it's evident that if you want to compare the whole both expressions, you will also have to compare sub-expressions too... – rducom Feb 17 '15 at 23:04
  • 1
    @Sharped: well from the moment loops occur in the call inside the expressions, problems start to occur. The problem here stated is a nice example of [*Rice's theorem*](http://en.wikipedia.org/wiki/Rice%27s_theorem). – Willem Van Onsem Feb 17 '15 at 23:07
  • I think you don't distinct two things : the expression itself, and the result of the expression. Saying expression A is a subset of expression B, is just saying that. No more. When you wan to test if an expression is the subset of another, it does not mean you want to infer the result from the expressions... It just means both expression share a part of code. That's all. – rducom Feb 17 '15 at 23:10
  • For example, it can be an excellent way to examine c# projects, in order to find if there are things than can be refactored over hundreds of Linq queries ... And it does not means you want to infer the result of the queries... – rducom Feb 17 '15 at 23:12
  • So in your opinion, the result should be `true` for the expressions `x => x.Age < 20` and `x => x.Age > 50` because they both access `Age`? – Willem Van Onsem Feb 17 '15 at 23:15
  • @CommuSoft I think it shows whether these two expressions are on the same domain, hence comparable. – Emre Senturk Feb 17 '15 at 23:26
  • No, the result of an IsSubsetOfExpression() method will be the common parts of two Expressions, and of course not true or false.... It depends on what you want to do with what you analyses. Imagine, for a refactoring purpose, you don't care about the operators < or >, you don't care about the fixed value, nor the parameters names... It depends of what you want to do with it... – rducom Feb 17 '15 at 23:26
  • @Sharped: but in this case, there are no common parts. Because both parts of the `&&` are different in arguments (string and int literals). – Willem Van Onsem Feb 17 '15 at 23:31
  • 1
    `person.Age` => MemberExpression in both cases, targeting the same property on the same type. `person.Age > 28` and `person.Age > 36` are both BinaryExpression with both the same Left operand... etc... There are lot of thing in common. – rducom Feb 17 '15 at 23:38
  • @Sharped: sure, but that's *an intersection* of elements, not *a subset*. – Willem Van Onsem Feb 17 '15 at 23:39
  • It's not an intersection since it's a B-Tree... Are you trolling ? – rducom Feb 17 '15 at 23:40
  • Stop fighting you two or I will delete my question. – Emre Senturk Feb 17 '15 at 23:42
  • @EmreSenturk: I'm not fighting, and from the moment the question is answered, it is hard to delete it anyway ;). – Willem Van Onsem Feb 17 '15 at 23:42
  • @Sharped: then (a) how do *you* define the subset of two B-trees? And (b) how does your answer differ from the [intersect](http://judge.u-aizu.ac.jp/onlinejudge/description.jsp?id=1001) defined by the link? I assume you use the syntax tree, so the tree looks like `(&& (> x.Age 28) (starsWith (person.FullName) ("So")))` – Willem Van Onsem Feb 17 '15 at 23:44
  • 2
    what I mean by subsets, is the small parts of common nodes between both expressions tree, in a "nearly equal" comparison (you can consider two node are nearly equals if they are of the same type, but not have the exactly same parameters). It have nothing to do with strict B-Tree algebra. With this notion of "nearly equality", yes we can speak of intersection – rducom Feb 17 '15 at 23:52
  • @CommuSoft: I think what Sharped meant is finding and grouping the expressions where apples are compared with apples and oranges are compared with oranges, then comparing the expressions within the groups. – Emre Senturk Feb 17 '15 at 23:54
  • @EmreSenturk: yeah, now it is clear what it meants. And don't get me wrong, this is indeed valuable. However the answer seemed to suggest one can reason whether the first expression will for instance always return `true` if the second does. With this discussion, it is clear what the contribution is. Thanks to everyone. – Willem Van Onsem Feb 17 '15 at 23:57
  • @CommuSoft: Thank you for the all the insight you gave. – Emre Senturk Feb 18 '15 at 00:01
  • @CommuSoft, once again, the result is not true or false ! It's an IEnumerable ! – rducom Feb 18 '15 at 00:01
  • @EmreSenturk : perfectly resumed :) To deduce that one expression is subset of first one on runtime, I think a strict approach would mean nothing. The example you gave is a good one : both have a lot of things in common, but both have enough distinctive parts to return an empty result if we compare their Expression structure, and that's not what you want... – rducom Feb 18 '15 at 00:01
  • @Sharped: I meant true on the lambda expressions. Your algorithm indeed returns a list of sub expression parts... – Willem Van Onsem Feb 18 '15 at 00:02
  • And @CommuSoft, thanks for your links about predictability, they are very interesting ! – rducom Feb 18 '15 at 00:02
  • Thank you guys, It was a great discussion. – Emre Senturk Feb 18 '15 at 00:09
2

In case you can enumerate your collections, you can first put the elements in a HashSet<T> and then run the HashSet<T>.IsSubSet on it:

HashSet<T> hs = new HashSet<T>(filtered);
HashSet<T> hs2 = new HashSet<T>(narrowlyFiltered);
hs.IsSubSetOf(hs2); //<- booleans saying true or false

Otherwise, this problems is an undecidable problem in general. Although there are heuristics that can work for many cases. You could for instance try to use code contracts that aim to deduce this at compile time.

Proof:

The formal variant is: given two Turing machines (methods, delegates, pointers), does every string contained in the first language is contained in the second?
Undecidable
proof: given it was decidable, EQTM would be decidable: simply first validate whether the first Turing machine is a subset of the second and vice versa. If both are subsets, we know they accept the same set of strings.

In other words, if you could do that, you could also deduce if two functions produce the same result, which cannot be implemented.

Willem Van Onsem
  • 443,496
  • 30
  • 428
  • 555
  • Don't say negative things like "Undecidable" :) Anyway, thank you for your time to explain why this cannot be done. – Emre Senturk Feb 17 '15 at 22:57
  • 2
    @EmreSenturk: well in every textbook about computability, the term "undecidable" is used. And unfortunately, it is neither "semi-decidable" :( – Willem Van Onsem Feb 17 '15 at 22:59
2

It all depends how you weight what is equal, what is more important when comparing expressions etc. For example if you have completely different filter than you won't possibly know query difference before actually executing it.

To keep full control over your comparison create a filter class with some properties which can be used for filtering and then build expressions and compare using this class instead of using visitors. You can prepare common function for comparing ints, int pairs (for ranges) etc.

I did not check the code below but it should be a good start.

public class PersonFilter:  IComparable<PersonFilter>
{
    public int? MinAge { get; set; }

    public int? MaxAge { get; set; }

    public string NamePrefix { get; set; }

    public Expression<Predicate<Person>> Filter
    {
        return people => people.Where(person => (!MinAge.HasValue || person.Age > MinAge.Value) && 
            (!MaxAge.HasValue || person.Age < MaxAge.Value) && 
            (string.IsNullOrEmpty(NamePrefix) || person.FullName.StartsWith(NamePrefix))
    }

    // -1 if this filter is filtering more than the other
    public int CompareTo(PersonFilter other)
    {
        var balance = 0; // equal
        if(MinAge.HasValue != other.MinAge.HasValue)
        {
            balance += MinAge.HasValue ? -1 : 1;
        }
        else if(MinAge.HasValue)
        {
            balance += MinAge.Value.CompareTo(other.MinAge.Value) ?
        }
        if(string.IsNullOrEmpty(NamePrefix) != string.IsNullOrEmpty(other.NamePrefix))
        {
            balance += string.IsNullOrEmpty(NamePrefix) ? -1 : 1;
        }
        else if(!string.IsNullOrEmpty(NamePrefix))
        {
            if(NamePrefix.StartsWith(other.NamePrefix))
            {
                balance -= 1;
            }
            else if(other.NamePrefix.StartsWith(NamePrefix))
            {
                balance += 1;
            }
            else
            {
                // if NamePrefix is the same or completely different let's assume both filters are equal
            }
        }
        return balance;
    }

    public bool IsSubsetOf(PersonFilter other)
    {
        if(MinAge.HasValue != other.MinAge.HasValue)
        {
            if(other.MinAge.HasValue)
            {
                return false;
            }
        }
        else if(MinAge.HasValue && MinAge.Value < other.MinAge.Value)
        {
            return false;
        }

        if(string.IsNullOrEmpty(NamePrefix) != string.IsNullOrEmpty(other.NamePrefix))
        {
            if(!string.IsNullOrEmpty(other.NamePrefix))
            {
                return false;
            }
        }
        else if(!string.IsNullOrEmpty(NamePrefix))
        {
            if(!NamePrefix.StartsWith(other.NamePrefix))
            {
            return false;
            }
        }

        return true;
    }
}
too
  • 3,009
  • 4
  • 37
  • 51
  • Thank you! I was decided to go with creating a dummy Person instance with filter values of one expression and putting it into a collection and eval. the second expression on this collection, but your proposal seems more defined. – Emre Senturk Feb 17 '15 at 23:18
  • Why do you need to compare expressions in the first place? I find it hard to find business value in that kind of comparison unless you want to check if another query is required or not assuming you already have a result of a more broad query. In that case in the above code you need to change balance calculation so that any difference requiring a new query will return 1 immediately. – too Feb 17 '15 at 23:24
  • think of this as a way to find which db to query among perfectly sharded dbs :) – Emre Senturk Feb 17 '15 at 23:28
  • In that case you will definitely come up with adequate custom comparison. Good luck :) – too Feb 17 '15 at 23:30
  • @too: the comparison with a balance is a bit problematic. You simply can't define a full order on these instances that makes sense. You should define a partial order where you either can say it is equal, greater than or less than or you can't say anything about it. – Willem Van Onsem Feb 17 '15 at 23:37
  • @CommuSoft: I completely agree yet this approach can be used for just checking if one filter is defining a subset of another - only then it can be guaranteed to give a deterministic result and adding a dedicated IsSubsetOf method would better fit this use case. – too Feb 17 '15 at 23:48
  • I added IsSubsetOf example implementation for MinAge and NamePrefix properties. – too Feb 17 '15 at 23:59
1

Have a look at the Specification design pattern

Once it's implemented then your specification in this case becomes

public class PersonNamedOlderThanSpecification : CompositeSpecification<Person>
{
    private string name;
    private int age;

    public PersonNamedOlderThanSpecification(string name, int age)
    {
        this.name = name;
        this.age = age;
    }


    public override bool IsSatisfiedBy(Person entity)
    {
        return (entity.Name.Contains(this.name)) && (entity.Age > age);
    }
}

Then you can use it like this:

var personSpecs = new PersonNamedOlderThanSpecification("So", 28);
var personSpecs2 = new PersonNamedOlderThanSpecification("Some", 36);

var filtered = people.FindAll(x => personSpecs.IsSatisfiedBy(x));
var adjusted = people.FindAll(x => personSpecs2.IsSatisfiedBy(x));
CheGueVerra
  • 7,849
  • 4
  • 37
  • 49
  • I think this doesn't really provide an answer to the question. Although you could indeed implement `IsSubSetOf` logic in the sepcification.... – Willem Van Onsem Feb 17 '15 at 23:04
  • I use this pattern for all the filtering that I do on my lists, as values are dynamic, the result list is all the entities the satisfy the specification – CheGueVerra Feb 17 '15 at 23:07
  • But you are right I misunderstood the question, focused on the filtered part – CheGueVerra Feb 17 '15 at 23:08
  • Didn't think of doing that when I created it, I will have a give it a go, Thanks ... sry the post didn't help you problem ... Read the question twice, I should hmmmm – CheGueVerra Feb 17 '15 at 23:16
  • @CheGueVerra No worries :) I am the one who is guilty by not stating the problem clearly. – Emre Senturk Feb 17 '15 at 23:22
0

You can try this:

var people = new List<Person>
{
     new Person {FullName = "Some Dude", Age = 45},
     new Person {FullName = "Another Dude", Age = 28},
     new Person {FullName = "Some Other Dude", Age = 36}
};

var filtered = people.Where(person => person.Age > 28 && person.FullName.StartsWith("So"));
var narrowlyFiltered = people.Where(person => person.Age > 36 && person.FullName.StartsWith("Some"));

var intersection = filtered.Intersect(narrowlyFiltered);
if (intersection != null)
{
    if (intersection.Count() > 0)
    {
        //narrowlyFiltered is subset of filtered
    }
}
  • 1
    Yeah, but it does enumerate it. – Emre Senturk Feb 17 '15 at 23:14
  • Ok, if you only want know if narrowlyFiltered is subset of filtered (true or false), in this [post](http://stackoverflow.com/questions/11092930/check-if-listt-contains-any-of-another-list), explain the solution; in your case, you do this: bool subset = filtered.Any(v => narrowlyFiltered.Any()); – Carlos Iván Montes Agüero Feb 17 '15 at 23:28