6

I have a list where each element contains two values (V1 and V2). What I need is the element with the highest V1 and highest V2 (prioritizing V1).

I have tried two approaches:

  1. OrderByDescending and ThenByDescending, then take the first element:

    list.OrderByDescending(e => e.V1).ThenByDescending(e => e.V2).First();
    
  2. Select elements with biggest V1, then select the first element with the biggest V2 from this enumerable:

    var maxV1 = l.Where(e => e.V1 == l.Max(e => e.V1));
    maxV1.First(e => e.V2 == maxV1.Max(e1 => e1.V2));
    

Both (in my use case) require a fair amount of time and I'm not satisfied with either of my solutions.

The list itself doesn't contain a lot of elements, not more than 100. But there are a lot of them.

Is there another, preferably more efficient, solution than what I've already tried? Or do I have to rethink the whole architecture?

Edit: I forgot to mention that there are more variables in each element which might be used to select the highest value. Which one is used depends on a parameter. So pre sorting using sorted collections doesn't net any benefits.

Kabbalah
  • 471
  • 1
  • 5
  • 16
  • `myList.OrderByDescending(a => a.MyProperty).OrderByDescending(b => b.MyProperty2).First()` should be sublte. – Amit Kumar Ghosh Jul 13 '15 at 08:24
  • Another possibility would be to implement an own comparer for your object which orders the objects by V1 first and V2 second and then just sort your list. – LInsoDeTeh Jul 13 '15 at 08:26
  • How about working with an already [SortedList](https://msdn.microsoft.com/library/ms132319.aspx), or inherit from `List` and have a `Max` property which gets updated on every `Add`? -- would add penalty while filling, but access would be faster. – Corak Jul 13 '15 at 08:26

3 Answers3

3

You can use GroupBy, then order this V1-group by V2:

var highestItemByV1V2 = list.GroupBy(x => x.V1)
    .OrderByDescending(g => g.Key)
    .Select(g => g.OrderByDescending(x => x.V2).First())
    .First();

You should also store the max value instead of using it as expression in the query, otherwise it will be evaulated always. So this is more efficient:

var highestV1 = list.Max(x => x.V1);
var maxObj = list.Where(x => x.V1 == highestV1).OrderByDescending(x => x.V2).First();

However, your first approach should perform well, it's simple and efficient:

list.OrderByDescending(e => e.V1).ThenByDescending(e => e.V2).First();

So what kind of performance issue do you have? Maybe you are loooking at the wrong place or you call this code too often. Consider to store them already sorted, f.e. in a SortedList. I think that a SortedDictionary is even more efficient in this case.

The SortedDictionary<TKey, TValue> generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary. In this respect, it is similar to the SortedList<TKey, TValue> generic class. The two classes have similar object models, and both have O(log n) retrieval. Where the two classes differ is in memory use and speed of insertion and removal:

  • SortedList<TKey, TValue> uses less memory than SortedDictionary<TKey, TValue>.
  • SortedDictionary<TKey, TValue> has faster insertion and removal operations for unsorted data: O(log n) as opposed to O(n) for SortedList<TKey, TValue>.
  • If the list is populated all at once from sorted data, SortedList<TKey, TValue> is faster than SortedDictionary<TKey, TValue>.

Here is a possible implementation using a SortedDictionary<double, SortedSet<Obj>>:

SortedDictionary<double, SortedSet<Obj>> sortedLookup = 
    new SortedDictionary<double, SortedSet<Obj>>(); // key is V1 and value all items with that value

internal class ObjV2Comparer : IComparer<Obj>
{
    public int Compare(Obj x, Obj y)
    {
        return x.V2.CompareTo(y.V2);
    }
}

private static readonly ObjV2Comparer V2Comparer = new ObjV2Comparer();

public void Add(Obj obj)
{
    SortedSet<Obj> set;
    bool exists = sortedLookup.TryGetValue(obj.V1, out set);
    if(!exists)
        set = new SortedSet<Obj>(V2Comparer);
    set.Add(obj);
    sortedLookup[obj.V1] = set;
}
public Obj GetMaxItem()
{
    if (sortedLookup.Count == 0) return null;
    Obj maxV1Item = sortedLookup.Last().Value.Last();
    return maxV1Item;
}

Obj is your class that contains V1 and V2, i have presumed that V1 is a primitive type like double. GetMaxItem is the method that returns the max-item.


If V1 and V2 can contain duplicates you could try this approach, where the key of each SortedDictionary is the V1 value and the value is another SortedDictionary with the V2-key and all related objects.

SortedDictionary<double, SortedDictionary<double, List<Obj>>> sortedLookup =
    new SortedDictionary<double, SortedDictionary<double, List<Obj>>>();

public void Add(Obj obj)
{
    SortedDictionary<double, List<Obj>> value;
    bool exists = sortedLookup.TryGetValue(obj.V1, out value);
    if(!exists)
    {
        value = new SortedDictionary<double, List<Obj>>(){{obj.V2, new List<Obj>{obj}}};
        sortedLookup.Add(obj.V1, value);
    }
    else
    {
        List<Obj> list;
        exists = value.TryGetValue(obj.V2, out list);
        if (!exists)
            list = new List<Obj>();
        list.Add(obj);
        value[obj.V2] = list;
        sortedLookup[obj.V1] = value;
    }
}

public Obj GetMaxItem()
{
    if (sortedLookup.Count == 0) return null;
    Obj maxV1Item = sortedLookup.Last().Value.Last().Value.Last();
    return maxV1Item;
}
Community
  • 1
  • 1
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • You are right regarding storing the max value. I already store it in my actual code but didn't copy it into the question for brevity sake. – Kabbalah Jul 13 '15 at 08:41
  • @Kabbalah: i have edited my answer since i doubt that your performance issue is caused by this. How do you call this code, how do you measure the performance issue? Maybe another approach would be better like using a `SortedSet`. We need more informations to help further. – Tim Schmelter Jul 13 '15 at 08:44
  • It does get called VERY, VERY often. Presorting just moves the problem to another place because the list is different each time. – Kabbalah Jul 13 '15 at 08:44
  • @Kabbalah: so the list changes quickly and you also need to access the max-item very often? – Tim Schmelter Jul 13 '15 at 08:45
  • Yes, that's basically it. The lists are generated at a much faster rate than a simple sort or filter can handle – Kabbalah Jul 13 '15 at 08:47
  • @Kabbalah: then try `SortedList` or (probably better) `SortedDictionary`. – Tim Schmelter Jul 13 '15 at 08:50
  • I'll give SortedList a try. Dictionary seems to be a no-go because of duplicate sort keys... – Kabbalah Jul 13 '15 at 08:53
  • @Kabbalah: what is `V1`, a `double` or `int`? – Tim Schmelter Jul 13 '15 at 09:01
  • double... I'm actually using Math.Abs and tolerance value to compare for equality. V2 is also a double – Kabbalah Jul 13 '15 at 09:03
  • @Kabbalah: have a look at my answer, i've shown an example with a `SortedDictionary>` that allows duplicates because the value is a collection. Maybe that is an option. – Tim Schmelter Jul 13 '15 at 09:12
  • I think I wasn't clear enough: Both V1 and V2 can have duplicates and their combination can also be duplicates. {{1,1}, {1,1}, {2, 2}, {2, 2}} is a valid list. – Kabbalah Jul 13 '15 at 09:18
  • @Kabbalah: maybe this is overcomplicating things but have a look at my final approach. – Tim Schmelter Jul 13 '15 at 09:38
  • I have to test that and compare. Will take a while till I answer. – Kabbalah Jul 13 '15 at 09:44
  • @Kabbalah: of course not tested, so likely that you will find bugs (fixed already one) – Tim Schmelter Jul 13 '15 at 10:08
  • I meant to test the actual runtime – Kabbalah Jul 13 '15 at 11:01
  • @Kabbalah: i knew what you meant, it wasn't an answer to your comment. I just wanted to warn you that it's compiling but not free of bugs. – Tim Schmelter Jul 13 '15 at 11:04
  • Haven't actually tested it yet, but after thinking about my problem again I must admit that I had left out a small but significant detail in my question... I simplified too much: The variables on which the list is sorted depends on a parameter... This means sometimes V1 must be highest, other times V2 or even V3. I'm sorry having misled you. But I'll still remember your answer in case I need it one day. – Kabbalah Jul 13 '15 at 11:35
  • "your first approach should perform well, it's simple and efficient" - it may be simple, but it's *not* efficient. – Rawling Jul 13 '15 at 12:04
  • @Rawling: depends on what you call efficient. It has O(N log N) complexity so if you don't need to get that item often(which is the case here) it might be efficient enough. – Tim Schmelter Jul 13 '15 at 12:09
  • @TimSchmelter `OrderBy` is not O(n). – Rawling Jul 13 '15 at 12:11
2

Non-LINQ (I took System.Drawing.Point struct for this example):

    static Point GetHighestXY(Point[] points)
    {
        Point max = default(Point);
        for (int i = 0; i < points.Length; i++)
        {
            if (points[i].X < max.X) continue;
            if (points[i].X > max.X) { max = points[i]; }
            else { if (points[i].Y > max.Y) max = points[i]; }
        }
        return max;
    }

Usage example:

        Point[] pts =  
        {
            new Point(55, 8),
            new Point(55, 10),
            new Point(10, 10),
            new Point(22, 11),
            new Point(16, 33),                
            new Point(4, 104)
        };

        Point max = GetHighestXY(pts);

        Console.WriteLine("X : {0}  Y : {1} ", max.X, max.Y);    

Result : enter image description here

Fabjan
  • 13,506
  • 4
  • 25
  • 52
  • I'm not looking for two elements but one. The both X and Y of this element should be maxed. In your example the correct answer would be just {55, 8} – Kabbalah Jul 13 '15 at 09:26
  • This is so far the most efficient solution, beating Aggregate by a small but still significant percentage in my tests. Even though I normally prefer LINQ I do need the small performance boost in this case. Since I think it's impossible to get a better complexity than O(n) in this specific case, I'll accept this answer. I do feel stupid though because I didn't consider the classic iterative approach myself. – Kabbalah Jul 13 '15 at 14:04
  • @Kabbalah thank you. I tend to avoid using LINQ in my code because sometimes it is really SLOW! I use it only if development speed for current task is more important than performance losses (or if it is faster which is rare). – Fabjan Jul 13 '15 at 14:16
2

As always, if you just want the maximum value, there's no need to do any sorting - Aggregate is O(n):

var maxByBoth = items.Aggregate(
    (bestSoFar, current) =>
        {
            if (current.V1 > bestSoFar.V1)
                return current;
            if (current.V1 == bestSoFar.V1 && current.V2 > bestSoFar.V2)
                return current;
            return bestSoFar;
        });
Rawling
  • 49,248
  • 7
  • 89
  • 127
  • 1
    What is the benefit of `Aggregate` here over a simple `foreach`-loop? – Tim Schmelter Jul 13 '15 at 12:14
  • The LINQ tag. I was going for "better than the solution in the question" rather than "better than another solution that is better than the solution in the question". – Rawling Jul 13 '15 at 12:23