135

I can sort a list using Sort or OrderBy. Which one is faster? Are both working on same algorithm?

List<Person> persons = new List<Person>();
persons.Add(new Person("P005", "Janson"));
persons.Add(new Person("P002", "Aravind"));
persons.Add(new Person("P007", "Kazhal"));

1.

persons.Sort((p1,p2)=>string.Compare(p1.Name,p2.Name,true));

2.

var query = persons.OrderBy(n => n.Name, new NameComparer());

class NameComparer : IComparer<string>
{
    public int Compare(string x,string y)
    {
      return  string.Compare(x, y, true);
    }
}
abatishchev
  • 98,240
  • 88
  • 296
  • 433
user215675
  • 4,991
  • 9
  • 36
  • 40
  • 63
    I can't believe that none of the answers mentioned this, but the biggest difference is this: OrderBy makes a sorted copy of the Array or List, while Sort actually sorts it in place. – PRMan Feb 16 '18 at 23:39
  • 3
    as title say comparison, I would like to add that OrderBy is stable and sort is stable upto 16 elements as upto 16 elements insertion sort is used if elements are more than that then it switches to other unstable algos Edit : stable means maintaining the relative order of elements having same key. – Eklavyaa Jun 04 '18 at 09:13
  • @PRMan Nope, OrderBy creates a lazy enumerable. Only if you call a method such as ToList on the returned enumerable do you get a sorted copy. – Stewart May 21 '19 at 09:57
  • 1
    @Stewart, You don't consider the Array.Copy or Collection.Copy into TElement[] in Buffer in System.Core/System/Linq/Enumerable.cs to be a copy? And if you call ToList on the IEnumerable, you could momentarily have 3 copies in memory at once. This is a problem for very large arrays, which was part of my point. Also, if you need the same sorted order more than once, then calling Sort in-place once is much more efficient than repeatedly sorting the List, because of its permanence. – PRMan May 21 '19 at 18:03
  • 1
    @PRMan Oh, you meant a sorted copy is built internally. Still that's inaccurate, as OrderBy doesn't create the copy - from what I can see, this is done by the GetEnumerator method when you actually begin to loop through the collection. I just tried stepping through my code, and found that the code that populates a variable from a LINQ expression runs almost instantly, but when you go into the foreach loop it spends time sorting it. I guess when I've a bit more time I should spend some trying to figure out how it works behind the scenes. – Stewart May 22 '19 at 09:16

6 Answers6

137

No, they aren't the same algorithm. For starters, the LINQ OrderBy is documented as stable (i.e. if two items have the same Name, they'll appear in their original order).

It also depends on whether you buffer the query vs iterate it several times (LINQ-to-Objects, unless you buffer the result, will re-order per foreach).

For the OrderBy query, I would also be tempted to use:

OrderBy(n => n.Name, StringComparer.{yourchoice}IgnoreCase);

(for {yourchoice} one of CurrentCulture, Ordinal or InvariantCulture).

List<T>.Sort

This method uses Array.Sort, which uses the QuickSort algorithm. This implementation performs an unstable sort; that is, if two elements are equal, their order might not be preserved. In contrast, a stable sort preserves the order of elements that are equal.

Enumerable.OrderBy

This method performs a stable sort; that is, if the keys of two elements are equal, the order of the elements is preserved. In contrast, an unstable sort does not preserve the order of elements that have the same key. sort; that is, if two elements are equal, their order might not be preserved. In contrast, a stable sort preserves the order of elements that are equal.

Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
  • 7
    If you use .NET Reflector or ILSpy to crack open `Enumerable.OrderBy` and drill down into its internal implementation, you can see that the OrderBy sorting algorithm is a variant of QuickSort that does a stable sort. (See `System.Linq.EnumerableSorter`.) Thus, `Array.Sort` and `Enumerable.OrderBy` can both be expected to have _O(N log N)_ execution times, where _N_ is the number of elements in the collection. – John Beyer Jun 26 '13 at 16:25
  • @Marc I don't quite follow what the difference would be if two elements were equal and their order was not preserved. This certainly does not look like a problem for primitive data types. But even for a reference type, why would it matter, if I were to sort, person with name Marc Gravell appeared before another person with name Marc Gravell (for instance :))? I am not questioning your answer/knowledge, rather looking for an application of this scenario. – Mukus Mar 17 '14 at 02:58
  • 7
    @Mukus imagine you sort a company address book by name (or indeed by date of birth) - there are inevitably going to be duplicates. The question is ultimately: what happens for them? Is the sub-order defined? – Marc Gravell Mar 17 '14 at 07:45
104

Why not measure it:

class Program
{
    class NameComparer : IComparer<string>
    {
        public int Compare(string x, string y)
        {
            return string.Compare(x, y, true);
        }
    }

    class Person
    {
        public Person(string id, string name)
        {
            Id = id;
            Name = name;
        }
        public string Id { get; set; }
        public string Name { get; set; }
    }

    static void Main()
    {
        List<Person> persons = new List<Person>();
        persons.Add(new Person("P005", "Janson"));
        persons.Add(new Person("P002", "Aravind"));
        persons.Add(new Person("P007", "Kazhal"));

        Sort(persons);
        OrderBy(persons);

        const int COUNT = 1000000;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < COUNT; i++)
        {
            Sort(persons);
        }
        watch.Stop();
        Console.WriteLine("Sort: {0}ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        for (int i = 0; i < COUNT; i++)
        {
            OrderBy(persons);
        }
        watch.Stop();
        Console.WriteLine("OrderBy: {0}ms", watch.ElapsedMilliseconds);
    }

    static void Sort(List<Person> list)
    {
        list.Sort((p1, p2) => string.Compare(p1.Name, p2.Name, true));
    }

    static void OrderBy(List<Person> list)
    {
        var result = list.OrderBy(n => n.Name, new NameComparer()).ToArray();
    }
}

On my computer when compiled in Release mode this program prints:

Sort: 1162ms
OrderBy: 1269ms

UPDATE:

As suggested by @Stefan here are the results of sorting a big list fewer times:

List<Person> persons = new List<Person>();
for (int i = 0; i < 100000; i++)
{
    persons.Add(new Person("P" + i.ToString(), "Janson" + i.ToString()));
}

Sort(persons);
OrderBy(persons);

const int COUNT = 30;
Stopwatch watch = Stopwatch.StartNew();
for (int i = 0; i < COUNT; i++)
{
    Sort(persons);
}
watch.Stop();
Console.WriteLine("Sort: {0}ms", watch.ElapsedMilliseconds);

watch = Stopwatch.StartNew();
for (int i = 0; i < COUNT; i++)
{
    OrderBy(persons);
}
watch.Stop();
Console.WriteLine("OrderBy: {0}ms", watch.ElapsedMilliseconds);

Prints:

Sort: 8965ms
OrderBy: 8460ms

In this scenario it looks like OrderBy performs better.


UPDATE2:

And using random names:

List<Person> persons = new List<Person>();
for (int i = 0; i < 100000; i++)
{
    persons.Add(new Person("P" + i.ToString(), RandomString(5, true)));
}

Where:

private static Random randomSeed = new Random();
public static string RandomString(int size, bool lowerCase)
{
    var sb = new StringBuilder(size);
    int start = (lowerCase) ? 97 : 65;
    for (int i = 0; i < size; i++)
    {
        sb.Append((char)(26 * randomSeed.NextDouble() + start));
    }
    return sb.ToString();
}

Yields:

Sort: 8968ms
OrderBy: 8728ms

Still OrderBy is faster

Darin Dimitrov
  • 1,023,142
  • 271
  • 3,287
  • 2,928
  • 4
    I think, it's much different of sorting a very small list (3 items) 1000000 times, or by sorting a very large list (1000000 items) just a few times. Both is very relevant. In practice, medium size of list (what's medium? ... let's say 1000 items for now) is most interesting. IMHO, sorting lists with 3 items is not very meaningful. – Stefan Steinegger Dec 02 '09 at 12:58
  • @Stefan, in reality the list may indeed be more. The point is it demonstrates the speed difference. – James Dec 02 '09 at 13:06
  • 26
    Note that there is a difference between "faster" and "noticably faster". In your last example the difference was about a quarter of a second. Is the user going to notice? Is it unacceptable for the user to wait almost nine seconds for the result? If the answers to both questions are "no" then it really doesn't matter which one you pick from a performance perspective. – Eric Lippert Dec 02 '09 at 16:10
  • 14
    Note also that the test here sorts the list before starting the stopwatch, so we are comparing how the two algorithms compare when faced with sorted input. This may be quite different than their relative performance with unsorted input. – phoog Feb 06 '12 at 06:00
  • 5
    These results are pretty surprising IMHO, considering the fact that `LINQ` has to spend additional memory compared to an in-place `List.Sort` implementation. I am not sure if they improved this in newer .NET versions, but on my machine (i7 3rd gen 64-bit .NET 4.5 release) `Sort` outperforms `OrderBy` in all cases. Furthermore, by looking at `OrderedEnumerable` source code, it seems that it creates **three** additional arrays (first a `Buffer`, then an array of projected keys, then an array of indices) before finally calling Quicksort to sort the array of indices in place. – vgru Sep 17 '13 at 12:33
  • 2
    ...and then after all this, there is the `ToArray` call which creates the resulting array. Memory operations and array indexing are incredibly fast operations, but I still can't find the logic behind these results. – vgru Sep 17 '13 at 12:37
  • 1
    The logic is that Array.Copy boils down to a single x86 instruction, which is lightning fast on Intel. – PRMan May 21 '19 at 18:07
  • 2
    I think your method for `Sort` is problematic, because it mutates the supplied `List`, which makes future sorts faster because the list is already sorted. I would suggest creating a fresh list of random strings for each iteration so distribution of results is more uniform. – extremeandy Jun 21 '19 at 00:19
  • 1
    notice that you are sorting the list first and then running OrderBy. In order to compare you need to run once OrderBy() and another time Sort() on the very same elements. When running OrderBy on a sorted input it can show good performance. So your summary that OrderBy is can be inaccurate of even wrong – Samer Aamar Dec 07 '20 at 13:16
64

Darin Dimitrov's answer shows that OrderBy is slightly faster than List.Sort when faced with already-sorted input. I modified his code so it repeatedly sorts the unsorted data, and OrderBy is in most cases slightly slower.

Furthermore, the OrderBy test uses ToArray to force enumeration of the Linq enumerator, but that obviously returns a type (Person[]) which is different from the input type (List<Person>). I therefore re-ran the test using ToList rather than ToArray and got an even bigger difference:

Sort: 25175ms
OrderBy: 30259ms
OrderByWithToList: 31458ms

The code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

class Program
{
    class NameComparer : IComparer<string>
    {
        public int Compare(string x, string y)
        {
            return string.Compare(x, y, true);
        }
    }

    class Person
    {
        public Person(string id, string name)
        {
            Id = id;
            Name = name;
        }
        public string Id { get; set; }
        public string Name { get; set; }
        public override string ToString()
        {
            return Id + ": " + Name;
        }
    }

    private static Random randomSeed = new Random();
    public static string RandomString(int size, bool lowerCase)
    {
        var sb = new StringBuilder(size);
        int start = (lowerCase) ? 97 : 65;
        for (int i = 0; i < size; i++)
        {
            sb.Append((char)(26 * randomSeed.NextDouble() + start));
        }
        return sb.ToString();
    }

    private class PersonList : List<Person>
    {
        public PersonList(IEnumerable<Person> persons)
           : base(persons)
        {
        }

        public PersonList()
        {
        }

        public override string ToString()
        {
            var names = Math.Min(Count, 5);
            var builder = new StringBuilder();
            for (var i = 0; i < names; i++)
                builder.Append(this[i]).Append(", ");
            return builder.ToString();
        }
    }

    static void Main()
    {
        var persons = new PersonList();
        for (int i = 0; i < 100000; i++)
        {
            persons.Add(new Person("P" + i.ToString(), RandomString(5, true)));
        } 

        var unsortedPersons = new PersonList(persons);

        const int COUNT = 30;
        Stopwatch watch = new Stopwatch();
        for (int i = 0; i < COUNT; i++)
        {
            watch.Start();
            Sort(persons);
            watch.Stop();
            persons.Clear();
            persons.AddRange(unsortedPersons);
        }
        Console.WriteLine("Sort: {0}ms", watch.ElapsedMilliseconds);

        watch = new Stopwatch();
        for (int i = 0; i < COUNT; i++)
        {
            watch.Start();
            OrderBy(persons);
            watch.Stop();
            persons.Clear();
            persons.AddRange(unsortedPersons);
        }
        Console.WriteLine("OrderBy: {0}ms", watch.ElapsedMilliseconds);

        watch = new Stopwatch();
        for (int i = 0; i < COUNT; i++)
        {
            watch.Start();
            OrderByWithToList(persons);
            watch.Stop();
            persons.Clear();
            persons.AddRange(unsortedPersons);
        }
        Console.WriteLine("OrderByWithToList: {0}ms", watch.ElapsedMilliseconds);
    }

    static void Sort(List<Person> list)
    {
        list.Sort((p1, p2) => string.Compare(p1.Name, p2.Name, true));
    }

    static void OrderBy(List<Person> list)
    {
        var result = list.OrderBy(n => n.Name, new NameComparer()).ToArray();
    }

    static void OrderByWithToList(List<Person> list)
    {
        var result = list.OrderBy(n => n.Name, new NameComparer()).ToList();
    }
}
phoog
  • 42,068
  • 6
  • 79
  • 117
  • 2
    I run the test code now in LinqPad 5 (.net 5) and the `OrderByWithToList` takes the same time as `OrderBy`. – dovid Mar 06 '17 at 19:47
42

I think it's important to note another difference between Sort and OrderBy:

Suppose there exists a Person.CalculateSalary() method, which takes a lot of time; possibly more than even the operation of sorting a large list.

Compare

// Option 1
persons.Sort((p1, p2) => Compare(p1.CalculateSalary(), p2.CalculateSalary()));
// Option 2
var query = persons.OrderBy(p => p.CalculateSalary()); 

Option 2 may have superior performance, because it only calls the CalculateSalary method n times, whereas the Sort option might call CalculateSalary up to 2n log(n) times, depending on the sort algorithm's success.

Tim
  • 88
  • 6
Omer Raviv
  • 11,409
  • 5
  • 43
  • 82
  • 5
    This is true, though there is a solution to that problem, namely, to keep the data in an array and use the Array.Sort overload that takes two arrays, one of keys and the other of values. In filling the key array, you will call CalculateSalary `n` times. This is obviously not as convenient as using OrderBy. – phoog Jun 09 '15 at 02:58
23

In a nutshell :

List/Array Sort() :

  • Unstable sort.
  • Done in-place.
  • Use Introsort/Quicksort.
  • Custom comparison is done by providing a comparer. If comparison is expensive, it might be slower than OrderBy() (which allow to use keys, see below).

OrderBy/ThenBy() :

  • Stable sort.
  • Not in-place.
  • Use Quicksort. Quicksort is not a stable sort. Here is the trick : when sorting, if two elements have equal key, it compares their initial order (which has been stored before sorting).
  • Allows to use keys (using lambdas) to sort elements on their values (eg : x => x.Id). All keys are extracted first before sorting. This might result in better performance than using Sort() and a custom comparer.

Sources: MDSN, reference source and dotnet/coreclr repository (GitHub).

Some of the statements listed above are based on current .NET framework implementation (4.7.2). It might change in the future.

tigrou
  • 4,236
  • 5
  • 33
  • 59
-2

I just want to add that orderby is way more useful.

Why? Because I can do this:

Dim thisAccountBalances = account.DictOfBalances.Values.ToList
thisAccountBalances.ForEach(Sub(x) x.computeBalanceOtherFactors())
thisAccountBalances=thisAccountBalances.OrderBy(Function(x) x.TotalBalance).tolist
listOfBalances.AddRange(thisAccountBalances)

Why complicated comparer? Just sort based on a field. Here I am sorting based on TotalBalance.

Very easy.

I can't do that with sort. I wonder why. Do fine with orderBy.

As for speed it's always O(n).

James Skemp
  • 8,018
  • 9
  • 64
  • 107
user4951
  • 32,206
  • 53
  • 172
  • 282
  • 4
    Question: The O(n) Time(I assume) in your answer refers to OrderBy or Comparer? I don't think quick sort can achieve O(N) time. – Kevman Mar 23 '18 at 17:22