1

Say I have

List<int> ages  = new List<int>() { 8, 5, 3, 9, 2, 1, 7 };
List<int> marks = new List<int>() { 12, 17, 08, 15, 19, 02, 11 };

I can sort my marks by ages like this:

while (true)
{
  bool swapped = false;

  for (int i = 0; i < ages.Count - 1; i++)
    if (ages[i] > ages[i + 1])
    {
      int tmp = ages[i];
      ages[i] = ages[i + 1];
      ages[i + 1] = tmp;

      tmp = marks[i];
      marks[i] = marks[i + 1];
      marks[i + 1] = tmp;

      swapped = true;
    }

  if (!swapped)
    break;
}

Now I want to put this into a function that accepts any two lists. The first parameter will be the reference list, the numerical or comparable list. The second parameter will be the list containing the data.

For example:

public static void Sort<T>(List<T> RefList, List<T> DataList)
{
  // sorting logic here...
}

There are a few problems:

First of all, T is almost certainly not the same type in RefList and DataList. RefList might be dates, integers, or doubles; whereas DataList is free to be absolutely anything. I need to be able to receive two, arbitrary generic types.

Secondly, I cannot seem to use the > operator with the T in this line:

if (ages[i] > ages[i + 1])

Perhaps my whole approach is wrong.

By the way, I have read responses to similar questions that suggest that the two lists should be combined into a single list of a compound data type. This isn't practical at all for my application. All I want to do is write a static function that somehow sorts one list based on the elements of another.

user1002358
  • 2,852
  • 6
  • 22
  • 32
  • Zip the lists and sort lexicographically. – Mechanical snail Jan 09 '13 at 03:21
  • If you really want to go custom sorting rout - use `IComparable` or `IComparer` interface instead of `<`, or compare delegate as in OrderBy LINQ method. – Alexei Levenkov Jan 09 '13 at 03:32
  • -1 or not including the helpful compiler error: "cannot seem to use" is a poor description/summary. Looking up said error would have led you to the selected solution. (Which revolves around the fact that operators cannot be used with Generics and should have elicited a question like: "How can generic parameters be compared?") –  Jan 09 '13 at 03:32
  • 1
    I still don't like the custom bubble sort .. I thought we were past this in this day and age! – Simon Whitehead Jan 09 '13 at 03:34
  • possible duplicate of [C# Sort List Based on Another List](http://stackoverflow.com/questions/3355928/c-sharp-sort-list-based-on-another-list) – nawfal Oct 15 '13 at 15:11

4 Answers4

8

To sort one list the way you want you actually need to somehow keep references from items in first list to they weight/keys in the second list. No existing methods do that as you can't easily associate metadata with arbitrary values (i.e. if first list is list of int as in your case there is nothing to map to keys in second list). Your only reasonable option is to sort 2 lists at the same time and make association by index - again no existing classes to help.

It may be much easier to use solution that you reject. I.e. simply Zip and OrderBy, than recreate first list:

ages = ages
  .Zip(marks, (a,m)=> new {age = a; mark = m;})
  .OrderBy(v => v.mark)
  .Select(v=>v.age)
  .ToList();

Note (courtesy of phoog): if you need to do this type of sorting with Array there is Array.Sort that allows exactly this operatiion (see phoog's answer for details).

Alexei Levenkov
  • 98,904
  • 14
  • 127
  • 179
  • 1
    Zip is not available in standard 3.5 LINQ :( Select-with-index and Join can be used, albeit it is a good bit more verbose and uses a different algorithm. –  Jan 09 '13 at 03:29
  • Zip is easy to implement I like this solution, but I don't think it'll be very efficient for lists. – Eli Algranti Jan 09 '13 at 03:33
  • 1
    "No existing classes..." see my answer; `Array.Sort()` has overloads that do this, though you're then faced with the problem of getting the data into arrays and then back into lists. – phoog Jan 09 '13 at 03:39
  • Using my sequential algorithm with the signature from @EliAlgranti, I was able to sort 1 million items in about 3 seconds while debugging. Using your LINQ/Zip idea, the same list is sorted in about 12ms. – user1002358 Jan 09 '13 at 03:49
  • 1
    @user1002358 that's because your sort algorithm is *the textbook example* of a sort algorithm with *polynomial time complexity*. See http://en.wikipedia.org/wiki/Bubble_sort. If you replace your algorithm with a better one, you'll see performance closer to the LINQ performance. In fact, the performance might even be better, since LINQ allocates several objects, including a new million-item list, that you wouldn't need if you worked directly on the lists. Then again, the 12ms time is probably good enough, in which case, improving your version would be an academic exercise, not a practical one. – phoog Jan 09 '13 at 03:59
  • @user1002358 By the way, since the performance is proportional to the square of the number of items, a million-item sort should take about 100 times longer than sorting 100,000 items or 100,000,000 times longer than sorting 100 items. If you sort 100,000 items, does it take 0.03 seconds? – phoog Jan 09 '13 at 04:14
  • @phoog I was actually wondering about the performance of OrderBy in this case. Zip would isolate OrderBy from the underlying lists so it doesn't have O(1) access to the elements. In this case OrderBy would have to be O(n^2) speed to remain O(1) memory or O(n) memory to achieve O(nlogn). Do you know how this works? Could OrderBy still be O(n^2)/O(1) and achieve that performance? 1 million items is not that much nowadays. – Eli Algranti Jan 09 '13 at 04:20
  • Actually since OrderBy would requires multiple iterations on the underlying enumerations it would have to be O(n) memory anyway to allow sorting. – Eli Algranti Jan 09 '13 at 04:22
  • @EliAlgranti if you decompile the code, you'll see that OrderBy returns an object that iterates the underlying enumerable only once; it allocates a buffer for the items and an int array with as many elements. It fills the int array with indexes from 0 to (itemcount - 1), and then sorts that array by indexing into the buffer, using quicksort. Then it iterates the array, returning each item from the buffer by key. So yes, it's O(n) memory, but it only iterates the enumerable once and achieves O(n log n) by using quicksort. – phoog Jan 09 '13 at 04:47
  • @EliAlgranti It doesn't matter whether the enumerable implements `IList`; it copies all the elements to the internal `Buffer` class anyway. I'm sure there's a reason for that, but I can't think what it would be. – phoog Jan 09 '13 at 04:50
  • @phoog read my second comment. It needs to copy them to a buffer because an enumerable can only be iterated once (in the general case), which would make it impossible to sort even with an O(n^2) algorithm (which requires n iterations through the enumerable.) – Eli Algranti Jan 09 '13 at 05:04
  • @EliAlgranti but Linq to Objects is full of optimizations on certain well-known subtypes of `IEnumerable`; indexing directly into the list (if it *is* a list) could be one of those optimizations, I presume. – phoog Jan 09 '13 at 05:15
  • @phoog I addressed that as well. Using Zip isolates linq from the original underlying lists, so OrderBy would have to work with plain IEnumerables. There may be a version of OrderBy used with lists that creates a buffer with only the ordered indexes of the data in the lists not the actual data, but Zips prevents this implementation to be used. – Eli Algranti Jan 09 '13 at 05:31
  • @EliAlgranti I was describing OrderBy generally. There is no "version" of OrderBy that operates directly on lists. Special-case logic in the Enumerable class is not implemented with overloads, by the way, but with conditional logic. There is no conditional logic that allows OrderBy to operate directly on lists. I am speculating that this optimization would be possible, but perhaps there is a reason to avoid it that I cannot think of. Your comments do not address that speculation. – phoog Jan 09 '13 at 05:45
8

There's no framework method to do this with List<T>, but if you don't mind putting the data into two arrays, you can use one of the Array.Sort() overloads that takes two arrays as arguments. The first array is the keys, and the second is the values, so your code might look like this (leaving aside the step of getting arrays from the lists):

Array.Sort(ages, marks);

The specifics of getting the values into arrays and then back into lists would depend, among other things, on whether you need to end up with the same list sorted appropriately or whether it's okay to return a new list with the data in the desired order.

phoog
  • 42,068
  • 6
  • 79
  • 117
  • +1. Nice find... and thanks a lot for useful comments. If using with Lists one need to measure performance to see if converting back and forth eats all saving from not using extra object due to Zip/Select+OrderBy. – Alexei Levenkov Jan 09 '13 at 05:18
4

Use:

public static void Sort<TR, TD>(IList<TR> refList, IList<TD> dataList)
        where TR : System.IComparable<TR>
        where TD : System.IComparable<TD>
{
 ...
}

and then use:

refList[i].CompareTo(refList[i+1])

instead of the operators.

.Net numbers already implement IComparable, and you can use overloads that allow you to specify a different IComparable.

Eli Algranti
  • 8,707
  • 2
  • 42
  • 50
  • This is exactly how I implemented it. Thank you. (Except I don't need `IComparable` on `TD`) – user1002358 Jan 09 '13 at 03:34
  • if the type does not implement IComparable you can also pass in a custom IComparer, also TD does not need to be compared in this instance – Gary.S Jan 09 '13 at 03:35
  • +1. for correct interface (also second version as in all Sort/OrderBy calls would be useful). I'd personally avoid part of writing sort by hand as writing properly working sort methods is not exactly trivial and clear duplication of effort... – Alexei Levenkov Jan 09 '13 at 03:35
  • 1
    @Alexei this is only the interface, you can probably use pre-existing sorting algorithms (which are going to require and IComparable anyway) in the implementation. – Eli Algranti Jan 09 '13 at 03:37
2

If I understand "I can sort my marks by ages like this:" properly,

I would like to suggest the below to eliminate much confusion.

struct Student{
    int age;
    int marks;
};

List<Student> students = {{8,12}, ...};

Now you can sort according to age and marks is accordingly sorted automatically.

If it is not possible, you need to fix the code as below.

First of all, T is almost certainly not the same type in RefList and DataList.

Then you need 2 parameters T1, T2. Just T implies the types are the same.

public static void Sort<RefType, DataType>(List<RefType> RefList, List<DataType> DataList)
{

You can also zip the two lists together as suggested by Mechanical Snail and explained in Looping through 2 Lists at once

Community
  • 1
  • 1
Karthik T
  • 31,456
  • 5
  • 68
  • 87
  • I specifically said I didn't want to combine the data together into a composite type. – user1002358 Jan 09 '13 at 03:24
  • 1
    @user1002358 There will be a composite type involed (although it might be anonymous and/or only internal to the method; even Zip'ing uses a composite type) to use *any* of the built-in Sort (List or LINQ) methods efficiently. –  Jan 09 '13 at 03:26
  • +1. Totally agree with fact that there is no way to avoid association of some sort. Custom code to sort 2 arrays at the same time is an option, but one need to have very good reasons to go custom sorting route. – Alexei Levenkov Jan 09 '13 at 03:30
  • @AlexeiLevenkov Because I find myself doing this type of sorting very frequently in my job, and I wanted to write a function that does it for any two lists. I don't want to have to combine, sort, and extract my data, I just want to `myList.Sort(myRef);` and it's done. – user1002358 Jan 09 '13 at 03:33
  • @user1002358 So, write that function *once* (note that the combining can be done *inside* said function) .. –  Jan 09 '13 at 03:34
  • @user1002358 The example Karthik gave you does work for any two lists. – JLRishe Jan 09 '13 at 03:37