I am looking for a way to create a new collection from an old one, that contains the same elements.

For HashSet<T> it works like this:

HashSet<T> oldSet = ... // consider it filled with elements
HashSet<T> newSet = new HashSet<T>(oldSet);

For List<T> it is analogous:

List<T> oldList = ... // consider it filled with elements
List<T> newList = new List<T>(oldList);

As far as I know, all ICollection<T> implementations have this type of copy constructor.

Is there a method (let's call it CreateCopy for now) that does this for all ICollection<T>? So that I can call it like this?

ICollection<T> oldColl = ... // can be List, HashSet, ...
ICollection<T> newColl = oldColl.CreateCopy(); // will have the same type as oldColl

If not, how can I write my own method to achieve this? The only idea that comes to my mind looks like this:

public static ICollection<T> CreateCopy<T>(this ICollection<T> c)
{
    if (c is List<T>) return new List<T>(c);
    else if (c is HashSet<T>) return new HashSet<T>(c);
    else if ...
}

but of course this is a horrible solution - whenever a new implementation of ICollection<T> comes around, I need to update that method...

Kjara

3 Answers

One option to do it without reflection is this:

public static class Extensions {
    public static TCollection CreateCopy<TCollection, TItem>(this TCollection c) where TCollection : ICollection<TItem>, new() {
        var copy = new TCollection();
        foreach (var item in c) {
            copy.Add(item);
        }
        return copy;
    }
}

This has the following benefits:

  • Type-safe. You cannot pass an ICollection<T> implementation that lacks a parameterless constructor (and such implementations exist).
  • No reflection.
  • Returns the same type you passed in (so HashSet<T> if you passed HashSet<T>, not generic ICollection<T>).

Drawbacks:

  • The major one is the syntax you have to use to call it:

    var set = new HashSet<string>();
    // have to specify type arguments because they cannot be inferred
    var copy = set.CreateCopy<HashSet<string>, string>();
    
  • You cannot call it on a variable typed as the interface (ICollection<T> itself); the static type must be a concrete class (HashSet<T>, List<T>, etc.).
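
To make the trade-offs above concrete, here is a minimal sketch that repeats the extension method and calls it on a List<T> and a HashSet<T>. The Program class and the sample values are only for illustration. It also shows one further caveat that follows from using new(): anything passed to the original's constructor, such as a custom comparer, is not carried over to the copy.

```csharp
using System;
using System.Collections.Generic;

public static class Extensions
{
    // Same shape as the method above: create an empty TCollection and Add each item.
    public static TCollection CreateCopy<TCollection, TItem>(this TCollection c)
        where TCollection : ICollection<TItem>, new()
    {
        var copy = new TCollection();
        foreach (var item in c)
        {
            copy.Add(item);
        }
        return copy;
    }
}

public static class Program
{
    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        List<int> listCopy = list.CreateCopy<List<int>, int>();
        listCopy.Add(4); // only the copy changes

        Console.WriteLine(list.Count);     // 3
        Console.WriteLine(listCopy.Count); // 4

        // Caveat: the copy is built with the parameterless constructor,
        // so the case-insensitive comparer of the original is lost.
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a" };
        var setCopy = set.CreateCopy<HashSet<string>, string>();

        Console.WriteLine(set.Contains("A"));     // True
        Console.WriteLine(setCopy.Contains("A")); // False
    }
}
```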

Evk

If it implements IEnumerable<T> you can use an identity projection:

var copy = myEnumerable.Select(item => item);

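Of course, this is only a shallow copy: if T is a reference type, you will only be copying the references, so both enumerables will point to the same objects. Note also that Select is lazily evaluated, so the result is a view over the original sequence rather than a snapshot; materialize it (for example with ToList()) if it must be independent of later changes to the source.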

Also, you lose the specialization of the original enumerable, but that can't be avoided unless you actually write an overload for all expected collections.
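
As a small illustration of how the identity projection behaves, the sketch below compares it with an eager ToList() call. The variable names are made up for the example; the point is only that the projection is evaluated when it is enumerated, so it reflects changes made to the source afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        var source = new List<int> { 1, 2, 3 };

        // Deferred: no elements are copied at this point.
        IEnumerable<int> view = source.Select(item => item);

        // Eager: the elements are copied into a new list right now.
        List<int> snapshot = source.ToList();

        source.Add(4);

        Console.WriteLine(view.Count());   // 4, the projection reads the modified source
        Console.WriteLine(snapshot.Count); // 3, the snapshot is unaffected
    }
}
```

If an independent copy is what you need, the eager form is the safer choice; the lazy form is only a copy in the sense that it yields the same elements.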

InBetween
  • Do you know if, in case `myEnumerable` implements `ICollection`, the cast `(ICollection)myEnumerable.Select(item => item)` is safe? – Kjara Oct 11 '17 at 13:07
  • @Kjara No, that won't work, as you can see by just running the code and seeing the exception get thrown. – Servy Oct 11 '17 at 13:23
  • @Kjara No, an `ICollection` is an `IEnumerable`, but it doesn't work the other way around; not every `IEnumerable` is an `ICollection`. – InBetween Oct 11 '17 at 13:33

There actually are three possible ways to do this:

public static ICollection<T> CreateCopyReflection<T> (this ICollection<T> c)
{
    var n = (ICollection<T>) Activator.CreateInstance (c.GetType());
    foreach (var item in c)
        n.Add (item);
    return n;
}

public static IEnumerable<T> CreateCopyLinq<T> (this IEnumerable<T> c) => c.Select (arg => arg);

public static IEnumerable<T> CreateCopyEnumeration<T> (this IEnumerable<T> c)
{
    foreach (var item in c)
        yield return item;
}

Note that we can use IEnumerables here without worrying, as ICollection<T> derives from IEnumerable<T>.
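
One caveat about the reflection variant is worth a quick sketch: it assumes the runtime type has a public parameterless constructor and a working Add. That holds for List<T> and HashSet<T>, but not for every ICollection<T>. Arrays, for example, implement ICollection<T> and have no such constructor. The Program class below is only a demonstration harness.

```csharp
using System;
using System.Collections.Generic;

public static class Extensions
{
    // Same method as above, repeated so the example is self-contained.
    public static ICollection<T> CreateCopyReflection<T>(this ICollection<T> c)
    {
        var n = (ICollection<T>)Activator.CreateInstance(c.GetType());
        foreach (var item in c)
            n.Add(item);
        return n;
    }
}

public static class Program
{
    public static void Main()
    {
        ICollection<int> list = new List<int> { 1, 2, 3 };
        var listCopy = list.CreateCopyReflection();
        Console.WriteLine(listCopy.GetType() == typeof(List<int>)); // True

        // Arrays implement ICollection<T> but cannot be created this way.
        ICollection<int> array = new[] { 1, 2, 3 };
        try
        {
            array.CreateCopyReflection();
        }
        catch (MissingMethodException)
        {
            Console.WriteLine("int[] has no parameterless constructor");
        }
    }
}
```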

The first solution creates a copy using reflection, the second one using Linq and the third one using enumeration. We can now profile this with the following code:

var myList = Enumerable.Range (0, 100000000).ToList();
var trueCopy = new List<int> (myList);
var time = Environment.TickCount;
var copyOne = myList.CreateCopyReflection();
Console.WriteLine($"Reflection copy: {Environment.TickCount - time}");
time = Environment.TickCount;
var copyTwo = myList.CreateCopyLinq ();
Console.WriteLine ($"Linq copy: {Environment.TickCount - time}");
time = Environment.TickCount;
var copyThree = myList.CreateCopyEnumeration ();
Console.WriteLine ($"Enumeration copy: {Environment.TickCount - time}");
time = Environment.TickCount;

Which results in:

Reflection copy: 1375
Linq copy: 0
Enumeration copy: 0

However, we have to keep in mind that Select and iterator methods are lazily evaluated, which means nothing was actually computed for the last two measurements. We only get comparable results when we enumerate the IEnumerables:

var myList = Enumerable.Range (0, 100000000).ToList();
var trueCopy = new List<int> (myList);
var time = Environment.TickCount;
var copyOne = myList.CreateCopyReflection().ToList();
Console.WriteLine($"Reflection copy: {Environment.TickCount - time}");
time = Environment.TickCount;
var copyTwo = myList.CreateCopyLinq ().ToList();
Console.WriteLine ($"Linq copy: {Environment.TickCount - time}");
time = Environment.TickCount;
var copyThree = myList.CreateCopyEnumeration ().ToList();
Console.WriteLine ($"Enumeration copy: {Environment.TickCount - time}");
time = Environment.TickCount;

Which results in:

Reflection copy: 1500
Linq copy: 1625
Enumeration copy: 3140

So we can see that enumeration is the slowest, followed by LINQ and then reflection. However, reflection and LINQ are very close, and LINQ (like the enumeration version) has the often significant advantage of being lazy, which is why I'd use it.
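
The timings above rely on Environment.TickCount, whose resolution is fairly coarse. A sketch of the same kind of measurement with System.Diagnostics.Stopwatch might look like the following. The Measure helper is hypothetical, the list is smaller than in the runs above to keep the example quick, and no results are claimed here, since they depend on the machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class Program
{
    // Hypothetical helper: runs the action once and returns the elapsed milliseconds.
    private static long Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var myList = Enumerable.Range(0, 1000000).ToList();

        // ToList() forces the lazy sequence to be enumerated inside the timed region.
        long linqMs = Measure(() => myList.Select(arg => arg).ToList());

        // The copy constructor is eager, so no extra enumeration is needed.
        long ctorMs = Measure(() => new List<int>(myList));

        Console.WriteLine($"Linq copy: {linqMs} ms");
        Console.WriteLine($"Constructor copy: {ctorMs} ms");
    }
}
```

For anything beyond a rough comparison, repeating each measurement several times and discarding the first run (which includes JIT compilation) gives more stable numbers.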


It's also interesting to compare these with an if cascade:

private static void Main ()
{
    var myList = Enumerable.Range (0, 100000000).ToList();
    var trueCopy = new List<int> (myList);
    var time = Environment.TickCount;
    var copyOne = myList.CreateCopyReflection().ToList();
    Console.WriteLine($"Reflection copy: {Environment.TickCount - time}");
    time = Environment.TickCount;
    var copyTwo = myList.CreateCopyLinq ().ToList();
    Console.WriteLine ($"Linq copy: {Environment.TickCount - time}");
    time = Environment.TickCount;
    var copyThree = myList.CreateCopyEnumeration ().ToList();
    Console.WriteLine ($"Enumeration copy: {Environment.TickCount - time}");
    time = Environment.TickCount;
    var copyFour = myList.CreateCopyCascade ();
    Console.WriteLine($"Cascade copy: {Environment.TickCount - time}");
    time = Environment.TickCount;

    Console.ReadLine ();
}

public static ICollection<T> CreateCopyReflection<T> (this ICollection<T> c)
{
    var n = (ICollection<T>) Activator.CreateInstance (c.GetType());
    foreach (var item in c)
        n.Add (item);
    return n;
}

public static IEnumerable<T> CreateCopyLinq<T> (this IEnumerable<T> c) => c.Select (arg => arg);

public static IEnumerable<T> CreateCopyEnumeration<T> (this IEnumerable<T> c)
{
    foreach (var item in c)
        yield return item;
}

public static ICollection<T> CreateCopyCascade<T> (this ICollection<T> c)
{
    if (c.GetType() == typeof(List<T>))
        return new List<T> (c);
    if (c.GetType() == typeof(HashSet<T>))
        return new HashSet<T> (c);
    //...
    return null;
}

Which results in:

Reflection copy: 1594
Linq copy: 1750
Enumeration copy: 3141
Cascade copy: 172

So we can see that the cascade is far faster. However, it won't work for other collection types that implement ICollection<T>, since it doesn't know about them (here it simply returns null), so this solution isn't very advisable.
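
A possible middle ground, sketched here under assumptions rather than measured: look up a public constructor that takes an IEnumerable<T> through reflection and fall back to the parameterless constructor plus Add. The name CreateCopyViaConstructor is made up for this example and is not a framework API. It keeps the runtime type without an explicit list of known collections, but it still fails for types that offer neither constructor, and it does not preserve constructor state such as a custom comparer.

```csharp
using System;
using System.Collections.Generic;

public static class Extensions
{
    // Hypothetical helper, not part of the framework.
    public static ICollection<T> CreateCopyViaConstructor<T>(this ICollection<T> c)
    {
        Type type = c.GetType();

        // Prefer a public constructor taking IEnumerable<T>,
        // which List<T> and HashSet<T> both provide.
        var copyCtor = type.GetConstructor(new[] { typeof(IEnumerable<T>) });
        if (copyCtor != null)
            return (ICollection<T>)copyCtor.Invoke(new object[] { c });

        // Otherwise fall back to a parameterless constructor plus Add.
        var copy = (ICollection<T>)Activator.CreateInstance(type);
        foreach (var item in c)
            copy.Add(item);
        return copy;
    }
}

public static class Program
{
    public static void Main()
    {
        ICollection<string> set = new HashSet<string> { "a", "b" };
        ICollection<string> copy = set.CreateCopyViaConstructor();

        Console.WriteLine(copy.GetType() == typeof(HashSet<string>)); // True
        Console.WriteLine(copy.Count);                                // 2
    }
}
```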

MetaColon