public IEnumerable<string> ListFoldersInternal(IEnumerable<CloudBlobDirectory> folders)
{
    return new HashSet<string>(folders.Select(x => x.Prefix));
}

Is it a good choice to use `HashSet` to avoid duplicates and return `IEnumerable<string>`?

Any ideas to improve this code?

  • It's the most efficient way http://stackoverflow.com/questions/30366669/most-efficient-way-to-remove-duplicates-from-a-list – fubo Jul 19 '16 at 11:40
  • If you call `ToArray()`, you might as well make the return type `CloudItem[]`. If the return type is `IList`, just call `ToList()` – Dennis_E Jul 19 '16 at 11:42
  • They made a method for the exact purpose of avoiding duplicates. It's called `Distinct()`. – Dennis_E Jul 19 '16 at 11:45
  • Does your `ListFolders` method make use of your `ListFoldersInternal` method? If not, I'd rename them, because the implications could be confusing. – leetibbett Jul 19 '16 at 11:49

2 Answers


A more readable way might be to use the LINQ extension `Distinct`:

return folders.Select(x => x.Prefix).Distinct();

It is implemented in a similar fashion internally (using its own cut-down hash set), but it evaluates lazily: nothing is enumerated until the caller iterates the result, and each unique item is yielded as it is encountered.
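The eager-versus-deferred difference can be shown with a small sketch; the sample prefixes below are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class DistinctDemo
{
    static void Main()
    {
        var prefixes = new List<string> { "a/", "b/", "a/", "c/", "b/" };

        // Eager: the HashSet is populated immediately, dropping duplicates up front.
        var eager = new HashSet<string>(prefixes);

        // Deferred: Distinct() does no work until the result is enumerated.
        var deferred = prefixes.Distinct();

        Console.WriteLine(eager.Count);      // 3
        Console.WriteLine(deferred.Count()); // 3 (enumeration happens here)
    }
}
```

Deferred execution also means that if the source sequence changes before enumeration, `Distinct()` reflects the new contents, whereas the `HashSet` constructor takes a snapshot at construction time.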

Charles Mager

Using a `HashSet<T>` is a good idea and definitely an effective way to avoid duplicates. MSDN says:

The HashSet<T> class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order.
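A minimal sketch of that behaviour: `Add` returns `false` when the element is already present, so duplicates are silently ignored (the folder names here are made up for illustration):

```csharp
using System;
using System.Collections.Generic;

class HashSetDemo
{
    static void Main()
    {
        var folders = new HashSet<string>();
        Console.WriteLine(folders.Add("folder1/")); // True: newly added
        Console.WriteLine(folders.Add("folder2/")); // True: newly added
        Console.WriteLine(folders.Add("folder1/")); // False: already present
        Console.WriteLine(folders.Count);           // 2
    }
}
```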

If you want, you can use LINQ as well:

List<T> myList = ......;
List<T> removeDuplicates = myList.Distinct().ToList();
Rahul Tripathi