1

Below code is used to find all indices of a string that might occur only once in an array but the code isn't very fast. Does somebody know a faster and more efficient way to find unique strings in an array?

using System;
using System.Collections.Generic;
using System.Linq;

public static class EM
{
    // Extension method, using Linq to find indices.
    public static int[] FindAllIndicesOf<T>(this IEnumerable<T> values, T val)
    {
        return values.Select((b,i) => Equals(b, val) ? i : -1).Where(i => i != -1).ToArray();
    }
}


public class Program
{
    public static string FindFirstUniqueName(string[] names)
    {
        var results = new List<string>();
        for (var i = 0; i < names.Length; i++)
        {
            var matchedIndices = names.FindAllIndicesOf(names[i]);
            if (matchedIndices.Length == 1)
            {
                results.Add(names[matchedIndices[0]]);
                break;
            }
        }
        return results.Count > 0 ? results[0] : null;
    }


    public static void Main(string[] args)
    {
        Console.WriteLine("Found: " + FindFirstUniqueName(new[]
            {
                "James",
                "Bill",
                "Helen",
                "Bill",
                "Helen",
                "Giles",
                "James",
            }
        ));
    }
}
BadmintonCat
  • 9,416
  • 14
  • 78
  • 129
  • 1
    This O(n^2). Why don't you insert each name to Hashmap (name->numOfRecurrence) and then return all names that map to 1 - this will be O(n) – dWinder Jul 15 '18 at 13:19
  • 1
    You could just try `names.GroupBy(x => x).Where(grp => grp.Count() == 1).Select(grp => grp.First()).ToList()` – juharr Jul 15 '18 at 13:19
  • @DavidWinder that sounds like a good way to go but how do you create a HashMap (Dictionary) with names as keys and recurrence as value, optimally without loop? E.g. similar to `var dictionary = sequence.ToDictionary(item => item.Key, item => item.Value)` – BadmintonCat Jul 15 '18 at 13:34
  • @BadmintonCat No matter what you do the best you can get is O(n) for this so even if you use Linq, under the covers it's going to use a loop. – juharr Jul 15 '18 at 13:46

1 Answers1

0

Your solution has O(n^2) complexity. You can improve it to O(n) by using Hash-Map.

Consider a Hash-Map with which in each name has it number of recurrences in your original list. Now all you have to do is check all key in the dictionary (aka hash-map) and return all that equal to 1. Notice that check all key in this dictionary is less then o(n) because it can not hold more then n names.

To implement this dictionary in C# you do as follow:

List<string> stuff = new List<string>();
var groups = stuff.GroupBy(s => s).Select(
    s => new { Stuff = s.Key, Count = s.Count() });
var dictionary = groups.ToDictionary(g => g.Stuff, g => g.Count);

Taken from here or as suggested by juharr

O(n) is the minimum require as you will have to go over all names at least once.

dWinder
  • 11,597
  • 3
  • 24
  • 39