1

For a given a space separated list of numbers, what is the most effecient way of counting the total pairs of numbers which have a difference of N.

e.g. command line in put would be:

5 2

where 5 is the count of numbers to follow and 2 is the difference required

1 5 3 4 2

the 5 numbers to be considered

Output should be 3 because (5,3), (4,2) and (3,1) all have a diff of 2

I can get this algorithm to work, but is there a more efficient way of doing this if you have large sets of numbers to work with? I have incluced three comparison options and the second one should be better than the third but is there something I'm forgetting which could make it much quicker?

private static void Difference()
{
    string[] firstInput = SplitInput(Console.ReadLine());

    int numberOfNumbers = int.Parse(firstInput[0]);
    int diffOfNumbers = int.Parse(firstInput[1]);

    string[] secondInput = SplitInput(Console.ReadLine());

    List<int> numbers = secondInput.Select(x => Int32.Parse(x)).ToList();

    int possibleCombinations = 0;

    // Option 1
    foreach (int firstNumber in numbers)
    {
        List<int> compareTo = numbers.GetRange(numbers.IndexOf(firstNumber) + 1, numbers.Count - numbers.IndexOf(firstNumber) - 1);

        foreach (int secondNumber in compareTo)
        {
            int diff = firstNumber - secondNumber;

            if (Math.Abs(diff) == diffOfNumbers)
            {
                possibleCombinations++;
            }
        }
    }

    // Option 2
    foreach (int firstNumber in numbers)
    {
        if (numbers.Contains(firstNumber + diffOfNumbers))
        {
                possibleCombinations++;
        }
    }

    // Option 3
    foreach (int firstNumber in numbers)
    {
        foreach (int secondNumber in numbers)
        {
            int diff = firstNumber - secondNumber;

            if(Math.Abs(diff) == diffOfNumbers)
            {
                possibleOptions++;
            }
        }
    }

    Console.WriteLine(string.Format("Possible number of options are: {0}", possibleCombinations));
    Console.ReadLine();
}

private static string[] SplitInput(string input)
{
    return input.Split(new char[1] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
}
Ciaran Martin
  • 578
  • 4
  • 12

4 Answers4

4

If duplicate numbers are not allowed or to be ignored (only count unique pairs), you could use a HashSet<int>:

HashSet<int> myHashSet = ...
int difference = ...
int count;
foreach (int number in myHashSet)
{
    int counterpart = number - difference;
    if (myHashSet.Contains(counterpart))
    {
        count++;
    }
}
C.Evenhuis
  • 25,996
  • 2
  • 58
  • 72
  • So assuming there are no duplicates, will HashSet.Contains be faster and more efficient than List.Contains? If so, by much? – Ciaran Martin Oct 03 '14 at 15:26
  • Depends. See here for a very thorough performance analysis. http://stackoverflow.com/questions/150750/hashset-vs-list-performance – kha Oct 03 '14 at 15:35
  • Funny! I was just about to post that same link in answer to my own question! – Ciaran Martin Oct 03 '14 at 15:36
  • A former colleague of mine wrote an interesting article on the performance also: http://theburningmonk.com/2011/03/hashset-vs-list-vs-dictionary/ – Ciaran Martin Oct 03 '14 at 15:50
  • @CiaranMartin `HashSet` containing unique values and designed toward looking up values was my primary reason for using one. I bet there are ways to improve performance, if you sort them you could use one iteration and remember the last numbers until they fall "outside the range" of `current - difference`. However it would make code less readable, so should only be done if the performance increase really matters. – C.Evenhuis Oct 04 '14 at 17:23
2

Given the constraints of the problem, where N is the "count of numbers to follow" [1..N], and M is the difference (N=5 and M=2 in the example), why not just return N - M ?

Dave Mackersie
  • 1,033
  • 10
  • 14
  • 1
    I don't think your assumption of the numbers being 1..N is correct. My guess is that is just sample data that happens to follow that format. – John Koerner Oct 03 '14 at 15:30
  • I think you may be right. Rereading it now, it looks like he intends "1 5 3 4 2" to be a second line of input, rather than an inferred set of numbers. – Dave Mackersie Oct 03 '14 at 15:41
1

This is done easily with LINQ, allowing for duplicates:

var dict = numbers.GroupBy(n => n).ToDictionary(g => g.Key, g => g.Count());
return dict.Keys.Where(n => dict.ContainsKey(difference-n)).Select(n => dict[difference - n]).Sum();

In the first line we create a dictionary where the keys are the distinct numbers in the input list (numbers) and the values are how many times they appear.

In the second, for each distinct number in the list (equivalent to the keys of the dictioanry) we look to see if the dictionary contains a key for the target number. If so, we add the number of times that target number appeared, which we previously stored as the value for that key. If not we add 0. Finally we sum it all up.


Note in theory this could cause arithmetic overflows if there's no bound other than Int.MinValue and Int.MaxValue on the items in the list. To get around this we need to do a "safe" check, which first makes sure that the difference won't be out of bounds before we try to calculate it. That might look like:

int SafeGetCount(int difference, int number, Dictionary<int,int> dict)
{
    if(difference < 0 && number < 0 && int.MinValue - difference > number)
        return 0;
    if(difference > 0 && number > 0 && int.MaxValue - difference < number)
        return 0;
    return dict.ContainsKey(difference-number) ? dict[difference - number] : 0;
}

Update

There are a couple of things note entirely clear from your question, like whether you actually want to count duplicate pairs multiple times, and does swapping the numbers count as two different pairs. e.g. if (1,4) is a pair, is (4,1)? My answer above assumes that the answer to both of those questions is yes.

If you don't want to count duplicate pairs multiple times, then go with the HashSet solution from other answers. If you do want to count duplicate pairs but don't want to count twice by swapping the values in the pair, you have to get slightly more complex. E.g.:

var dict = numbers.GroupBy(n => n).ToDictionary(g => g.Key, g => g.Count());
var sum = dict.Keys.Where(n => n*2 != difference)
      .Where(n => dict.ContainsKey(difference-n))
      .Select(n => dict[difference - n]).Sum()/2;
if(n%2 == 0)
{
    sum += dict.ContainsKey(n/2) ? dict[n/2] : 0
}    
return sum;
Ben Aaronson
  • 6,955
  • 2
  • 23
  • 38
0

how about sorting the list then iterating over it.

int PairsWithMatchingDifferenceCount(
        IEnumerable<int> source,
        int difference)
{
    var ordered = source.OrderBy(i => i).ToList();
    var count = ordered.Count;
    var result = 0;
    for (var i = 0; i < count - 1; i++)
    {
        for (var j = i + 1; j < count; j++)
        {
            var d = Math.Abs(ordered[j] - ordered[i]);
            if (d == difference)
            {
                result++;
            }
            else if (d > difference)
            {
                break;
            }
        } 
    }

    return result;
}

so, as per the example you would call it like this,

PairsWithMatchingDifferenceCount(Enumerable.Range(1, 5), 2);

but, if the sequence generation is a simple as the question suggests why not just.

var m = 5;
var n = 2;

var result = Enumerable.Range(n + 1, m - n)
                 .Select(x => Tuple.Create(x, x - n)).Count();

or indeed,

var result = m - n;
Jodrell
  • 34,946
  • 5
  • 87
  • 124