0

I've come across multiple examples of finding duplicates and removing them from an array. How do I do the opposite: keep only the duplicates and remove the rest of the elements? From what I've learned from these examples (How do I remove duplicates from a C# array?, remove duplicates from two string arrays c#), I came up with code that does a "double" operation.

Code workflow: There is an array numbers of int[10], and another array duplicates of int[n], where n is not known in advance because it depends on numbers. From numbers, I first set duplicates to the de-duplicated version of numbers using .Distinct().ToArray(). Then I essentially subtract duplicates from numbers to get the values that were actually duplicated. But somehow, in that process, my duplicates array ends up empty.

Code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ArrayEx4
{
    internal class Program
    {
        static void Display(int[] x)
        {
            for (int i = 0; i < x.Length; i++)
            {
                Console.WriteLine(x[i]);
            }
        }
        static void Main(string[] args)
        {
            int[] numbers = new int[10];
            int[] duplicates = { };
            for (int i = 0; i < numbers.Length; i++)
            {
                Console.WriteLine("Enter number " + (i + 1) + ": ");
                numbers[i] = Convert.ToInt32(Console.ReadLine());
            }
            duplicates = numbers.Distinct().ToArray();
            duplicates = numbers.Except(duplicates).ToArray();
            Console.WriteLine("\n\nProvided data:\n");
            Display(numbers);
            Console.WriteLine("\n\nDuplicates:");
            Display(duplicates);
            Console.WriteLine(Console.ReadLine());
        }
    }
}

[Image: Duplicates containing no elements]

What am I doing wrong? Any explanation would be awesome!

UnfreeHeX
  • Do you want to get all the duplicates? E.g., if the number 3 occurs 10 times, do you want the output to contain ten 3s, or will just one 3 do (signifying "3 was present at least 2 times"), or do you want a pair of values like (3,10) signifying "3 was present 10 times"? – Caius Jard Oct 15 '21 at 05:56
  • @CaiusJard Just once, to prove that the number occurred more than once. That should suffice for my requirements. – UnfreeHeX Oct 16 '21 at 09:05
  • In that case I'd modify Blindy's advice somewhat; I posted it as an answer – Caius Jard Oct 16 '21 at 09:44

3 Answers

1

numbers.Distinct() returns every single number in your array, just once. If you exclude these numbers from the array, you exclude everything. Instead you want to get the numbers that are present at least two times.
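
To see why the result is empty, here is a minimal sketch that reproduces the two steps from the question on a small sample. The class name ExceptDemo, the method name DistinctThenExcept, and the sample values are only illustrative assumptions; the point is that Except is a set difference, so subtracting the distinct values leaves nothing.

```csharp
using System;
using System.Linq;

public static class ExceptDemo
{
    // Hypothetical helper that mirrors the two lines from the question.
    public static int[] DistinctThenExcept(int[] numbers)
    {
        // Every value of numbers, each listed once.
        int[] distinct = numbers.Distinct().ToArray();

        // Except is a set difference: it keeps only values of numbers that are
        // NOT in distinct. Every value of numbers is in distinct, so nothing is kept.
        return numbers.Except(distinct).ToArray();
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 2, 3, 3, 3 }; // sample data, not from the question

        Console.WriteLine(numbers.Distinct().Count());         // 3
        Console.WriteLine(DistinctThenExcept(numbers).Length); // 0
    }
}
```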

You can go the lazy (insane memory allocations) route (numbers.GroupBy(w => w).Where(w => w.Count() >= 2).Select(w => w.Key).ToArray()), or the performant way:

List<int> numbers = ...; // input numbers
var seen = new HashSet<int>();
var numbersWithDuplicates = new List<int>();

foreach(var number in numbers)
    if(!seen.Add(number)) // only care about numbers that appear multiple times
        numbersWithDuplicates.Add(number);

var uniques = numbers.Except(numbersWithDuplicates).ToList();
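
For comparison, the GroupBy one-liner mentioned above can be wrapped in a small runnable program. This is only a sketch: the class name GroupByDemo, the method name DuplicatedValues, and the sample input are assumptions made for illustration.

```csharp
using System;
using System.Linq;

public static class GroupByDemo
{
    // Hypothetical helper: returns each value that occurs at least twice, once.
    public static int[] DuplicatedValues(int[] numbers)
    {
        return numbers
            .GroupBy(n => n)            // one group per distinct value
            .Where(g => g.Count() >= 2) // keep groups with two or more members
            .Select(g => g.Key)         // the value the group stands for
            .ToArray();
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 2, 3, 3, 3 }; // sample data
        Console.WriteLine(string.Join(", ", DuplicatedValues(numbers))); // 2, 3
    }
}
```
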
Blindy
1

Another approach uses a Dictionary<int, int> that tracks how many times each value appears:

Dictionary<int, int> counts = new Dictionary<int, int>();
foreach(int num in numbers)
{
    if (!counts.ContainsKey(num))
    {
        counts.Add(num, 0);
    }
    counts[num]++;
}
duplicates = counts.Where(c => c.Value >= 2).Select(c => c.Key).ToArray();

This gives you, in the array "duplicates", the distinct values that appeared more than once.

You could also just use the Dictionary "counts" directly afterwards if you needed to know how many times each value appeared.
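
As a rough sketch of using the counts directly, the program below builds the same kind of dictionary and prints how often each duplicated value appeared. The names CountDemo and CountOccurrences and the sample input are assumptions for illustration. It uses TryGetValue instead of the ContainsKey/Add pair, which is a small variation rather than a change in technique.

```csharp
using System;
using System.Collections.Generic;

public static class CountDemo
{
    // Hypothetical helper: maps each value to the number of times it occurs.
    public static Dictionary<int, int> CountOccurrences(IEnumerable<int> numbers)
    {
        var counts = new Dictionary<int, int>();
        foreach (int num in numbers)
        {
            // current is left at 0 when num has not been seen yet.
            counts.TryGetValue(num, out int current);
            counts[num] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 2, 3, 3, 3 }; // sample data

        foreach (KeyValuePair<int, int> pair in CountOccurrences(numbers))
        {
            // Only report values that occurred two or more times.
            if (pair.Value >= 2)
            {
                Console.WriteLine(pair.Key + " appeared " + pair.Value + " times");
            }
        }
    }
}
```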

Idle_Mind
1

You mentioned that it will suffice to list each duplicated number just once:

var seen = new HashSet<int>();
var dupe = new HashSet<int>();

foreach(var number in numbers)
    if(!seen.Add(number))
        dupe.Add(number);

Now your dupe hashset contains the numbers that were duplicated. It works because HashSet<int>.Add returns false if you try to add a number the set already contains. So from the second time we see a number onward, seen.Add returns false, and we say "if it didn't add, add it to this other hashset". We don't care what Add returns on the second hashset; it simply ends up as a deduplicated collection of all numbers that appear twice or more.

If your numbers list was 1,2,2,3,3,3,4,5,6,6,6,6 then dupe contains 2,3,6: these are all the numbers that appeared more than once.
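
Put together as a runnable sketch, the example list can be checked like this. The class name DupeDemo and the method name FindDuplicates are assumptions for illustration; the loop itself is the one shown above.

```csharp
using System;
using System.Collections.Generic;

public static class DupeDemo
{
    // Hypothetical helper wrapping the two-hashset loop.
    public static HashSet<int> FindDuplicates(IEnumerable<int> numbers)
    {
        var seen = new HashSet<int>();
        var dupe = new HashSet<int>();

        foreach (int number in numbers)
        {
            // Add returns false when the value is already in the set,
            // i.e. from the second occurrence onward.
            if (!seen.Add(number))
            {
                dupe.Add(number); // repeated adds of the same value are ignored
            }
        }
        return dupe;
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 2, 3, 3, 3, 4, 5, 6, 6, 6, 6 };
        HashSet<int> dupe = FindDuplicates(numbers);

        // SetEquals ignores ordering, which a HashSet does not promise.
        Console.WriteLine(dupe.SetEquals(new[] { 2, 3, 6 })); // True
    }
}
```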

If you want it as an array you can call dupe.ToArray(), but it's probably fine to leave it as a hashset; you can enumerate it, ask it whether it contains some number X, etc. For example, you might call Display(dupe.ToArray()), or you might modify Display like:

    static void Display(IEnumerable<int> x)
    {
        foreach(var y in x)
        {
            Console.WriteLine(y);
        }
    }

This can display either an int[] or a HashSet<int>, because both implement IEnumerable<int>:

Display(numbers);
Display(dupe);
Caius Jard