0

I need to remove all duplicate values from an array of integer, yet maintain the order of the elements:

Example:

10,20,20(duplicate),10(duplicate),50

Becomes:

10,20,50
Jan Doggen
  • 8,799
  • 13
  • 70
  • 144
Alca
  • 81
  • 2
  • 13
  • [How do I delete an element from an array?](http://www.cs.wisc.edu/~rkennedy/array-delete) – Rob Kennedy Dec 25 '15 at 14:19
  • If you need to remove duplicates from an array, you probably need another data structure (except for training tasks). – MBo Dec 26 '15 at 07:13

2 Answers2

3
  1. Create a dictionary with Integer as the key. The value type is immaterial.
  2. Iterate through the input array. For each value in the input array, check whether or not that value is in the dictionary.
  3. If yes, this is a duplicate, discard.
  4. If no, this is the first time the value has been encountered. Retain the value, and add it to the dictionary.

The point of the dictionary is that it can perform O(1) lookup.

In pseudocode:

var
  arr: TArray<Integer>; // input and output
  Dict: TDictionary<Integer, Integer>;
  SrcIndex, DestIndex: Integer;
....
DestIndex := 0;
for SrcIndex := 0 to high(arr) do begin
  Value := arr[SrcIndex];
  if not Dict.ContainsKey(Value) then begin
    arr[DestIndex] := arr[SrcIndex];
    Dict.Add(Value, 0);
    inc(DestIndex);
  end;
end;
SetLength(arr, DestIndex);

Obviously you need to create, and destroy, the dictionary. I'm assuming you know how to do that. And I've opted to modify the array in place but you could equally create a new array if you prefer.

David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490
  • Thanks, It really help. – Alca Dec 28 '15 at 07:48
  • @Andreas These days I'd use the ordered hash set from the spring4d project. You should check it out! https://bitbucket.org/sglienke/spring4d/src/develop/Source/Base/Collections/Spring.Collections.Sets.pas – David Heffernan Jun 26 '19 at 20:37
0

heres a version without dictionary.

procedure TForm1.RemoveDuplicates;
var
  i,j,k,tot,mov:integer;
  arr:array of integer;
begin
  arr := [10,20,30,40,30,20,10,10,50,10,20,40];
  tot := 0;
  for i := 0 to length(arr)-1 do
  begin
    if i >= length(arr)-tot-1 then
      break;
    for j := i + 1 to length(arr)-1-tot do
    begin
      if j >= length(arr)-tot-1 then
        break;
      mov := 0;
      while arr[i] = arr[j] do
      begin
        inc(mov);
        arr[j] := arr[j+mov];
      end;
      tot := tot + mov;
      if mov>0 then
        for k := j+1 to length(arr)-1-tot do
          arr[k] := arr[k+mov];
    end;
  end;
  SetLength(arr,length(arr)-tot-1);
end;
  • This is a very slow algorithm, especially if there are many duplicate elements. (Of course, that might not matter for small arrays.) – Andreas Rejbrand Jun 26 '19 at 13:38
  • Actually it may be faster for small arrays. instead of loading dictionaries it works with just itself. – Serkan Ekşioğlu Jun 26 '19 at 14:08
  • 1
    That is very true. For small arrays with no or very few duplicates, your code is likely faster than David's. But I think your algorithm has a much worse [time complexity](https://en.wikipedia.org/wiki/Time_complexity) than David's so as the input size grows, your will be very much slower. For huge arrays, it might be the case that only David's is feasible. In addition, removing an element at a time in the middle of an array like you do is very expensive, and completely unnecessary (compare David's approach). – Andreas Rejbrand Jun 26 '19 at 20:34
  • I'd also argue that David's example is way more readable and thus easier to maintain by future developers of the code. – Marcus Mangelsdorf Sep 06 '22 at 06:19